This is amazing. These numbers mean each table could have on the order of hundreds of partitions on average (1 million segments spread over thousands of tables). At this scale, I feel the Helix UI could become the bottleneck for monitoring them :)
An unrelated question that came to mind after seeing these numbers: in the Helix code, I did not see how it balances partitions based on how busy a resource is. As far as I can tell, Helix only ensures that the number of shards is evenly distributed. What happens if some resources are busy for some reason, and their placement makes other resources busy too because they share nodes? I am new to Helix, and its tutorial says we can use semi-auto or customized balancing strategies, but I am still curious how well the semi-auto approach scales in practice.

On Sun, Oct 20, 2019 at 9:31 PM kishore g <[email protected]> wrote:

> At LinkedIn, Helix manages thousands of nodes. 1 million segments
> (equivalent of partitions) across thousands of tables.
>
> On Sun, Oct 20, 2019 at 9:16 PM Jianzhou Zhao <[email protected]> wrote:
>
>> Cool.
>>
>> In the rocksplicator case, how many partitions, replicas, and nodes can
>> a Helix cluster manage?
>>
>> On Sun, Oct 20, 2019 at 9:05 PM Bo Liu <[email protected]> wrote:
>>
>>> We extensively use Helix at Pinterest. This blog post has more details
>>> and some tips.
>>>
>>> https://medium.com/pinterest-engineering/automated-cluster-management-and-recovery-for-rocksplicator-f1f8fd35c833
>>>
>>> On Sun, Oct 20, 2019 at 8:30 PM Jianzhou Zhao <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I was looking for who uses Helix, and got
>>>> https://cwiki.apache.org/confluence/display/HELIX/Powered+By+Helix
>>>>
>>>> The page has not been updated since 2017. Is the list still up to
>>>> date?
>>>>
>>>> Thank you, j
>>>>
>>>
>>> --
>>> Best regards,
>>> Bo
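
To make the balancing question at the top of this message concrete, here is a toy sketch of the concern. It is not Helix code and does not use any Helix API. The partition names, load values, node names, and both placement functions are hypothetical. It only shows that an even partition count per node does not imply an even load per node, and what a simple load-aware greedy placement would do instead.

```python
import heapq

def place_by_count(partitions, nodes):
    """Round-robin placement: each node gets the same number of partitions (within 1)."""
    assignment = {n: [] for n in nodes}
    for i, (name, _load) in enumerate(partitions):
        assignment[nodes[i % len(nodes)]].append(name)
    return assignment

def place_by_load(partitions, nodes):
    """Greedy placement: heaviest partition first, always onto the least-loaded node."""
    heap = [(0.0, n) for n in nodes]  # (current total load, node name)
    heapq.heapify(heap)
    assignment = {n: [] for n in nodes}
    for name, load in sorted(partitions, key=lambda p: -p[1]):
        total, node = heapq.heappop(heap)
        assignment[node].append(name)
        heapq.heappush(heap, (total + load, node))
    return assignment

def node_loads(assignment, partitions):
    """Sum the (hypothetical) load of the partitions placed on each node."""
    load_of = dict(partitions)
    return {n: sum(load_of[p] for p in ps) for n, ps in assignment.items()}

# Hypothetical workload: a busy resource (load 10 per partition) and a quiet
# one (load 1 per partition), with their partitions interleaved in input order.
partitions = []
for i in range(4):
    partitions.append((f"hot_{i}", 10.0))
    partitions.append((f"cold_{i}", 1.0))
nodes = ["node_a", "node_b"]

by_count = place_by_count(partitions, nodes)
by_load = place_by_load(partitions, nodes)

print(node_loads(by_count, partitions))  # {'node_a': 40.0, 'node_b': 4.0}
print(node_loads(by_load, partitions))   # {'node_a': 22.0, 'node_b': 22.0}
```

Both placements put four partitions on each node, but only the second one evens out the load. Whether something like this can be expressed through the semi-auto or customized modes at the scale mentioned in this thread is exactly what I am asking about.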
