Hi,

We are serving a few different resources whose total number of partitions is ~30K. We just did a rolling restart of the cluster, and the clients that use the RoutingTableProvider are stuck in a bad state where they constantly re-subscribe to changes in the cluster's external view. Below is the Helix log on one client after our rolling restart finished; the client is polling ZK continuously. The ZooKeeper node is pushing ~300 Mbps right now, and most of that traffic is being pulled by the clients. Is this a race condition? And is there an easy way to make the clients poll less aggressively? We restarted one of the clients and no longer see these messages on it. Finally, is it possible to propagate external view diffs instead of the whole large znode?
15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
