One more question for the routing table provider - is it possible to distinguish between add/modify and delete? I essentially want to ignore the delete events - can that be found by looking at the list of ExternalView(s) being passed?
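Zhen's suggestion below (keep a local cache and compare to get the delta, since the API doesn't expose the event type) can be sketched as follows. This is a hypothetical simplification, not Helix code: the real callback receives `List<ExternalView>` objects, modeled here as plain maps from resource name to a partition-to-state map, and `ExternalViewDiff`/`Delta` are illustrative names. A resource present in the cache but missing from the new list is a delete, which can then be ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: classify external-view callbacks as add/modify/delete
// by diffing against a locally cached snapshot. The real Helix callback
// passes List<ExternalView>; here a view is resource -> (partition -> state).
public class ExternalViewDiff {

    public static class Delta {
        public final Set<String> added = new TreeSet<>();
        public final Set<String> modified = new TreeSet<>();
        public final Set<String> deleted = new TreeSet<>();
    }

    private Map<String, Map<String, String>> cache = new HashMap<>();

    public Delta onExternalViewChange(Map<String, Map<String, String>> current) {
        Delta d = new Delta();
        for (Map.Entry<String, Map<String, String>> e : current.entrySet()) {
            Map<String, String> old = cache.get(e.getKey());
            if (old == null) {
                d.added.add(e.getKey());          // new resource
            } else if (!old.equals(e.getValue())) {
                d.modified.add(e.getKey());       // same resource, changed mapping
            }
        }
        for (String resource : cache.keySet()) {
            if (!current.containsKey(resource)) {
                d.deleted.add(resource);          // gone from the new view list
            }
        }
        cache = new HashMap<>(current);           // snapshot for the next callback
        return d;
    }
}
```

To ignore deletes, you would simply skip `deleted` when rebuilding routing state from the callback.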
Thanks
Varun

On Thu, Feb 5, 2015 at 8:48 PM, Varun Sharma <[email protected]> wrote:

> I see - one more thing - there was talk of a batching mode where Helix can
> batch updates - can it batch multiple updates to the external view and
> write once into zookeeper instead of writing for every update? For example,
> consider the case when lots of partitions are being onlined - could we
> batch updates to the external view into batches of 100? Is that supported
> in Helix 0.6.4?
>
> Thanks!
> Varun
>
> On Thu, Feb 5, 2015 at 5:23 PM, Zhen Zhang <[email protected]> wrote:
>
>> Yes, the listener will be notified on add/delete/modify. You can
>> distinguish them if you have a local cache and compare against it to get
>> the delta. Currently the API doesn't expose this.
>>
>> ------------------------------
>> *From:* Varun Sharma [[email protected]]
>> *Sent:* Thursday, February 05, 2015 1:53 PM
>> *To:* [email protected]
>> *Subject:* Re: Excessive ZooKeeper load
>>
>> I assume that it also gets called when external views get modified?
>> How can I distinguish if there was an add, a modify or a delete?
>>
>> Thanks
>> Varun
>>
>> On Thu, Feb 5, 2015 at 9:27 AM, Zhen Zhang <[email protected]> wrote:
>>
>>> Yes. It will get invoked when external views are added or deleted.
>>>
>>> ------------------------------
>>> *From:* Varun Sharma [[email protected]]
>>> *Sent:* Thursday, February 05, 2015 1:27 AM
>>> *To:* [email protected]
>>> *Subject:* Re: Excessive ZooKeeper load
>>>
>>> I had another question - does the RoutingTableProvider
>>> onExternalViewChange call get invoked when a resource gets deleted (and
>>> hence its external view znode)?
>>>
>>> On Wed, Feb 4, 2015 at 10:54 PM, Zhen Zhang <[email protected]> wrote:
>>>
>>>> Yes. I think we did this in the incubating stage or even before. It's
>>>> probably in a separate branch for some performance evaluation.
>>>>
>>>> ------------------------------
>>>> *From:* kishore g [[email protected]]
>>>> *Sent:* Wednesday, February 04, 2015 9:54 PM
>>>> *To:* [email protected]
>>>> *Subject:* Re: Excessive ZooKeeper load
>>>>
>>>> Jason, I remember having the ability to compress/decompress, and
>>>> before we added the support to bucketize, compression was used to
>>>> support a large number of partitions. However, I don't see the code
>>>> anywhere. Did we do this on a separate branch?
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>> On Wed, Feb 4, 2015 at 3:30 PM, Zhen Zhang <[email protected]> wrote:
>>>>
>>>>> Hi Varun, we can certainly add compression and have a config for
>>>>> turning it on/off. We have implemented compression in our own zkclient
>>>>> before. The issues with compression might be:
>>>>> 1) CPU consumption on the controller will increase.
>>>>> 2) It is harder to debug.
>>>>>
>>>>> Thanks,
>>>>> Jason
>>>>>
>>>>> ------------------------------
>>>>> *From:* kishore g [[email protected]]
>>>>> *Sent:* Wednesday, February 04, 2015 3:08 PM
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Excessive ZooKeeper load
>>>>>
>>>>> We do have the ability to compress the data. I am not sure if
>>>>> there is an easy way to turn the compression on/off.
>>>>>
>>>>> On Wed, Feb 4, 2015 at 2:49 PM, Varun Sharma <[email protected]> wrote:
>>>>>
>>>>>> I am wondering if it's possible to gzip the external view znode - a
>>>>>> simple gzip cut down the data size by 25X. Is it possible to plug in
>>>>>> compression/decompression as zookeeper nodes are read?
>>>>>>
>>>>>> Varun
>>>>>>
>>>>>> On Mon, Feb 2, 2015 at 8:53 PM, kishore g <[email protected]> wrote:
>>>>>>
>>>>>>> There are multiple options we can try here.
>>>>>>> What if we used the cacheddataaccessor for this use case? Clients
>>>>>>> will only read if the node has changed. This optimization can benefit
>>>>>>> all use cases.
>>>>>>>
>>>>>>> What about batching the watch triggers?
>>>>>>> Not sure which version of helix has this option.
>>>>>>>
>>>>>>> Another option is to use a poll-based routing table instead of a
>>>>>>> watch-based one. This, coupled with the cacheddataaccessor, can be
>>>>>>> very efficient.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kishore G
>>>>>>>
>>>>>>> On Feb 2, 2015 8:17 PM, "Varun Sharma" <[email protected]> wrote:
>>>>>>>
>>>>>>>> My total external view across all resources is roughly 3M in size
>>>>>>>> and there are 100 clients downloading it twice for every node
>>>>>>>> restart - that's 600M of data for every restart. So I guess that is
>>>>>>>> causing this issue. We are thinking of doing some tricks to limit
>>>>>>>> the # of clients to 1 from 100. I guess that should help
>>>>>>>> significantly.
>>>>>>>>
>>>>>>>> Varun
>>>>>>>>
>>>>>>>> On Mon, Feb 2, 2015 at 7:37 PM, Zhen Zhang <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hey Varun,
>>>>>>>>>
>>>>>>>>> I guess your external view is pretty large, since each external
>>>>>>>>> view callback takes ~3s. The RoutingTableProvider is callback
>>>>>>>>> based, so only when there is a change in the external view will the
>>>>>>>>> RoutingTableProvider read the entire external view from ZK. During
>>>>>>>>> the rolling upgrade, there are lots of live instance changes, which
>>>>>>>>> may lead to a lot of changes in the external view. One possible way
>>>>>>>>> to mitigate the issue is to smooth the traffic by having some
>>>>>>>>> delays in between bouncing nodes. We can do a rough estimation of
>>>>>>>>> how many external view changes you might have during the upgrade,
>>>>>>>>> how many listeners you have, and how large the external views are.
>>>>>>>>> Once we have these numbers, we might know the ZK bandwidth
>>>>>>>>> requirement. ZK read bandwidth can be scaled by adding ZK
>>>>>>>>> observers.
>>>>>>>>>
>>>>>>>>> ZK watchers are one time only, so every time a listener receives a
>>>>>>>>> callback, it will re-register its watcher with ZK.
>>>>>>>>>
>>>>>>>>> It's normally unreliable to depend on delta changes instead of
>>>>>>>>> reading the entire znode. There might be some corner cases where
>>>>>>>>> you would lose delta changes if you depend on that.
>>>>>>>>>
>>>>>>>>> For the ZK connection issue, do you have any log on the ZK
>>>>>>>>> server side regarding this connection?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jason
>>>>>>>>>
>>>>>>>>> ------------------------------
>>>>>>>>> *From:* Varun Sharma [[email protected]]
>>>>>>>>> *Sent:* Monday, February 02, 2015 4:41 PM
>>>>>>>>> *To:* [email protected]
>>>>>>>>> *Subject:* Re: Excessive ZooKeeper load
>>>>>>>>>
>>>>>>>>> I believe there is a misbehaving client. Here is a stack trace
>>>>>>>>> - it probably lost its connection and is now stampeding it:
>>>>>>>>>
>>>>>>>>> "ZkClient-EventThread-104-terrapinzk001a:2181,terrapinzk002b:2181,terrapinzk003e:2181"
>>>>>>>>> daemon prio=10 tid=0x00007f534144b800 nid=0x7db5 in Object.wait()
>>>>>>>>> [0x00007f52ca9c3000]
>>>>>>>>>
>>>>>>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>>
>>>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>>>> at java.lang.Object.wait(Object.java:503)
>>>>>>>>> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>>>>>>>>> - locked <0x00000004fb0d8c38> (a org.apache.zookeeper.ClientCnxn$Packet)
>>>>>>>>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
>>>>>>>>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
>>>>>>>>> at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95)
>>>>>>>>> at org.I0Itec.zkclient.ZkClient$11.call(ZkClient.java:823)
>>>>>>>>> *at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)*
>>>>>>>>> *at org.I0Itec.zkclient.ZkClient.watchForData(ZkClient.java:820)*
>>>>>>>>> *at org.I0Itec.zkclient.ZkClient.subscribeDataChanges(ZkClient.java:136)*
>>>>>>>>> at org.apache.helix.manager.zk.CallbackHandler.subscribeDataChange(CallbackHandler.java:241)
>>>>>>>>> at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:287)
>>>>>>>>> at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
>>>>>>>>> - locked <0x000000056b75a948> (a org.apache.helix.manager.zk.ZKHelixManager)
>>>>>>>>> at org.apache.helix.manager.zk.CallbackHandler.handleDataChange(CallbackHandler.java:338)
>>>>>>>>> at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:547)
>>>>>>>>> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>>>>>>>>>
>>>>>>>>> On Mon, Feb 2, 2015 at 4:28 PM, Varun Sharma <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I am wondering what is causing the zk subscription to happen
>>>>>>>>>> every 2-3 seconds - is this a new watch being established every 3
>>>>>>>>>> seconds?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Varun
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 2, 2015 at 4:23 PM, Varun Sharma <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> We are serving a few different resources whose total # of
>>>>>>>>>>> partitions is ~30K. We just did a rolling restart of the cluster
>>>>>>>>>>> and the clients which use the RoutingTableProvider are stuck in a
>>>>>>>>>>> bad state where they are constantly subscribing to changes in the
>>>>>>>>>>> external view of the cluster.
>>>>>>>>>>> Here is the helix log on the client after our rolling restart
>>>>>>>>>>> was finished - the client is constantly polling ZK. The zookeeper
>>>>>>>>>>> node is pushing 300mbps right now and most of the traffic is
>>>>>>>>>>> being pulled by clients. Is this a race condition - and also, is
>>>>>>>>>>> there an easy way to make the clients not poll so aggressively?
>>>>>>>>>>> We restarted one of the clients and we don't see these same
>>>>>>>>>>> messages anymore. Also, is it possible to just propagate external
>>>>>>>>>>> view diffs instead of the whole big znode?
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084
>>>>>>>>>>> subscribes child-change. path: /main_a/EXTERNALVIEW, listener:
>>>>>>>>>>> org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084
>>>>>>>>>>> subscribes child-change. path: /main_a/EXTERNALVIEW, listener:
>>>>>>>>>>> org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
>>>>>>>>>>>
>>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE
>>>>>>>>>>> /main_a/EXTERNALVIEW
>>>>>>>>>>> listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
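On the gzip question in the thread above: even without built-in Helix compression, the serialized external view can be gzipped with just the JDK before it is written to the znode and gunzipped after it is read. A minimal sketch, with the caveat that `ZnodeGzip` is an illustrative name and the wiring into a custom zkclient serializer is not shown:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch: gzip a znode payload before writing, gunzip after reading.
// Highly repetitive JSON (like a large external view) compresses very well.
public class ZnodeGzip {

    public static byte[] compress(byte[] raw) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(raw);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // in-memory streams shouldn't fail
        }
    }

    public static byte[] decompress(byte[] packed) {
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);            // drain the stream fully
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-offs Jason mentions still apply: extra CPU on the controller and znodes that are no longer human-readable in the ZK CLI.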
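Kishore's cached-data-accessor idea above ("clients will only read if node has changed") can be approximated even without Helix support: cache each payload keyed by the znode's data version (ZooKeeper exposes this as `Stat.getVersion()`), and skip the fetch when the version hasn't moved. In this sketch `CachedReader` and `Fetcher` are illustrative names, not Helix or ZooKeeper APIs; the fetcher stands in for an actual ZK read.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a version-gated read cache: keep (version, payload) per znode path
// and only call the fetcher when the observed znode version has changed.
public class CachedReader {

    public interface Fetcher {
        byte[] fetch(String path);   // stand-in for an actual ZooKeeper getData
    }

    private static class Entry {
        int version;
        byte[] data;
    }

    private final Map<String, Entry> cache = new HashMap<>();
    public int fetches = 0;          // instrumentation for this sketch only

    public byte[] read(String path, int currentVersion, Fetcher fetcher) {
        Entry e = cache.get(path);
        if (e != null && e.version == currentVersion) {
            return e.data;           // cache hit: znode unchanged, no ZK read
        }
        Entry fresh = new Entry();
        fresh.version = currentVersion;
        fresh.data = fetcher.fetch(path);
        fetches++;
        cache.put(path, fresh);
        return fresh.data;
    }
}
```

With 100 clients each pulling a ~3M external view per callback, skipping reads for unchanged znodes directly cuts the ZK read bandwidth described in the thread.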
