Yes, that is correct. If you want to know the changes, simply maintain a local map of <resourceId, version>, where the version is the ZooKeeper znode version; every time the external view for a resource changes, its znode version is incremented.
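A minimal sketch of that bookkeeping (the class and method names here are invented for illustration, not Helix API; in a real spectator the version numbers would come from `Stat.getVersion()` on each external-view znode):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: cache the last-seen znode version per resource and
// classify each external-view callback into adds, modifies, and deletes.
public class ExternalViewDelta {
    private final Map<String, Integer> lastSeen = new HashMap<>();

    public static class Delta {
        public final Set<String> added = new HashSet<>();
        public final Set<String> modified = new HashSet<>();
        public final Set<String> deleted = new HashSet<>();
    }

    // 'current' maps resourceId -> znode version from the latest callback.
    public Delta onExternalViewChange(Map<String, Integer> current) {
        Delta delta = new Delta();
        for (Map.Entry<String, Integer> e : current.entrySet()) {
            Integer prev = lastSeen.get(e.getKey());
            if (prev == null) {
                delta.added.add(e.getKey());          // new resource
            } else if (!prev.equals(e.getValue())) {
                delta.modified.add(e.getKey());       // version bumped
            }
        }
        for (String resource : lastSeen.keySet()) {
            if (!current.containsKey(resource)) {
                delta.deleted.add(resource);          // znode gone
            }
        }
        // Replace the cache with the latest snapshot.
        lastSeen.clear();
        lastSeen.putAll(current);
        return delta;
    }
}
```

This is the local-cache-and-compare approach discussed below in the thread: it distinguishes adds/modifies/deletes without any API change, at the cost of one small version map per client.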
On Fri, Feb 6, 2015 at 10:40 AM, Varun Sharma <[email protected]> wrote:

> So does the routing table provider receive updates for all the external
> views - does the "list<ExternalView>" always contain all the external
> views?
>
> Varun
>
> On Fri, Feb 6, 2015 at 10:37 AM, Zhen Zhang <[email protected]> wrote:
>
>> It doesn't distinguish. RoutingTableProvider is always trying to keep
>> its content the same as those on ZK.
>>
>> ------------------------------
>> From: Varun Sharma [[email protected]]
>> Sent: Friday, February 06, 2015 10:27 AM
>> To: [email protected]
>> Subject: Re: Excessive ZooKeeper load
>>
>> How does the original RoutingTableProvider distinguish deletions from
>> adds/updates?
>>
>> On Fri, Feb 6, 2015 at 10:23 AM, Zhen Zhang <[email protected]> wrote:
>>
>>> Hi Varun, for the batching update: the Helix controller is not
>>> updating the external view on every update. Normally the Helix
>>> controller will aggregate updates over a period of time. Say for 100
>>> partitions, if they are updated at roughly the same time, then the
>>> Helix controller will update the external view only once. For the
>>> routing table, what do you mean by ignoring delete events? The
>>> RoutingTable will always be updated by ZK callbacks and synced up with
>>> the corresponding external views on ZK.
>>>
>>> Thanks,
>>> Jason
>>>
>>> ------------------------------
>>> From: Varun Sharma [[email protected]]
>>> Sent: Thursday, February 05, 2015 9:17 PM
>>> To: [email protected]
>>> Subject: Re: Excessive ZooKeeper load
>>>
>>> One more question for the routing table provider - is it possible to
>>> distinguish b/w add/modify and delete? I essentially want to ignore
>>> the delete events - can that be found by looking at the list of
>>> ExternalView(s) being passed?
>>>
>>> Thanks
>>> Varun
>>>
>>> On Thu, Feb 5, 2015 at 8:48 PM, Varun Sharma <[email protected]> wrote:
>>>
>>>> I see - one more thing - there was talk of a batching mode where
>>>> Helix can batch updates. Can it batch multiple updates to the
>>>> external view and write once into ZooKeeper instead of writing for
>>>> every update? For example, consider the case when lots of partitions
>>>> are being onlined - could we batch updates to the external view into
>>>> batches of 100? Is that supported in Helix 0.6.4?
>>>>
>>>> Thanks!
>>>> Varun
>>>>
>>>> On Thu, Feb 5, 2015 at 5:23 PM, Zhen Zhang <[email protected]> wrote:
>>>>
>>>>> Yes, the listener will be notified on add/delete/modify. You can
>>>>> distinguish them if you have a local cache and compare against it
>>>>> to get the delta. Currently the API doesn't expose this.
>>>>>
>>>>> ------------------------------
>>>>> From: Varun Sharma [[email protected]]
>>>>> Sent: Thursday, February 05, 2015 1:53 PM
>>>>> To: [email protected]
>>>>> Subject: Re: Excessive ZooKeeper load
>>>>>
>>>>> I assume that it also gets called when external views get modified?
>>>>> How can I distinguish if there was an add, a modify or a delete?
>>>>>
>>>>> Thanks
>>>>> Varun
>>>>>
>>>>> On Thu, Feb 5, 2015 at 9:27 AM, Zhen Zhang <[email protected]> wrote:
>>>>>
>>>>>> Yes. It will get invoked when external views are added or deleted.
>>>>>>
>>>>>> ------------------------------
>>>>>> From: Varun Sharma [[email protected]]
>>>>>> Sent: Thursday, February 05, 2015 1:27 AM
>>>>>> To: [email protected]
>>>>>> Subject: Re: Excessive ZooKeeper load
>>>>>>
>>>>>> I had another question - does the RoutingTableProvider
>>>>>> onExternalViewChange call get invoked when a resource gets deleted
>>>>>> (and hence its external view znode)?
>>>>>>
>>>>>> On Wed, Feb 4, 2015 at 10:54 PM, Zhen Zhang <[email protected]> wrote:
>>>>>>
>>>>>>> Yes. I think we did this in the incubating stage or even before.
>>>>>>> It's probably in a separate branch for some performance
>>>>>>> evaluation.
>>>>>>>
>>>>>>> ------------------------------
>>>>>>> From: kishore g [[email protected]]
>>>>>>> Sent: Wednesday, February 04, 2015 9:54 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: Excessive ZooKeeper load
>>>>>>>
>>>>>>> Jason, I remember having the ability to compress/decompress, and
>>>>>>> before we added the support to bucketize, compression was used to
>>>>>>> support a large number of partitions. However, I don't see the
>>>>>>> code anywhere. Did we do this on a separate branch?
>>>>>>>
>>>>>>> thanks,
>>>>>>> Kishore G
>>>>>>>
>>>>>>> On Wed, Feb 4, 2015 at 3:30 PM, Zhen Zhang <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Varun, we can certainly add compression and have a config for
>>>>>>>> turning it on/off. We have implemented compression in our own
>>>>>>>> zkclient before. The issues with compression might be:
>>>>>>>> 1) CPU consumption on the controller will increase.
>>>>>>>> 2) It is harder to debug.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jason
>>>>>>>>
>>>>>>>> ------------------------------
>>>>>>>> From: kishore g [[email protected]]
>>>>>>>> Sent: Wednesday, February 04, 2015 3:08 PM
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Re: Excessive ZooKeeper load
>>>>>>>>
>>>>>>>> We do have the ability to compress the data. I am not sure if
>>>>>>>> there is an easy way to turn the compression on/off.
>>>>>>>>
>>>>>>>> On Wed, Feb 4, 2015 at 2:49 PM, Varun Sharma <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I am wondering if it's possible to gzip the external view znode
>>>>>>>>> - a simple gzip cut down the data size by 25X. Is it possible
>>>>>>>>> to plug in compression/decompression as ZooKeeper nodes are
>>>>>>>>> read?
>>>>>>>>>
>>>>>>>>> Varun
>>>>>>>>>
>>>>>>>>> On Mon, Feb 2, 2015 at 8:53 PM, kishore g <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> There are multiple options we can try here.
>>>>>>>>>> What if we used the CachedDataAccessor for this use case?
>>>>>>>>>> Clients will only read if the node has changed. This
>>>>>>>>>> optimization can benefit all use cases.
>>>>>>>>>>
>>>>>>>>>> What about batching the watch triggers? Not sure which version
>>>>>>>>>> of Helix has this option.
>>>>>>>>>>
>>>>>>>>>> Another option is to use a poll-based routing table instead of
>>>>>>>>>> a watch-based one. This, coupled with the CachedDataAccessor,
>>>>>>>>>> can be very efficient.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Kishore G
>>>>>>>>>>
>>>>>>>>>> On Feb 2, 2015 8:17 PM, "Varun Sharma" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> My total external view across all resources is roughly 3M in
>>>>>>>>>>> size and there are 100 clients downloading it twice for every
>>>>>>>>>>> node restart - that's 600M of data for every restart. So I
>>>>>>>>>>> guess that is causing this issue. We are thinking of doing
>>>>>>>>>>> some tricks to limit the # of clients to 1 from 100. I guess
>>>>>>>>>>> that should help significantly.
>>>>>>>>>>>
>>>>>>>>>>> Varun
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 2, 2015 at 7:37 PM, Zhen Zhang <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey Varun,
>>>>>>>>>>>>
>>>>>>>>>>>> I guess your external view is pretty large, since each
>>>>>>>>>>>> external view callback takes ~3s. The RoutingTableProvider
>>>>>>>>>>>> is callback based, so only when there is a change in the
>>>>>>>>>>>> external view will the RoutingTableProvider read the entire
>>>>>>>>>>>> external view from ZK. During the rolling upgrade, there are
>>>>>>>>>>>> lots of live-instance changes, which may lead to a lot of
>>>>>>>>>>>> changes in the external view. One possible way to mitigate
>>>>>>>>>>>> the issue is to smooth the traffic by having some delays in
>>>>>>>>>>>> between bouncing nodes. We can do a rough estimation of how
>>>>>>>>>>>> many external view changes you might have during the
>>>>>>>>>>>> upgrade, how many listeners you have, and how large the
>>>>>>>>>>>> external views are. Once we have these numbers, we will know
>>>>>>>>>>>> the ZK bandwidth requirement. ZK read bandwidth can be
>>>>>>>>>>>> scaled by adding ZK observers.
>>>>>>>>>>>>
>>>>>>>>>>>> ZK watchers are one-time only, so every time a listener
>>>>>>>>>>>> receives a callback, it re-registers its watcher with ZK.
>>>>>>>>>>>>
>>>>>>>>>>>> It's normally unreliable to depend on delta changes instead
>>>>>>>>>>>> of reading the entire znode. There are some corner cases
>>>>>>>>>>>> where you would lose delta changes if you depend on that.
>>>>>>>>>>>>
>>>>>>>>>>>> For the ZK connection issue, do you have any log on the ZK
>>>>>>>>>>>> server side regarding this connection?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jason
>>>>>>>>>>>>
>>>>>>>>>>>> ------------------------------
>>>>>>>>>>>> From: Varun Sharma [[email protected]]
>>>>>>>>>>>> Sent: Monday, February 02, 2015 4:41 PM
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Subject: Re: Excessive ZooKeeper load
>>>>>>>>>>>>
>>>>>>>>>>>> I believe there is a misbehaving client.
>>>>>>>>>>>> Here is a stack trace - it probably lost connection and is
>>>>>>>>>>>> now stampeding it:
>>>>>>>>>>>>
>>>>>>>>>>>> "ZkClient-EventThread-104-terrapinzk001a:2181,terrapinzk002b:2181,terrapinzk003e:2181"
>>>>>>>>>>>> daemon prio=10 tid=0x00007f534144b800 nid=0x7db5 in Object.wait() [0x00007f52ca9c3000]
>>>>>>>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>>>>>         at java.lang.Object.wait(Native Method)
>>>>>>>>>>>>         at java.lang.Object.wait(Object.java:503)
>>>>>>>>>>>>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>>>>>>>>>>>>         - locked <0x00000004fb0d8c38> (a org.apache.zookeeper.ClientCnxn$Packet)
>>>>>>>>>>>>         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
>>>>>>>>>>>>         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient$11.call(ZkClient.java:823)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.watchForData(ZkClient.java:820)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.subscribeDataChanges(ZkClient.java:136)
>>>>>>>>>>>>         at org.apache.helix.manager.zk.CallbackHandler.subscribeDataChange(CallbackHandler.java:241)
>>>>>>>>>>>>         at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:287)
>>>>>>>>>>>>         at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
>>>>>>>>>>>>         - locked <0x000000056b75a948> (a org.apache.helix.manager.zk.ZKHelixManager)
>>>>>>>>>>>>         at org.apache.helix.manager.zk.CallbackHandler.handleDataChange(CallbackHandler.java:338)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:547)
>>>>>>>>>>>>         at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 2, 2015 at 4:28 PM, Varun Sharma <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I am wondering what is causing the ZK subscription to
>>>>>>>>>>>>> happen every 2-3 seconds - is this a new watch being
>>>>>>>>>>>>> established every 3 seconds?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Varun
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 2, 2015 at 4:23 PM, Varun Sharma <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We are serving a few different resources whose total # of
>>>>>>>>>>>>>> partitions is ~30K. We just did a rolling restart of the
>>>>>>>>>>>>>> cluster, and the clients which use the
>>>>>>>>>>>>>> RoutingTableProvider are stuck in a bad state where they
>>>>>>>>>>>>>> are constantly subscribing to changes in the external view
>>>>>>>>>>>>>> of a cluster. Here is the Helix log on the client after
>>>>>>>>>>>>>> our rolling restart finished - the client is constantly
>>>>>>>>>>>>>> polling ZK. The ZooKeeper node is pushing 300mbps right
>>>>>>>>>>>>>> now and most of the traffic is being pulled by clients. Is
>>>>>>>>>>>>>> this a race condition? Also, is there an easy way to make
>>>>>>>>>>>>>> the clients not poll so aggressively? We restarted one of
>>>>>>>>>>>>>> the clients and we don't see these same messages anymore.
>>>>>>>>>>>>>> Also, is it possible to just propagate external view diffs
>>>>>>>>>>>>>> instead of the whole big znode?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
>>>>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
>>>>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
>>>>>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
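The 25X gzip ratio mentioned upthread is easy to check offline. Below is a minimal sketch using only `java.util.zip`; the payload is synthetic (class name, resource names, and host strings are illustrative, not real ZNRecord content), but it mimics the repetitive partition/host/state structure that makes external views compress so well:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Sketch: gzip a synthetic external-view-like payload and report the ratio.
// Real external views are ZNRecord JSON with similar redundancy (repeated
// partition names, host:port strings, and state names).
public class GzipRatio {

    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // ~30K partitions, each mapped to one of 100 hosts and a state,
        // roughly matching the cluster size described in the thread.
        StringBuilder view = new StringBuilder();
        for (int p = 0; p < 30_000; p++) {
            view.append("\"myResource_").append(p)
                .append("\":{\"host").append(p % 100)
                .append(":9090\":\"ONLINE\"},");
        }
        byte[] raw = view.toString().getBytes(StandardCharsets.UTF_8);
        byte[] zipped = gzip(raw);
        System.out.println("raw=" + raw.length + " bytes, gzip=" + zipped.length
                + " bytes, ratio=" + (raw.length / zipped.length) + "x");
    }
}
```

Note that compression only reduces bytes on the wire; it does not reduce the number of full-view reads, so it complements rather than replaces the caching and delta ideas discussed above.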
