So does the RoutingTableProvider receive updates for all the external views - does the List<ExternalView> it is passed always contain all the external views?
Varun

On Fri, Feb 6, 2015 at 10:37 AM, Zhen Zhang <[email protected]> wrote:

It doesn't distinguish. The RoutingTableProvider is always trying to keep its
content the same as what is on ZK.

------------------------------
*From:* Varun Sharma [[email protected]]
*Sent:* Friday, February 06, 2015 10:27 AM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

How does the original RoutingTableProvider distinguish deletions from
adds/updates?

On Fri, Feb 6, 2015 at 10:23 AM, Zhen Zhang <[email protected]> wrote:

Hi Varun, on the batching of updates: the Helix controller is not updating the
external view on every update. Normally the controller will aggregate updates
over a period of time. Say 100 partitions are updated at roughly the same
time; then the controller will update the external view only once. For the
routing table, what do you mean by ignoring delete events? The RoutingTable
will always be updated by ZK callbacks and synced up with the corresponding
external views on ZK.

Thanks,
Jason

------------------------------
*From:* Varun Sharma [[email protected]]
*Sent:* Thursday, February 05, 2015 9:17 PM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

One more question for the routing table provider - is it possible to
distinguish between add/modify and delete? I essentially want to ignore the
delete events - can that be determined by looking at the list of
ExternalView(s) being passed?

Thanks
Varun

On Thu, Feb 5, 2015 at 8:48 PM, Varun Sharma <[email protected]> wrote:

I see - one more thing - there was talk of a batching mode where Helix can
batch updates. Can it batch multiple updates to the external view and write
once into ZooKeeper instead of writing for every update? For example, consider
the case when lots of partitions are being onlined - could we batch updates to
the external view into batches of 100? Is that supported in Helix 0.6.4?

Thanks!
Varun

On Thu, Feb 5, 2015 at 5:23 PM, Zhen Zhang <[email protected]> wrote:

Yes, the listener will be notified on add/delete/modify. You can distinguish
them if you keep a local cache and compare against it to get the delta.
Currently the API doesn't expose this.

------------------------------
*From:* Varun Sharma [[email protected]]
*Sent:* Thursday, February 05, 2015 1:53 PM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

I assume that it also gets called when external views get modified? How can I
distinguish whether there was an add, a modify or a delete?

Thanks
Varun

On Thu, Feb 5, 2015 at 9:27 AM, Zhen Zhang <[email protected]> wrote:

Yes. It will get invoked when external views are added or deleted.

------------------------------
*From:* Varun Sharma [[email protected]]
*Sent:* Thursday, February 05, 2015 1:27 AM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

I had another question - does the RoutingTableProvider onExternalViewChange
call get invoked when a resource gets deleted (and hence its external view
znode)?

On Wed, Feb 4, 2015 at 10:54 PM, Zhen Zhang <[email protected]> wrote:

Yes. I think we did this in the incubating stage or even before. It's probably
in a separate branch for some performance evaluation.
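Assuming the callback does carry the full list of external views on ZK (which is what Zhen's description of the RoutingTableProvider staying in sync implies), deletions can only be detected by diffing against a locally cached set of resource names, as Zhen suggests above. A minimal sketch of that approach, assuming the Helix 0.6.x ExternalViewChangeListener callback signature; the class and field names are illustrative:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.helix.ExternalViewChangeListener;
import org.apache.helix.NotificationContext;
import org.apache.helix.model.ExternalView;

// Illustrative listener: keeps a local cache of resource names and diffs it
// against each callback to tell adds/updates apart from deletions.
public class DiffingExternalViewListener implements ExternalViewChangeListener {
  private final Set<String> knownResources = new HashSet<String>();

  @Override
  public synchronized void onExternalViewChange(List<ExternalView> externalViewList,
                                                NotificationContext changeContext) {
    Set<String> current = new HashSet<String>();
    for (ExternalView ev : externalViewList) {
      current.add(ev.getResourceName());
    }

    // Resources present before but missing now were deleted; ignore them if desired.
    Set<String> deleted = new HashSet<String>(knownResources);
    deleted.removeAll(current);

    // Resources present now but not before were added; everything else is an update.
    Set<String> added = new HashSet<String>(current);
    added.removeAll(knownResources);

    knownResources.clear();
    knownResources.addAll(current);

    // Apply only the adds/updates to the local routing state here.
  }
}
```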
------------------------------
*From:* kishore g [[email protected]]
*Sent:* Wednesday, February 04, 2015 9:54 PM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

Jason, I remember having the ability to compress/decompress, and before we
added the support to bucketize, compression was used to support a large number
of partitions. However, I don't see the code anywhere. Did we do this on a
separate branch?

thanks,
Kishore G

On Wed, Feb 4, 2015 at 3:30 PM, Zhen Zhang <[email protected]> wrote:

Hi Varun, we can certainly add compression and have a config for turning it
on/off. We have implemented compression in our own zkclient before. The issues
with compression might be:
1) CPU consumption on the controller will increase.
2) It is harder to debug.

Thanks,
Jason

------------------------------
*From:* kishore g [[email protected]]
*Sent:* Wednesday, February 04, 2015 3:08 PM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

We do have the ability to compress the data. I am not sure if there is an easy
way to turn the compression on/off.

On Wed, Feb 4, 2015 at 2:49 PM, Varun Sharma <[email protected]> wrote:

I am wondering if it's possible to gzip the external view znode - a simple
gzip cut down the data size by 25X. Is it possible to plug in
compression/decompression as ZooKeeper nodes are read?

Varun
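Plugging compression in as znodes are written and read, as Varun asks above, is typically done at the zkclient serializer layer. A minimal sketch, assuming the I0Itec ZkSerializer interface and wrapping Helix's ZNRecordSerializer; this is only an illustration of the idea, not the branch Kishore mentions, and per the discussion there is no simple on/off switch for it in stock 0.6.4:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

import org.I0Itec.zkclient.exception.ZkMarshallingError;
import org.I0Itec.zkclient.serialize.ZkSerializer;
import org.apache.helix.manager.zk.ZNRecordSerializer;

// Illustrative serializer: gzips the bytes produced by Helix's ZNRecordSerializer
// before they are written to ZK, and gunzips them on read.
public class GzipZNRecordSerializer implements ZkSerializer {
  private final ZkSerializer inner = new ZNRecordSerializer();

  @Override
  public byte[] serialize(Object data) throws ZkMarshallingError {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      GZIPOutputStream gzip = new GZIPOutputStream(bos);
      gzip.write(inner.serialize(data));
      gzip.close();
      return bos.toByteArray();
    } catch (IOException e) {
      throw new ZkMarshallingError(e);
    }
  }

  @Override
  public Object deserialize(byte[] bytes) throws ZkMarshallingError {
    try {
      GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes));
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = gzip.read(buf)) != -1) {
        bos.write(buf, 0, n);
      }
      return inner.deserialize(bos.toByteArray());
    } catch (IOException e) {
      throw new ZkMarshallingError(e);
    }
  }
}
```

The trade-offs are the ones Zhen lists: the controller pays the CPU cost on every write, and the znode contents are no longer human-readable when debugging.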
On Mon, Feb 2, 2015 at 8:53 PM, kishore g <[email protected]> wrote:

There are multiple options we can try here.

What if we used the cacheddataaccessor for this use case? Clients will only
read if the node has changed. This optimization can benefit all use cases.

What about batching the watch triggers? Not sure which version of Helix has
this option.

Another option is to use a poll-based routing table instead of a watch-based
one. This, coupled with the cacheddataaccessor, can be very efficient.

Thanks,
Kishore G

On Feb 2, 2015 8:17 PM, "Varun Sharma" <[email protected]> wrote:

My total external view across all resources is roughly 3M in size and there
are 100 clients downloading it twice for every node restart - that's 600M of
data for every restart. So I guess that is causing this issue. We are thinking
of doing some tricks to limit the # of clients to 1 from 100. I guess that
should help significantly.

Varun

On Mon, Feb 2, 2015 at 7:37 PM, Zhen Zhang <[email protected]> wrote:

Hey Varun,

I guess your external view is pretty large, since each external view callback
takes ~3s. The RoutingTableProvider is callback based, so only when there is a
change in the external view will the RoutingTableProvider read the entire
external view from ZK. During the rolling upgrade there are lots of live
instance changes, which may lead to a lot of changes in the external view. One
possible way to mitigate the issue is to smooth the traffic by having some
delays in between bouncing nodes. We can do a rough estimation of how many
external view changes you might have during the upgrade, how many listeners
you have, and how large the external views are. Once we have these numbers, we
should know the ZK bandwidth requirement. ZK read bandwidth can be scaled by
adding ZK observers.

A ZK watcher is one-time only, so every time a listener receives a callback,
it will re-register its watcher with ZK.

It's normally unreliable to depend on delta changes instead of reading the
entire znode. There might be some corner cases where you would lose delta
changes if you depend on that.

For the ZK connection issue, do you have any log on the ZK server side
regarding this connection?

Thanks,
Jason

------------------------------
*From:* Varun Sharma [[email protected]]
*Sent:* Monday, February 02, 2015 4:41 PM
*To:* [email protected]
*Subject:* Re: Excessive ZooKeeper load

I believe there is a misbehaving client. Here is a stack trace - it probably
lost its connection and is now stampeding ZooKeeper:

"ZkClient-EventThread-104-terrapinzk001a:2181,terrapinzk002b:2181,terrapinzk003e:2181" daemon prio=10 tid=0x00007f534144b800 nid=0x7db5 in Object.wait() [0x00007f52ca9c3000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
        - locked <0x00000004fb0d8c38> (a org.apache.zookeeper.ClientCnxn$Packet)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
        at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95)
        at org.I0Itec.zkclient.ZkClient$11.call(ZkClient.java:823)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
        at org.I0Itec.zkclient.ZkClient.watchForData(ZkClient.java:820)
        at org.I0Itec.zkclient.ZkClient.subscribeDataChanges(ZkClient.java:136)
        at org.apache.helix.manager.zk.CallbackHandler.subscribeDataChange(CallbackHandler.java:241)
        at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:287)
        at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
        - locked <0x000000056b75a948> (a org.apache.helix.manager.zk.ZKHelixManager)
        at org.apache.helix.manager.zk.CallbackHandler.handleDataChange(CallbackHandler.java:338)
        at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:547)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
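The stack trace above shows the Helix CallbackHandler re-subscribing its data watches, which matches Zhen's point that ZK watches are one-shot. A minimal sketch of that one-shot watch pattern against the raw ZooKeeper client; the class, path, and handle method are illustrative, not Helix's actual CallbackHandler code:

```java
import java.util.List;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Illustrative: ZK watches fire once, so every notification must both re-read
// the data and re-register the watch, which is the pattern visible in the
// CallbackHandler stack trace above.
public class ExternalViewWatcher implements Watcher {
  private final ZooKeeper zk;
  private final String path; // e.g. "/main_a/EXTERNALVIEW"

  public ExternalViewWatcher(ZooKeeper zk, String path) {
    this.zk = zk;
    this.path = path;
  }

  public void start() throws KeeperException, InterruptedException {
    // Registers the watch and reads the current children in one call.
    handle(zk.getChildren(path, this));
  }

  @Override
  public void process(WatchedEvent event) {
    try {
      // The watch that fired is now gone; re-register it by reading again.
      handle(zk.getChildren(path, this));
    } catch (Exception e) {
      // In real code, handle session expiry and reconnects here.
    }
  }

  private void handle(List<String> resources) {
    // Update local routing state from the freshly read children.
  }
}
```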
On Mon, Feb 2, 2015 at 4:28 PM, Varun Sharma <[email protected]> wrote:

I am wondering what is causing the ZK subscription to happen every 2-3
seconds - is a new watch being established every 3 seconds?

Thanks
Varun

On Mon, Feb 2, 2015 at 4:23 PM, Varun Sharma <[email protected]> wrote:

Hi,

We are serving a few different resources whose total # of partitions is ~30K.
We just did a rolling restart of the cluster, and the clients which use the
RoutingTableProvider are stuck in a bad state where they are constantly
subscribing to changes in the external view of the cluster. Here is the Helix
log on the client after our rolling restart was finished - the client is
constantly polling ZK. The ZooKeeper node is pushing 300mbps right now and
most of the traffic is being pulled by clients. Is this a race condition - and
is there an easy way to make the clients not poll so aggressively? We
restarted one of the clients and we don't see these same messages anymore.
Also, is it possible to just propagate external view diffs instead of the
whole big znode?

15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
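For context, the log lines above come from the standard spectator setup: a RoutingTableProvider registered as an external-view listener re-reads the views from ZK on every change (the START:INVOKE/END:INVOKE pairs), while lookups are answered from its in-memory copy. A minimal sketch of that setup, assuming the Helix 0.6.x spectator API; the cluster, instance, resource, state, and ZK addresses are placeholders:

```java
import java.util.List;

import org.apache.helix.HelixManager;
import org.apache.helix.HelixManagerFactory;
import org.apache.helix.InstanceType;
import org.apache.helix.model.InstanceConfig;
import org.apache.helix.spectator.RoutingTableProvider;

public class SpectatorExample {
  public static void main(String[] args) throws Exception {
    // Placeholder cluster, instance, and ZK connect string for illustration.
    HelixManager manager = HelixManagerFactory.getZKHelixManager(
        "main_a", "myHost_12345", InstanceType.SPECTATOR, "zkHost1:2181,zkHost2:2181");
    manager.connect();

    // Each external-view change triggers one callback that re-reads the views
    // from ZK; this is what the CallbackHandler log lines above record.
    RoutingTableProvider routingTable = new RoutingTableProvider();
    manager.addExternalViewChangeListener(routingTable);

    // Lookups are served from the provider's in-memory snapshot, not from ZK.
    List<InstanceConfig> instances =
        routingTable.getInstances("myResource", "myResource_0", "ONLINE");
    System.out.println("Hosts serving partition: " + instances);

    manager.disconnect();
  }
}
```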
