Hi Varun,

The change is already made. We will start working on the release. Any volunteer to make the release?

Thanks
Kishore G

On Mar 9, 2015 2:17 PM, "Zhen Zhang" <[email protected]> wrote:

> Hi Varun,
>
> Kishore already checked in a fix for that:
>
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=commit;h=99baacf7f19a09d972754902c50f1618fc8b804c
>
> It's in 0.6.x branch.
>
> Thanks,
> Jason
>
> ------------------------------
> *From:* Varun Sharma [[email protected]]
> *Sent:* Monday, March 09, 2015 2:11 PM
> *To:* [email protected]
> *Subject:* Re: RoutingTableProvider dropping callbacks
>
> Just pinging this thread to check on the hot fix to not remove
> externalview znode and release for the same. Is there a JIRA tracking that ?
>
> On Sun, Mar 8, 2015 at 11:46 PM, Varun Sharma <[email protected]> wrote:
>
>> If I recall correctly from a previous thread, it seems like we don't even
>> support changing of bucket sizes for the same resource - so it seems we
>> should probably not be deleting the znode in this case ?
>>
>> On Sun, Mar 8, 2015 at 11:43 PM, Zhen Zhang <[email protected]> wrote:
>>
>>> @Kishore, I think the remove is used in case bucket size is changed,
>>> so we can clean all the buckets for old size and set it using new size.
>>>
>>> The issue seems like a race condition in setting bucketized external
>>> view and add watches on child paths. Will investigate more.
>>>
>>> Thanks,
>>> Jason
>>> ------------------------------
>>> *From:* Varun Sharma [[email protected]]
>>> *Sent:* Saturday, March 07, 2015 11:07 PM
>>> *To:* [email protected]
>>> *Subject:* Re: RoutingTableProvider dropping callbacks
>>>
>>> Please find the attached log file with the above trace.
>>>
>>> On Sat, Mar 7, 2015 at 8:12 PM, kishore g <[email protected]> wrote:
>>>
>>>> Another thing is that the RoutingTable is logging this line "Resetting
>>>> the routing table.". Looks like this happens when we fail to set the
>>>> watcher.
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>> On Sat, Mar 7, 2015 at 8:05 PM, kishore g <[email protected]> wrote:
>>>>
>>>>> Your explanation makes sense.
>>>>>
>>>>> https://github.com/apache/helix/blob/helix-0.6.4/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java.
>>>>> For bucketized resource we see that path is deleted and set again. Jason,
>>>>> any idea why we are removing the path?
>>>>>
>>>>> case EXTERNALVIEW:
>>>>>   if (value.getBucketSize() == 0) {
>>>>>     records.add(value.getRecord());
>>>>>   } else {
>>>>>     _baseDataAccessor.remove(path, options);
>>>>>
>>>>> On Sat, Mar 7, 2015 at 4:03 PM, Varun Sharma <[email protected]> wrote:
>>>>>
>>>>>> How does the writing of externalview work for bucketized resources
>>>>>> - is it possible that the top level znode for the resource is first
>>>>>> deleted and then rewritten with the latest external view ?
>>>>>>
>>>>>> On Sat, Mar 7, 2015 at 3:56 PM, Varun Sharma <[email protected]> wrote:
>>>>>>
>>>>>>> Here is the stack trace - there is a zookeeper race and the detailed
>>>>>>> stack trace appears for bucketized resources. I saw that the ideal state
>>>>>>> for the resource was created on 26th Feb and was modified on 7th March.
>>>>>>> However, the external view for the resource is showing up as created on
>>>>>>> 7th march as well as modified on 7th march. The external view is created at
>>>>>>> 10:36:04 on 7th march which is 20 seconds after this log message stack
>>>>>>> trace is spit out. After this the routing table provider no longer
>>>>>>> receives any more zk callbacks.
>>>>>>>
>>>>>>> 2015-03-07 10:35:43,735 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@3c8589f0, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$visual_seo_joins_staging$1422384697040
>>>>>>>
>>>>>>> 2015-03-07 10:35:43,736 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@63230a9a, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$recommendation_p2p_exp_candset_1$1425671237739
>>>>>>>
>>>>>>> 2015-03-07 10:35:43,736 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@118d374f, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250
>>>>>>>
>>>>>>> 2015-03-07 10:35:43,736 [ZkClient-EventThread-17-terrapinzk001a:2181] (CallbackHandler.java:304) WARN fail to subscribe child/data change. path: /main_a/EXTERNALVIEW, listener: com.pinterest.terrapin.controller.TerrapinRoutingTableProvider@2c6691da
>>>>>>>
>>>>>>> *org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250*
>>>>>>>     at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
>>>>>>>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
>>>>>>>     at org.apache.helix.manager.zk.ZkClient.getChildren(ZkClient.java:210)
>>>>>>>     at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
>>>>>>>     at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:279)
>>>>>>>     at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
>>>>>>>     at org.apache.helix.manager.zk.CallbackHandler.handleChildChange(CallbackHandler.java:391)
>>>>>>>     at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:570)
>>>>>>>     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>>>>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250
>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>>>>>     at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1249)
>>>>>>>
>>>>>>> 2015-03-07 10:35:43,848 [ZkClient-EventThread-17-terrapinzk001a:2181] (RoutingTableProvider.java:99) INFO *Resetting* the routing table.
>>>>>>>
>>>>>>> On Thu, Mar 5, 2015 at 11:33 AM, Varun Sharma <[email protected]> wrote:
>>>>>>>
>>>>>>>> I suspect the callbacks are not coming in, for a long time now.
>>>>>>>>
>>>>>>>> On Thu, Mar 5, 2015 at 11:30 AM, Varun Sharma <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I grepped this and found nothing:
>>>>>>>>>
>>>>>>>>> sudo grep START:INVOKE.*EXTERNALVIEW /var/log/terrapin/controller.log*
>>>>>>>>>
>>>>>>>>> I found a bunch of START:INVOKE for the IDEALSTATES znode though.
>>>>>>>>>
>>>>>>>>> On Thu, Mar 5, 2015 at 11:15 AM, Zhen Zhang <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Yes. You should see a pair of "START:INVOKE..." and
>>>>>>>>>> "END:INVOKE:..." for each callback in your log.
>>>>>>>>>> ------------------------------
>>>>>>>>>> *From:* Varun Sharma [[email protected]]
>>>>>>>>>> *Sent:* Thursday, March 05, 2015 11:11 AM
>>>>>>>>>> *To:* [email protected]
>>>>>>>>>> *Subject:* Re: RoutingTableProvider dropping callbacks
>>>>>>>>>>
>>>>>>>>>> Ohk - is there a way to confirm that the callbacks are being
>>>>>>>>>> processed (from the logs etc.) ?
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 5, 2015 at 10:50 AM, Zhen Zhang <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Varun,
>>>>>>>>>>>
>>>>>>>>>>> This should not be a problem. When we register a callback, we
>>>>>>>>>>> are expecting a call back type of INIT first, followed by a sequence of
>>>>>>>>>>> CALLBACK types, and when you unregister the callback, you will receive a
>>>>>>>>>>> FINALIZED type. Since unregister is an async operation, when you receive a
>>>>>>>>>>> FINALIZED type, you might still see a couple of CALLBACK type callbacks,
>>>>>>>>>>> which are simply ignored. The log is basically telling you that.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jason
>>>>>>>>>>> ------------------------------
>>>>>>>>>>> *From:* Varun Sharma [[email protected]]
>>>>>>>>>>> *Sent:* Thursday, March 05, 2015 10:44 AM
>>>>>>>>>>> *To:* [email protected]
>>>>>>>>>>> *Subject:* RoutingTableProvider dropping callbacks
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> It seems that the RoutingTableProvider is dropping callbacks
>>>>>>>>>>> in our case. Here is a log:
>>>>>>>>>>>
>>>>>>>>>>> [ZkClient-EventThread-17-terrapinzk001a:2181] (CallbackHandler.java:130) WARN Skip processing callbacks for listener: com.pinterest.terrapin.controller.TerrapinRoutingTableProvider@7e7f8062, path: /main_a/EXTERNALVIEW, expected types: [INIT] but was CALLBACK
>>>>>>>>>>>
>>>>>>>>>>> We have a custom RoutingTableProvider to catch callbacks and
>>>>>>>>>>> do some processing - this is causing a lot of issues for us. What could be
>>>>>>>>>>> causing this ?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Varun
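
For readers of this thread, here is a minimal Java sketch of the race discussed above: the bucketized external view znode is removed and then set again while a listener is in the middle of subscribing to child paths. It is only an illustration under stated assumptions. It uses a toy in-memory map, not the ZooKeeper or Helix API; the class RemoveThenSetRace, its methods, and the path /cluster/EXTERNALVIEW/resourceA are hypothetical names, and the interleaving is forced sequentially rather than reproduced with real threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory stand-in for a znode tree; NOT the ZooKeeper or Helix API. */
public class RemoveThenSetRace {
    private final Map<String, String> nodes = new TreeMap<>();

    void set(String path, String data) { nodes.put(path, data); }

    void remove(String path) { nodes.remove(path); }

    /** Lists direct children of parent, as a listener would before subscribing. */
    List<String> children(String parent) {
        List<String> out = new ArrayList<>();
        String prefix = parent + "/";
        for (String p : nodes.keySet()) {
            // Keep only paths exactly one level below the parent.
            if (p.startsWith(prefix) && p.indexOf('/', prefix.length()) < 0) {
                out.add(p);
            }
        }
        return out;
    }

    /** Models setting a watch: it fails if the node vanished after being listed. */
    boolean watch(String path) { return nodes.containsKey(path); }

    /** Returns true if every child listed before the write can still be watched. */
    public static boolean subscribeSurvives(boolean removeBeforeSet) {
        RemoveThenSetRace store = new RemoveThenSetRace();
        String child = "/cluster/EXTERNALVIEW/resourceA"; // hypothetical path
        store.set(child, "v1");

        // Listener step 1: snapshot the children of the parent.
        List<String> listed = store.children("/cluster/EXTERNALVIEW");

        if (removeBeforeSet) {
            store.remove(child);    // writer step 1: node is briefly absent
        } else {
            store.set(child, "v2"); // writer overwrites in place
        }

        // Listener step 2: subscribe to each child from the stale snapshot.
        boolean ok = true;
        for (String p : listed) {
            ok &= store.watch(p);
        }

        if (removeBeforeSet) {
            store.set(child, "v2"); // writer step 2: too late for the listener
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("remove-then-set: " + subscribeSurvives(true));
        System.out.println("overwrite:       " + subscribeSurvives(false));
    }
}
```

Running main prints false for the remove-then-set path and true for the in-place overwrite, which matches the intent of the hot fix mentioned at the top of the thread (not removing the externalview znode before writing it).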
