For me, I think this is #1. Everything is in OFFLINE state. So, dropResource should be done with the cluster up and running? (given we have embedded controllers?)
On Wed, May 20, 2015 at 3:42 PM, Hang Qi <[email protected]> wrote:
> Hi Kishore,
>
> Fortunately, I found the zk dump file, I believe it is #2.
>
> The paths contains the dropped resources are in the following format
>
> /$cluster/INSTANCES/$instance/ERRORS/$sessionId/$resourceName/$partition
>
> Thanks
> Hang Qi
>
> On Wed, May 20, 2015 at 1:41 PM, kishore g <[email protected]> wrote:
>
>> Hi,
>>
>> Here is what is happening in the code.
>>
>> listClusterInfo gets the resources under /IDEALSTATE
>> listResourceInfo dumps the information for Resource from
>> /IDEALSTATE/<resourceName> and /EXTERNALVIEW/<resourceName>
>>
>> This is what happens behind the scene when we drop a resource.
>>
>> - Idealstate is deleted first
>> - Controller firsts brings all partitions to their initial state
>>   (OFFLINE) and then fire OFFLINE->DROPPED state. Once the OFFLINE-DROPPED
>>   state transition is successfully processed, its entry is deleted from
>>   ExternalView.
>> - After all partitions handle the transitions correctly, the
>>   ExternalView should become empty.
>> - Once the ExternalView is empty, controller deletes the ExternalView.
>>
>> If listResourceInfo is still showing the resource, it could be because of
>> one of the following reasons:
>>
>> 1. The partitions have not yet reached DROPPED state. This should
>>    ideally finish in few seconds, depending on what is done as part of
>>    OFFLINE->DROPPED transition.
>> 2. One of the partitions went into ERROR state. In this case,
>>    resource external view will continue to read.
>> 3. No controller running to delete the external view after all
>>    partitions went to OFFLINE/DROPPED state.
>>
>> Vinod's cases is #3. Hang, do you remember if your case was #1 or #2?
>>
>> Thanks,
>> Kishore G
>>
>> On Wed, May 20, 2015 at 1:18 PM, Hang Qi <[email protected]> wrote:
>>
>>> No, we have dedicated controllers.
>>>
>>> We first created one resource, and later on we decided to create a new
>>> one, and dropped the previous one. After the drop, listClusterInfo did not
>>> show that resource, but we were able to listResourceInfo by the dropped
>>> one. While in the application, we were still receiving callback/transition
>>> for dropped resource.
>>>
>>> Thanks
>>> Hang Qi
>>>
>>> On Wed, May 20, 2015 at 6:44 AM, Vinoth Chandar <[email protected]> wrote:
>>>
>>>> Kishore and I chatted offline. The problem seems to be that there is
>>>> still an external view for the resource, which Kishore tells me exists as
>>>> long as a controller comes back up. (other info: no live instances around)
>>>>
>>>> I am running my app with a distributed/embedded controller, which means
>>>> when I shut down my instances the controller(s) died as well. I will try to
>>>> reproduce this locally and report back.
>>>>
>>>> @Hang, does this have any similarity to your usage?
>>>>
>>>> On Tue, May 19, 2015 at 1:43 PM, Vinoth Chandar <[email protected]> wrote:
>>>>
>>>>> I did a ZK dump before I cleared everything out.. Will investigate and
>>>>> send more info out..
>>>>>
>>>>> @Kishore, dropResource did not error out.. My memory is vague as it
>>>>> was middle of the night :), but I think I shut everything down before I
>>>>> issued the CLI command.
>>>>>
>>>>> Thanks
>>>>> Vinoth
>>>>>
>>>>> On Tue, May 19, 2015 at 12:50 PM, Hang Qi <[email protected]> wrote:
>>>>>
>>>>>> Hi Vinoth,
>>>>>>
>>>>>> We met this issue before. What we did is using zk-dumper.sh to dump
>>>>>> everything inside ZK, and see where does this resource exist, and remove
>>>>>> those paths in ZK, and that works.
>>>>>>
>>>>>> Unfortunately, we did not keep the state, so It would be great if you
>>>>>> can share the paths which contains the resource you dropped, that would
>>>>>> be helpful for debugging.
>>>>>>
>>>>>> Thanks
>>>>>> Hang Qi
>>>>>>
>>>>>> On Tue, May 19, 2015 at 11:10 AM, Vinoth Chandar <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I dropped the resource already, but still seeing callbacks firing..
>>>>>>> I cannot list the resource using listResources.
>>>>>>>
>>>>>>> $:~/helix-core-0.6.5$ bin/helix-admin.sh --zkSvr zkmaster:2181 --dropResource streamio countLog
>>>>>>> $:~/helix-core-0.6.5$ bin/helix-admin.sh --zkSvr zkmaster:2181 --listResourceInfo streamio countLog | tail -10
>>>>>>>   "simpleFields" : {
>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>     "NUM_PARTITIONS" : "4096",
>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>     "REPLICAS" : "1",
>>>>>>>     "STATE_MODEL_DEF_REF" : "OnlineOffline",
>>>>>>>     "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
>>>>>>>   }
>>>>>>> }
>>>>>>> $ bin/helix-admin.sh --zkSvr zkmaster:2181 --listResources streamio | grep countLog | wc -l
>>>>>>> 0
>>>>>>>
>>>>>>> Any idea how to troubleshoot this?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Vinoth
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Qi hang
>>>>>>
>>>
>>> --
>>> Qi hang
>>>
>
> --
> Qi hang
>
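For anyone who hits the same thing: the cleanup Hang describes (dump everything in ZK, then see where the dropped resource still exists) can be roughed out as below. This is only a sketch under stated assumptions. `find_leftover_paths` and `has_error_entries` are hypothetical helpers, not part of Helix; the input is assumed to be a plain list of absolute znode paths pulled out of a dump (the dump format itself is not described in this thread); the section names are the ones written in the thread and the actual znode names may differ in your Helix version; the instance name and session id in the sample data are invented.

```python
from collections import defaultdict


def find_leftover_paths(znode_paths, cluster, resource):
    """Group znode paths that still mention `resource` by top-level section.

    Hypothetical helper: `znode_paths` is assumed to be a plain list of
    absolute znode paths extracted from a ZooKeeper dump.
    """
    prefix = "/" + cluster + "/"
    leftovers = defaultdict(list)
    for path in znode_paths:
        if not path.startswith(prefix):
            continue  # belongs to another cluster
        parts = path[len(prefix):].split("/")
        # Compare whole path components so "countLog" does not match "countLog2".
        if resource in parts[1:]:
            leftovers[parts[0]].append(path)
    return dict(leftovers)


def has_error_entries(leftovers):
    """True if any leftover path sits under an instance's ERRORS node."""
    return any("/ERRORS/" in p for p in leftovers.get("INSTANCES", []))


# Sample data: the instance name and session id are made up for illustration.
sample_paths = [
    "/streamio/EXTERNALVIEW/countLog",
    "/streamio/INSTANCES/host1_12000/ERRORS/14d5e0000/countLog/countLog_7",
    "/streamio/IDEALSTATE/otherResource",
]

leftovers = find_leftover_paths(sample_paths, "streamio", "countLog")
for section, paths in sorted(leftovers.items()):
    print(section, len(paths))
print("error entries present:", has_error_entries(leftovers))
```

A leftover entry under EXTERNALVIEW with nothing under the ideal state section matches what listResourceInfo showed after the drop, and entries under an instance's ERRORS node would point toward Kishore's case #2. The sketch only reports paths; it does not delete anything.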
