Yes, your interpretation is correct.

I thought about using your meta-resource idea, but it suffers from a problem similar to the one I am currently facing. The partitions of the meta-resource will be "placed" on some participant by Helix. Now imagine that participant fails. The partition of the meta-resource will go through its DROPPED state in order to get placed elsewhere. So that DROPPED state still needs to be distinguished from the partition being dropped because the resource it represents was dropped. Either I misunderstood your solution (1), or I need to somehow have a participant that never fails.

I can see how to use the task framework to build a reliable workflow with a sequence of steps, where one of the steps does the final cleanup. Is this already available in the codebase (if I were willing to make my own build)?

Thanks,
Vinayak



On 5/19/14, 9:11 AM, Kanak Biscuitwala wrote:
So if I understand correctly, you basically want to know whether state should be
kept (in case the partition might come back) or not. Right now, Helix treats a
dropped resource as if all its partitions have been dropped. Separately, Helix
treats a moved partition as a dropped partition on one participant and an added
partition on another. So the two cases are currently very much linked.

This requires some more thought, but here's what comes to mind:

1. Have a meta-resource whose partitions are simply the names of the other
resources in the cluster. When you drop a resource, the operation would be to
simultaneously drop the resource and drop its partition from the meta-resource.
Then you get a separate transition for a dropped resource. I haven't thought
through the race conditions here, and there could be some impact depending on
your app.

2. In the upcoming task framework, create a task that manages the drop-resource
scenario from beginning to end: call HelixAdmin#dropResource, wait for the
external view to converge, then issue cleanup requests to the participants.
Participants would implement a cleanup callback. This is something we're trying
to get out the door this quarter. (A sketch follows this list.)

3. Something that works, but that you would like to avoid: ask HelixAdmin
whether the resource still exists. (Also sketched below.)
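
To make (2) concrete, here is a minimal sketch of what the final cleanup step of such a workflow might look like. The Task/TaskResult shapes are assumptions about the framework as described above, and DropResourceCleanupTask / cleanUpResourceState are hypothetical names, not anything that ships today:

import org.apache.helix.task.Task;
import org.apache.helix.task.TaskResult;

// Hypothetical final step of a drop-resource workflow: runs after
// HelixAdmin#dropResource has been called and the external view has converged.
public class DropResourceCleanupTask implements Task {
  private final String resourceName;

  public DropResourceCleanupTask(String resourceName) {
    this.resourceName = resourceName;
  }

  @Override
  public TaskResult run() {
    // App-specific: remove all shared state associated with the dropped resource.
    cleanUpResourceState(resourceName);
    return new TaskResult(TaskResult.Status.COMPLETED, null);
  }

  @Override
  public void cancel() {
    // Cleanup here is assumed idempotent, so a canceled run can simply be retried.
  }

  private void cleanUpResourceState(String resource) {
    // e.g. delete files, drop database rows keyed by the resource name, etc.
  }
}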
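
And a sketch of (3), doing the check inside the DROPPED transition callback itself. The callback plumbing and HelixAdmin calls are the standard ones; only cleanUpResourceState is hypothetical:

import java.util.List;
import org.apache.helix.HelixAdmin;
import org.apache.helix.HelixManager;
import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.Transition;

public class MyStateModel extends StateModel {
  @Transition(from = "OFFLINE", to = "DROPPED")
  public void onBecomeDroppedFromOffline(Message message, NotificationContext context) {
    HelixManager manager = context.getManager();
    HelixAdmin admin = manager.getClusterManagmentTool(); // sic: actual method name
    String resource = message.getResourceName();

    List<String> live = admin.getResourcesInCluster(manager.getClusterName());
    if (!live.contains(resource)) {
      // The resource itself was dropped: clean up all associated state.
      cleanUpResourceState(resource);
    }
    // Otherwise the partition is just moving to another participant,
    // so shared state must be preserved.
  }

  private void cleanUpResourceState(String resource) {
    // App-specific cleanup.
  }
}

Note that this check is racy: a resource drop that interleaves with a partition move could be misclassified, which is part of why it's listed as the option to avoid.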

Perhaps others can chime in with ideas.

----------------------------------------
Date: Sun, 18 May 2014 12:08:15 -0700
From: [email protected]
To: [email protected]
Subject: Need two kinds of DROPPED states?

Hi Guys,


It looks like when a partition that is on a participant (P1) is moved to
another participant (P2), P1 is sent a transition request from OFFLINE
-> DROPPED.

In another scenario, when a resource is dropped using HelixAdmin, the
partitions undergo a similar transition to DROPPED.
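
(For reference, both scenarios arrive at the participant through the same state model callback, and nothing in the Message distinguishes them; a minimal sketch, with the class name being illustrative:)

import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.StateModelInfo;
import org.apache.helix.participant.statemachine.Transition;

@StateModelInfo(initialState = "OFFLINE", states = { "ONLINE", "OFFLINE" })
public class MyStateModel extends StateModel {
  @Transition(from = "OFFLINE", to = "DROPPED")
  public void onBecomeDroppedFromOffline(Message message, NotificationContext context) {
    // Invoked both when the partition is being moved to another participant
    // and when the whole resource is dropped via HelixAdmin; nothing in the
    // Message says which case this is.
    String resource = message.getResourceName();
    String partition = message.getPartitionName();
    // ... release the partition's local state here ...
  }
}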

As an application, one might need to do different things in those two
cases. For example, in the first case the partition is being dropped to become
live somewhere else, so any shared state for the resource should not be
lost. In the second scenario, on the other hand, the application might
want to clean up all state associated with the resource.

Is there a way for the application to distinguish between the first kind
of DROPPED and the second? I am looking to have the state machine
itself handle both scenarios, without the application having to
trigger some special activity to perform the cleanup in the second one.

Thanks,
Vinayak
                                        

