Re: Long running jobs and node drain

santosh gujar Wed, 13 May 2020 11:25:19 -0700

Thanks a lot, Lei. I assume that by blocking you mean blocking inside the
method that is called for the state transition,

e.g. the following pseudocode:

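import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.Transition;
class MyModel extends StateModel {
  @Transition(from = "UP", to = "DRAIN")
  public void onBecomeDrainFromUp(Message message, NotificationContext context) {
    // Do not return while the long-running jobs are still running.
  }
}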
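
A minimal, Helix-free sketch of what "blocking inside the transition method" could look like. The `DrainGate` class and its method names are hypothetical and are not part of the Helix API; the idea is only that a transition handler could call `awaitDrained(...)` and return once the count of running jobs reaches zero.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not Helix API): counts running jobs and lets a caller block until none remain. */
final class DrainGate {
    private int running = 0;
    private boolean draining = false;

    /** Admits a new job unless a drain has already started on this node. */
    synchronized boolean tryStartJob() {
        if (draining) {
            return false;
        }
        running++;
        return true;
    }

    /** Marks one job as finished and wakes up a waiting drain when the last job ends. */
    synchronized void jobFinished() {
        if (running == 0) {
            throw new IllegalStateException("no running job to finish");
        }
        running--;
        if (running == 0) {
            notifyAll();
        }
    }

    /** Starts draining and blocks until all jobs finish or the timeout elapses; returns true if drained. */
    synchronized boolean awaitDrained(long timeout, TimeUnit unit) throws InterruptedException {
        draining = true;
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (running > 0) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return true;
    }
}
```

The timeout is a design choice for the sketch: a handler that blocks for hours would probably want to wake up periodically to log progress or to notice a shutdown request.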

On Wed, May 13, 2020 at 10:40 PM Lei Xia <[email protected]> wrote:

> Hi, Santosh
>
>   Thanks for explaining your case in detail. In this case, I would
> recommend using the "OFFLINE->UP->DRAIN->OFFLINE" model. You can set a
> constraint on the model to limit the number of replicas in the UP state to
> 1, i.e., Helix will make sure there is only one replica in UP at any time.
> When you are ready to drain an instance, disable the instance first; Helix
> will then transition all partitions (jobs) on that instance to DRAIN and
> then to OFFLINE, and you can block in the DRAIN->OFFLINE transition until
> all jobs are completed. Meanwhile, once the old partition is in the DRAIN
> state, Helix should bring up a new partition to UP (OFFLINE->UP) on a new
> node.
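
The state model described above can be sketched in plain Java, without any Helix dependency, as a transition table plus a check of the "at most one replica in UP" constraint. The names `ReplicaState` and `DrainStateModel` are assumptions made for illustration; in a real cluster the model and its constraint would be registered with Helix rather than checked by hand.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** States of the OFFLINE->UP->DRAIN->OFFLINE model discussed in the thread. */
enum ReplicaState { OFFLINE, UP, DRAIN }

/** Hypothetical stand-in for the state model definition; not Helix API. */
final class DrainStateModel {
    private static final Map<ReplicaState, Set<ReplicaState>> NEXT =
            new EnumMap<>(ReplicaState.class);

    static {
        NEXT.put(ReplicaState.OFFLINE, EnumSet.of(ReplicaState.UP));
        NEXT.put(ReplicaState.UP, EnumSet.of(ReplicaState.DRAIN));
        NEXT.put(ReplicaState.DRAIN, EnumSet.of(ReplicaState.OFFLINE));
    }

    /** Returns true if the model allows a direct transition from one state to the other. */
    static boolean isLegal(ReplicaState from, ReplicaState to) {
        return NEXT.get(from).contains(to);
    }

    /** Checks the constraint for one partition: at most one node may hold it in UP. */
    static boolean satisfiesUpConstraint(Map<String, ReplicaState> replicaStateByNode) {
        long up = replicaStateByNode.values().stream()
                .filter(state -> state == ReplicaState.UP)
                .count();
        return up <= 1;
    }
}
```

Under this model, a partition held as DRAIN on the old node and UP on the new node satisfies the constraint, which is the overlap the drain scenario needs.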
>
>
>
> Lei
>
> On Tue, May 12, 2020 at 10:58 AM santosh gujar <[email protected]>
> wrote:
>
>> Hi Hunter,
>>
>> For various limitations and constraints at this moment, I cannot go down
>> the path of Task Framework.
>>
>> Thanks,
>> Santosh
>>
>> On Tue, May 12, 2020 at 7:23 PM Hunter Lee <[email protected]> wrote:
>>
>>> Alternative idea:
>>>
>>> Have you considered using Task Framework's targeted jobs for this use
>>> case? You could make the jobs long-running, and this way, you save yourself
>>> the trouble of having to implement the routing layer (simply specifying
>>> which partition to target in your JobConfig would do it).
>>>
>>> Task Framework doesn't actively terminate running threads on the worker
>>> (Participant) nodes, so you could achieve the effect of "draining" the node
>>> by letting previously assigned tasks finish, i.e. by not actively canceling
>>> them in your cancel() logic.
>>>
>>> Hunter
>>>
>>> On Tue, May 12, 2020 at 1:02 AM santosh gujar <[email protected]>
>>> wrote:
>>>
>>>> Hi Lei,
>>>>
>>>> Thanks a lot for your time and response.
>>>>
>>>> Some more context about the Helix partition that I mentioned in my
>>>> earlier email:
>>>> My thinking is to map multiple long-running jobs to a Helix partition by
>>>> running some hash function (the simplest is taking a mod of the job ID).
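
A small sketch of the job-to-partition mapping mentioned above, assuming a string job ID and a fixed partition count. `JobPartitioner` and `partitionFor` are hypothetical names; `Math.floorMod` is used so that a negative hash code still yields a valid partition index.

```java
/** Hypothetical helper: maps a job to one of numPartitions partitions by hashing its ID. */
final class JobPartitioner {
    private JobPartitioner() {
    }

    /** Returns a stable partition index in [0, numPartitions) for the given job ID. */
    static int partitionFor(String jobId, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod never returns a negative value for a positive modulus, unlike the % operator.
        return Math.floorMod(jobId.hashCode(), numPartitions);
    }
}
```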
>>>>
>>>> "what exactly do you need to do to bring a job from OFFLINE to STARTUP?"
>>>> I added STARTUP to track the fact that a partition could be hosted on
>>>> two nodes simultaneously. I doubt the OFFLINE->UP->OFFLINE model can
>>>> give me such information.
>>>>
>>>> "Once the job (partition) on node-1 goes OFFLINE, Helix will bring up
>>>> the job in node-2 (OFFLINE->UP)"
>>>> I think it may not work in my case. Here are the implications as I see
>>>> them.
>>>> 1. While node-1 is in drain, old jobs continue to run, but I want new
>>>> jobs (for the same partition) to be hosted by node-2. Think of it as a
>>>> partition that moves from one node to another over a long time (hours),
>>>> as determined by when all existing jobs running on node-1 finish.
>>>> 2. As per your suggestion, node-2 serves the partition only when
>>>> node-1 is offline, so it cannot satisfy 1 above.
>>>> One workaround is to handle the UP->OFFLINE transition event in
>>>> the application and save the information about node-1 somewhere, then
>>>> use this information later to distinguish old jobs from new jobs. But
>>>> this information is stored outside Helix, and I wanted to avoid that.
>>>> What attracted me to Helix is its auto-rebalancing capability and that it
>>>> is a central storage for the state of the cluster, which I can use for my
>>>> routing logic.
>>>> 3. A job could be running for hours, and thus a drain can take a
>>>> long time.
>>>>
>>>>
>>>> "How long would you expect OFFLINE->UP to take here? If it is fast, the
>>>> switch should be fast."
>>>> OFFLINE->UP is fast. As I describe above, it is the drain on the
>>>> previously running node that is slow; the existing jobs cannot be
>>>> pre-empted and moved to the new node.
>>>>
>>>> Regards,
>>>> Santosh
>>>>
>>>> On Tue, May 12, 2020 at 10:40 AM Lei Xia <[email protected]> wrote:
>>>>
>>>>> Hi, Santosh
>>>>>
>>>>>   One question: what exactly do you need to do to bring a job from
>>>>> OFFLINE to STARTUP? Can we simply use the OFFLINE->UP->OFFLINE model?
>>>>> From OFFLINE->UP you will get the job started and ready to serve
>>>>> requests. From UP->OFFLINE you will block there until the job gets
>>>>> drained.
>>>>>
>>>>>  With this state model, you can start to drain a node by disabling it.
>>>>> Once a node is disabled, Helix will send the UP->OFFLINE transition to
>>>>> all partitions on that node; in your implementation of the UP->OFFLINE
>>>>> transition, you block there until the job completes. Once the job
>>>>> (partition) on node-1 goes OFFLINE, Helix will bring up the job in
>>>>> node-2 (OFFLINE->UP). Does this work for you? How long would you expect
>>>>> OFFLINE->UP to take here? If it is fast, the switch should be fast.
>>>>>
>>>>>
>>>>> Lei
>>>>>
>>>>>
>>>>>
>>>>> On Mon, May 11, 2020 at 9:02 PM santosh gujar <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Yes, there would be a database.
>>>>>> So far I have the following state model for a partition:
>>>>>> OFFLINE->STARTUP->UP->DRAIN->OFFLINE. But I don't know how to express
>>>>>> the following:
>>>>>> 1. How to trigger a drain (for example, when we decide to take a node
>>>>>> out for maintenance).
>>>>>> 2. Once a drain has started, I expect the Helix rebalancer to kick in
>>>>>> and simultaneously bring up the partition on another node in STARTUP
>>>>>> mode.
>>>>>> 3. Once all jobs on node1 are done, I need a manual way to trigger it
>>>>>> to OFFLINE and move the other partition to the UP state.
>>>>>>
>>>>>> It is possible that my thinking about how to fit this into the Helix
>>>>>> model is entirely wrong, but essentially the above is the sequence I
>>>>>> want to achieve. Any pointers will be of great help. The constraint is
>>>>>> that these are long-running jobs that cannot be moved immediately to
>>>>>> another node.
>>>>>>
>>>>>> Regards,
>>>>>> Santosh
>>>>>>
>>>>>> On Tue, May 12, 2020 at 1:25 AM kishore g <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I was thinking exactly in that direction - having two states is the
>>>>>>> right thing to do. Before we get there, one more question -
>>>>>>>
>>>>>>> - when you get a request for a job, how do you know if that job is
>>>>>>> old or new? Is there a database that provides the mapping between job
>>>>>>> and node?
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 12:44 PM santosh gujar <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thank You Kishore,
>>>>>>>>
>>>>>>>> During the drain process N2 will start new jobs; requests related
>>>>>>>> to old jobs need to go to N1 and requests for new jobs need to go to
>>>>>>>> N2. Thus during the drain on N1, the partition could be present on
>>>>>>>> both nodes.
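
The routing rule described above (requests for old jobs go to N1, requests for new jobs go to N2) can be sketched as follows. This is an illustration under stated assumptions, not a Helix spectator: `DrainAwareRouter` is a hypothetical class, and the in-memory map stands in for whatever database records which node started each job.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical router: known jobs stay on the node that started them, new jobs go to the UP node. */
final class DrainAwareRouter {
    // Stand-in for the job-to-node database mentioned in the thread.
    private final Map<String, String> nodeByJob = new HashMap<>();
    private String upNode;

    DrainAwareRouter(String upNode) {
        this.upNode = upNode;
    }

    /** Called when the partition's UP replica moves, e.g. N1 starts draining and N2 comes up. */
    void setUpNode(String node) {
        this.upNode = node;
    }

    /** Returns the node for a job; a job seen for the first time is pinned to the current UP node. */
    String route(String jobId) {
        return nodeByJob.computeIfAbsent(jobId, id -> upNode);
    }

    /** Forgets a finished job so that the draining node eventually has nothing left to serve. */
    void jobFinished(String jobId) {
        nodeByJob.remove(jobId);
    }
}
```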
>>>>>>>>
>>>>>>>> My current thinking is that in Helix I somehow need to model it
>>>>>>>> as partition P with two different states on these two nodes, e.g. N1
>>>>>>>> could have partition P in DRAIN state and N2 could have partition P
>>>>>>>> in START_UP state.
>>>>>>>> I don't know if my thinking about states is correct, but I am looking
>>>>>>>> for any pointers.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Santosh
>>>>>>>>
>>>>>>>> On Tue, May 12, 2020 at 1:01 AM kishore g <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> What happens to requests during the drain process, i.e. when you put
>>>>>>>>> N1 out of service and while N2 is waiting for N1 to finish the jobs,
>>>>>>>>> where will the requests for P go - N1 or N2?
>>>>>>>>>
>>>>>>>>> On Mon, May 11, 2020 at 12:19 PM santosh gujar <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am looking for some clues or inputs on how to achieve the
>>>>>>>>>> following.
>>>>>>>>>>
>>>>>>>>>> I am working on a service that involves running stateful
>>>>>>>>>> long-running jobs on a node. These long-running jobs cannot be
>>>>>>>>>> preempted and continued on other nodes.
>>>>>>>>>>
>>>>>>>>>> Problem Requirements :
>>>>>>>>>> 1. In Helix nomenclature, let's say there is a Helix partition P
>>>>>>>>>> that involves J such jobs running on a node (N1).
>>>>>>>>>> 2. When I put the node in drain, I want Helix to assign a new
>>>>>>>>>> node to this partition, so that P is also started on the new node
>>>>>>>>>> (N2).
>>>>>>>>>> 3. N1 can be put out of service only when all running jobs (J) on
>>>>>>>>>> it are over; at this point only N2 will serve P requests.
>>>>>>>>>>
>>>>>>>>>> Questions :
>>>>>>>>>> 1. Can the drain process be modeled using Helix?
>>>>>>>>>> 2. If yes, is there any recipe / pointers for a Helix state model?
>>>>>>>>>> 3. Is there any custom way to trigger state transitions? From the
>>>>>>>>>> documentation, I gather that the Helix controller in full-auto mode
>>>>>>>>>> triggers state transitions only when the number of partitions
>>>>>>>>>> changes or the cluster changes (node addition or deletion).
>>>>>>>>>> 4. I guess a spectator will be needed for custom routing logic in
>>>>>>>>>> such cases; any pointers for the same?
>>>>>>>>>>
>>>>>>>>>> Thank You
>>>>>>>>>> Santosh
>>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>> --
>>>>> Lei Xia
>>>>>
>>>>
>
> --
> Lei Xia
>
