Hi Lei,

Thanks a lot for your time and response.

Some more context about the helix partition that I mentioned in my earlier email: my thinking is to map multiple long-running jobs to a helix partition by running some hash function on the job (the simplest being taking a mod of the job id).
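Roughly, the mapping I have in mind looks like this (only a sketch; the resource name "jobResource" and the partition count are placeholders, and Helix's default "<resource>_<index>" partition naming is assumed):

  // Sketch only: map a job to a Helix partition by hashing its id.
  // "jobResource" and numPartitions stand in for my actual resource.
  static String partitionForJob(String jobId, int numPartitions) {
    int partitionId = Math.floorMod(jobId.hashCode(), numPartitions);
    return "jobResource_" + partitionId;   // Helix's default partition naming
  }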
"What exactly you need to do to bring a job from OFFLINE to STARTUP?"

I added STARTUP to track the fact that a partition could be hosted on two nodes simultaneously; I doubt the OFFLINE->UP->OFFLINE model can give me that information.

"Once the job (partition) on node-1 goes OFFLINE, Helix will bring up the job in node-2 (OFFLINE->UP)"

I think this may not work in my case. Here are the implications as I see them:

1. While node-1 is in drain, old jobs continue to run, but I want new jobs (for the same partition) to be hosted by the new node. Think of it as the partition moving from one node to the other, but over a long time (hours), determined by when all existing jobs running on node-1 finish.
2. Per your suggestion, node-2 serves the partition only when node-1 is offline, so it cannot satisfy point 1 above. One workaround is to handle the UP->OFFLINE transition event in the application and save the information about node-1 somewhere, then use it later to distinguish old jobs from new jobs. But that information would live outside Helix, which I wanted to avoid. What attracted me to Helix is its auto-rebalancing capability and its central storage of cluster state, which I can use for my routing logic.
3. A job could be running for hours, so a drain can take a long time.

"How long you would expect OFFLINE->UP take here, if it is fast, the switch should be fast."

OFFLINE->UP is fast. As I described above, it is the drain on the node that was previously running the jobs that is slow; the existing jobs cannot be pre-empted and moved to the new node.
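To make the state model I have in mind concrete, here is a rough sketch of the participant-side callbacks, written against Helix's StateModel / @Transition API. The state names follow my OFFLINE->STARTUP->UP->DRAIN->OFFLINE idea, and JobTracker is just a placeholder for whatever the application uses to track running jobs per partition:

  import org.apache.helix.NotificationContext;
  import org.apache.helix.model.Message;
  import org.apache.helix.participant.statemachine.StateModel;
  import org.apache.helix.participant.statemachine.StateModelInfo;
  import org.apache.helix.participant.statemachine.Transition;

  @StateModelInfo(initialState = "OFFLINE", states = {"STARTUP", "UP", "DRAIN", "OFFLINE"})
  public class JobPartitionStateModel extends StateModel {

    // Placeholder for whatever tracks jobs per partition on this node.
    public interface JobTracker {
      void acceptNewJobs(String partition);
      void stopAcceptingNewJobs(String partition);
      void awaitRunningJobs(String partition);   // blocks until all running jobs finish
    }

    private final String partitionName;
    private final JobTracker jobTracker;

    public JobPartitionStateModel(String partitionName, JobTracker jobTracker) {
      this.partitionName = partitionName;
      this.jobTracker = jobTracker;
    }

    @Transition(from = "OFFLINE", to = "STARTUP")
    public void onBecomeStartupFromOffline(Message msg, NotificationContext ctx) {
      // Fast: prepare local resources for the partition; do not take new jobs yet.
    }

    @Transition(from = "STARTUP", to = "UP")
    public void onBecomeUpFromStartup(Message msg, NotificationContext ctx) {
      // Fast: start accepting new jobs for this partition on this node.
      jobTracker.acceptNewJobs(partitionName);
    }

    @Transition(from = "UP", to = "DRAIN")
    public void onBecomeDrainFromUp(Message msg, NotificationContext ctx) {
      // Fast: stop accepting new jobs; jobs already running keep running.
      jobTracker.stopAcceptingNewJobs(partitionName);
    }

    @Transition(from = "DRAIN", to = "OFFLINE")
    public void onBecomeOfflineFromDrain(Message msg, NotificationContext ctx) {
      // Slow: block here (possibly for hours) until every job that was already
      // running for this partition on this node has finished.
      jobTracker.awaitRunningJobs(partitionName);
    }
  }

A StateModelFactory would create one of these per partition and register it with the participant's HelixManager; whether the controller can keep one replica in UP on node-2 while node-1 sits in DRAIN is exactly the part I am not sure how to express.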
Regards,
Santosh

On Tue, May 12, 2020 at 10:40 AM Lei Xia <[email protected]> wrote:

> Hi, Santosh
>
> One question: what exactly do you need to do to bring a job from OFFLINE to STARTUP? Can we simply use an OFFLINE->UP->OFFLINE model? From OFFLINE->UP you will get the job started and ready to serve requests. From UP->OFFLINE you will block there until the job gets drained.
>
> With this state model, you can start to drain a node by disabling it. Once a node is disabled, Helix will send the UP->OFFLINE transition to all partitions on that node; in your implementation of the UP->OFFLINE transition, you block there until the job completes. Once the job (partition) on node-1 goes OFFLINE, Helix will bring up the job on node-2 (OFFLINE->UP). Does this work for you? How long would you expect OFFLINE->UP to take here? If it is fast, the switch should be fast.
>
> Lei
>
> On Mon, May 11, 2020 at 9:02 PM santosh gujar <[email protected]> wrote:
>
>> Yes, there would be a database.
>> So far I have the following state model for a partition: OFFLINE->STARTUP->UP->DRAIN->OFFLINE. But I don't know how to express the following:
>> 1. How to trigger a drain (for example, when we decide to take a node out for maintenance).
>> 2. Once a drain has started, I expect the Helix rebalancer to kick in and simultaneously bring the partition up on another node in STARTUP mode.
>> 3. Once all jobs on node-1 are done, I need a manual way to move it to OFFLINE and move the partition on the other node to the UP state.
>>
>> It might be that my thinking about how to fit this into the Helix model is entirely wrong, but essentially the above is the sequence I want to achieve. Any pointers will be of great help. The constraint is that these are long-running jobs that cannot be moved immediately to another node.
>>
>> Regards,
>> Santosh
>>
>> On Tue, May 12, 2020 at 1:25 AM kishore g <[email protected]> wrote:
>>
>>> I was thinking exactly in that direction - having two states is the right thing to do. Before we get there, one more question:
>>>
>>> - when you get a request for a job, how do you know if that job is old or new? Is there a database that provides the mapping between job and node?
>>>
>>> On Mon, May 11, 2020 at 12:44 PM santosh gujar <[email protected]> wrote:
>>>
>>>> Thank you Kishore,
>>>>
>>>> During the drain process N2 will start new jobs; the requests related to old jobs need to go to N1 and requests for new jobs need to go to N2. Thus, during the drain on N1, the partition could be present on both nodes.
>>>>
>>>> My current thinking is that in Helix I somehow need to model this as partition P having two different states on these two nodes, e.g. N1 could have partition P in the DRAIN state and N2 could have partition P in the STARTUP state.
>>>> I don't know if my thinking about states is correct, but I am looking for any pointers.
>>>>
>>>> Regards,
>>>> Santosh
>>>>
>>>> On Tue, May 12, 2020 at 1:01 AM kishore g <[email protected]> wrote:
>>>>
>>>>> What happens to requests during the drain process, i.e. when you put N1 out of service and while N2 is waiting for N1 to finish the jobs, where will the requests for P go - N1 or N2?
>>>>>
>>>>> On Mon, May 11, 2020 at 12:19 PM santosh gujar <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am looking for some clues or inputs on how to achieve the following.
>>>>>>
>>>>>> I am working on a service that involves running stateful, long-running jobs on a node. These long-running jobs cannot be preempted and continued on other nodes.
>>>>>>
>>>>>> Problem requirements:
>>>>>> 1. In Helix nomenclature, let's say there is a helix partition P that involves J such jobs running on a node (N1).
>>>>>> 2. When I put the node into a drain, I want Helix to assign a new node to this partition (P), so that P is also started on the new node (N2).
>>>>>> 3. N1 can be put out of service only when all running jobs (J) on it are over; at that point only N2 will serve requests for P.
>>>>>>
>>>>>> Questions:
>>>>>> 1. Can the drain process be modeled using Helix?
>>>>>> 2. If yes, is there any recipe or pointer for a Helix state model?
>>>>>> 3. Is there any custom way to trigger state transitions? From the documentation, I gather that the Helix controller in full-auto mode triggers state transitions only when the number of partitions changes or the cluster changes (node addition or deletion).
>>>>>> 4. I guess a spectator will be needed for custom routing logic in such cases; any pointers for the same?
>>>>>>
>>>>>> Thank You
>>>>>> Santosh
>>>>>>
>
> --
> Lei Xia
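P.S. In case it helps to make questions 1 and 4 from my first mail concrete, this is roughly what I understood the drain trigger and the spectator-side routing to look like. It is only a sketch under my assumptions: the "UP"/"DRAIN" names come from my proposed state model, and the ZooKeeper address, cluster name, and "router-1" instance name are placeholders.

  import java.util.List;
  import org.apache.helix.HelixManager;
  import org.apache.helix.HelixManagerFactory;
  import org.apache.helix.InstanceType;
  import org.apache.helix.manager.zk.ZKHelixAdmin;
  import org.apache.helix.model.InstanceConfig;
  import org.apache.helix.spectator.RoutingTableProvider;

  public class DrainAndRoutingSketch {

    // Triggering a drain: disabling the instance makes Helix send the "leave"
    // transitions (e.g. UP->DRAIN) to every partition hosted on that instance.
    static void startDrain(String zkAddr, String cluster, String instance) {
      new ZKHelixAdmin(zkAddr).enableInstance(cluster, instance, false);
    }

    // Spectator-side routing: RoutingTableProvider keeps a live view of which
    // instances currently host a partition in a given state. In a real spectator
    // this would be set up once and kept around, not rebuilt per lookup.
    static List<InstanceConfig> nodesAcceptingNewJobs(String zkAddr, String cluster,
                                                      String resource, String partition)
        throws Exception {
      HelixManager manager = HelixManagerFactory.getZKHelixManager(
          cluster, "router-1", InstanceType.SPECTATOR, zkAddr);
      manager.connect();
      RoutingTableProvider routingTable = new RoutingTableProvider();
      manager.addExternalViewChangeListener(routingTable);
      // New jobs for the partition go to the node(s) in UP; requests for jobs that
      // started earlier keep going to the node still shown in DRAIN.
      return routingTable.getInstances(resource, partition, "UP");
    }
  }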
