Hi Hunter,

Due to various limitations and constraints at this moment, I cannot go down the path of the Task Framework.

Thanks,
Santosh
On Tue, May 12, 2020 at 7:23 PM Hunter Lee <[email protected]> wrote:

> Alternative idea:
>
> Have you considered using Task Framework's targeted jobs for this use case? You could make the jobs long-running, and this way you save yourself the trouble of having to implement the routing layer (simply specifying which partition to target in your JobConfig would do it).
>
> Task Framework doesn't actively terminate running threads on the worker (Participant) nodes, so you could achieve the effect of "draining" a node by letting previously assigned tasks finish, i.e. by not actively canceling them in your cancel() logic.
>
> Hunter
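A minimal sketch of the pattern Hunter describes, written against the Helix Task Framework API; the class, resource, and partition names are placeholders, and the "drain" effect comes entirely from leaving cancel() empty:

    import java.util.Collections;
    import org.apache.helix.task.JobConfig;
    import org.apache.helix.task.Task;
    import org.apache.helix.task.TaskResult;

    // A long-running task that drains rather than being killed: Task Framework
    // invokes cancel() but does not interrupt the worker thread, so an empty
    // cancel() lets in-flight work run to completion.
    public class LongRunningJobTask implements Task {
      private final Runnable job; // the actual long-running work (placeholder)

      public LongRunningJobTask(Runnable job) {
        this.job = job;
      }

      @Override
      public TaskResult run() {
        job.run(); // may take hours; nothing here preempts it
        return new TaskResult(TaskResult.Status.COMPLETED, "job finished");
      }

      @Override
      public void cancel() {
        // Intentionally a no-op: previously assigned tasks finish naturally,
        // which produces the draining behavior Hunter describes.
      }

      // A targeted job pinned to one partition of an existing resource, per
      // Hunter's suggestion (resource/partition/command names are placeholders).
      public static JobConfig.Builder targetedJobConfig() {
        return new JobConfig.Builder()
            .setTargetResource("MyResource")
            .setTargetPartitions(Collections.singletonList("MyResource_0"))
            .setCommand("RunLongJob");
      }
    }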
> On Tue, May 12, 2020 at 1:02 AM santosh gujar <[email protected]> wrote:
>
>> Hi Lei,
>>
>> Thanks a lot for your time and response.
>>
>> Some more context about the Helix partition that I mentioned in my earlier email: my thinking is to map multiple long-running jobs to a Helix partition by running some hash function over the job (the simplest being taking a mod of the job id).
>>
>> "What exactly do you need to do to bring a job from OFFLINE to STARTUP?"
>> I added STARTUP to track the fact that a partition could be hosted on two nodes simultaneously; I doubt the OFFLINE->UP->OFFLINE model can give me that information.
>>
>> "Once the job (partition) on node-1 goes OFFLINE, Helix will bring up the job on node-2 (OFFLINE->UP)."
>> I think it may not work in my case. Here is how I see the implications:
>> 1. While node-1 is draining, old jobs continue to run, but I want new jobs (for the same partition) to be hosted by the new node. Think of it as a partition moving from one node to the other, but over a long time (hours), determined by when all existing jobs running on node-1 finish.
>> 2. As per your suggestion, node-2 serves the partition only when node-1 is offline, so it cannot satisfy point 1 above.
>> One workaround is to handle the UP->OFFLINE transition event in the application, save the information about node-1 somewhere, and use it later to distinguish old jobs from new jobs. But this information would be stored outside Helix, and I wanted to avoid that. What attracted me to Helix is its auto-rebalancing capability and its central storage for cluster state, which I can use for my routing logic.
>> 3. A job could run for hours, so a drain can go on for a long time.
>>
>> "How long would you expect OFFLINE->UP to take here? If it is fast, the switch should be fast."
>> OFFLINE->UP is fast. As I describe above, it's the drain on the previously serving node that is slow; the existing jobs cannot be preempted to move to the new node.
>>
>> Regards,
>> Santosh
>>
>> On Tue, May 12, 2020 at 10:40 AM Lei Xia <[email protected]> wrote:
>>
>>> Hi, Santosh
>>>
>>> One question: what exactly do you need to do to bring a job from OFFLINE to STARTUP? Can we simply use an OFFLINE->UP->OFFLINE model? On OFFLINE->UP you get the job started and ready to serve requests; on UP->OFFLINE you block until the job gets drained.
>>>
>>> With this state model, you can start to drain a node by disabling it. Once a node is disabled, Helix will send the UP->OFFLINE transition to all partitions on that node; in your implementation of the UP->OFFLINE transition, you block until the job completes. Once the job (partition) on node-1 goes OFFLINE, Helix will bring up the job on node-2 (OFFLINE->UP). Does this work for you? How long would you expect OFFLINE->UP to take here? If it is fast, the switch should be fast.
>>>
>>>
>>> Lei
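A minimal sketch of the blocking state model Lei proposes, using Helix's participant state-model annotations; awaitJobsFor is a hypothetical helper that tracks the jobs running for a partition:

    import java.util.concurrent.CountDownLatch;
    import org.apache.helix.NotificationContext;
    import org.apache.helix.model.Message;
    import org.apache.helix.participant.statemachine.StateModel;
    import org.apache.helix.participant.statemachine.StateModelInfo;
    import org.apache.helix.participant.statemachine.Transition;

    @StateModelInfo(initialState = "OFFLINE", states = {"UP", "OFFLINE"})
    public class DrainOnOfflineStateModel extends StateModel {

      @Transition(to = "UP", from = "OFFLINE")
      public void onBecomeUpFromOffline(Message message, NotificationContext context) {
        // Fast path: start accepting new jobs for this partition.
      }

      @Transition(to = "OFFLINE", from = "UP")
      public void onBecomeOfflineFromUp(Message message, NotificationContext context) {
        try {
          // Block until every job mapped to this partition has finished.
          // Helix will not bring the partition UP on another node until
          // this transition returns, which gives Lei's drain semantics.
          awaitJobsFor(message.getPartitionName()).await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }

      // Hypothetical helper: a latch that opens once the partition's jobs finish.
      private CountDownLatch awaitJobsFor(String partition) {
        return new CountDownLatch(0); // placeholder
      }
    }

Note that, as Santosh observes above, this model keeps the partition on only one node at a time: node-2 comes UP only after node-1 finishes draining, so new jobs cannot be routed to node-2 during the drain.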
>>> On Mon, May 11, 2020 at 9:02 PM santosh gujar <[email protected]> wrote:
>>>
>>>> Yes, there would be a database.
>>>> So far I have the following state model for a partition: OFFLINE->STARTUP->UP->DRAIN->OFFLINE. But I don't know how to express the following:
>>>> 1. How to trigger DRAIN (this is for when, for example, we decide to take a node out for maintenance).
>>>> 2. Once a drain has started, I expect the Helix rebalancer to kick in and simultaneously bring the partition up on another node in the STARTUP state.
>>>> 3. Once all jobs on node-1 are done, I need a manual way to move it to OFFLINE and move the partition on the other node to the UP state.
>>>>
>>>> It might be that my thinking about how to fit this into the Helix model is entirely wrong, but essentially the above is the sequence I want to achieve. Any pointers will be of great help. The constraint is that these are long-running jobs that cannot be moved immediately to another node.
>>>>
>>>> Regards,
>>>> Santosh
>>>>
>>>> On Tue, May 12, 2020 at 1:25 AM kishore g <[email protected]> wrote:
>>>>
>>>>> I was thinking exactly in that direction; having two states is the right thing to do. Before we get there, one more question:
>>>>>
>>>>> - When you get a request for a job, how do you know if that job is old or new? Is there a database that provides the mapping between job and node?
>>>>>
>>>>> On Mon, May 11, 2020 at 12:44 PM santosh gujar <[email protected]> wrote:
>>>>>
>>>>>> Thank you, Kishore.
>>>>>>
>>>>>> During the drain process N2 will start new jobs; requests related to old jobs need to go to N1, and requests for new jobs need to go to N2. Thus, during a drain on N1, the partition could be present on both nodes.
>>>>>>
>>>>>> My current thinking is that in Helix I somehow need to model this as partition P having two different states on these two nodes, e.g. N1 could have partition P in the DRAIN state while N2 has partition P in the STARTUP state.
>>>>>> I don't know if my thinking about states is correct, but I am looking for any pointers.
>>>>>>
>>>>>> Regards,
>>>>>> Santosh
>>>>>>
>>>>>> On Tue, May 12, 2020 at 1:01 AM kishore g <[email protected]> wrote:
>>>>>>
>>>>>>> What happens to requests during the drain process, i.e. when you put N1 out of service and N2 is waiting for N1 to finish the jobs: where will the requests for P go, N1 or N2?
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 12:19 PM santosh gujar <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I am looking for some clues or inputs on how to achieve the following.
>>>>>>>>
>>>>>>>> I am working on a service that involves running stateful, long-running jobs on a node. These long-running jobs cannot be preempted and continued on other nodes.
>>>>>>>>
>>>>>>>> Problem requirements:
>>>>>>>> 1. In Helix nomenclature, let's say a Helix partition P involves J such jobs running on a node (N1).
>>>>>>>> 2. When I put the node in a drain, I want Helix to assign a new node to this partition, so that P is also started on the new node (N2).
>>>>>>>> 3. N1 can be put out of service only when all running jobs (J) on it are over; at that point only N2 will serve requests for P.
>>>>>>>>
>>>>>>>> Questions:
>>>>>>>> 1. Can the drain process be modeled using Helix?
>>>>>>>> 2. If yes, is there any recipe or pointer for a Helix state model?
>>>>>>>> 3. Is there any custom way to trigger state transitions? From the documentation, I gather that the Helix controller in full-auto mode triggers state transitions only when the number of partitions changes or the cluster changes (node addition or deletion).
>>>>>>>> 4. I guess a spectator will be needed for custom routing logic in such cases; any pointers for the same?
>>>>>>>>
>>>>>>>> Thank you
>>>>>>>> Santosh
>>>
>>> --
>>> Lei Xia
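Question 2 above asks for a recipe for a Helix state model. A minimal sketch of how the OFFLINE->STARTUP->UP->DRAIN->OFFLINE model discussed in this thread might be declared with Helix's StateModelDefinition.Builder; the definition name, priorities, and bounds are assumptions, not a tested recipe:

    import org.apache.helix.model.StateModelDefinition;

    public class DrainStateModelDef {
      public static StateModelDefinition build() {
        StateModelDefinition.Builder builder =
            new StateModelDefinition.Builder("StartupUpDrainOffline"); // hypothetical name
        builder.initialState("OFFLINE");
        // Lower number = higher priority when the rebalancer computes assignments.
        builder.addState("UP", 1);
        builder.addState("STARTUP", 2);
        builder.addState("DRAIN", 3);
        builder.addState("OFFLINE", 4);
        builder.addState("DROPPED", 5);
        // Legal transitions for the drain flow described in the thread.
        builder.addTransition("OFFLINE", "STARTUP", 1);
        builder.addTransition("STARTUP", "UP", 2);
        builder.addTransition("UP", "DRAIN", 3);
        builder.addTransition("DRAIN", "OFFLINE", 4);
        builder.addTransition("OFFLINE", "DROPPED", 5);
        // At most one replica serves (UP) while another may warm up (STARTUP)
        // on the new node and the old node sits in DRAIN.
        builder.upperBound("UP", 1);
        builder.upperBound("STARTUP", 1);
        builder.upperBound("DRAIN", 1);
        return builder.build();
      }
    }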

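And a minimal sketch of the spectator-based routing raised in question 4, using Helix's RoutingTableProvider; the cluster, resource, and instance names are placeholders, and the job-to-partition mapping uses the mod hash mentioned earlier in the thread:

    import java.util.List;
    import org.apache.helix.HelixManager;
    import org.apache.helix.HelixManagerFactory;
    import org.apache.helix.InstanceType;
    import org.apache.helix.model.InstanceConfig;
    import org.apache.helix.spectator.RoutingTableProvider;

    public class JobRouter {
      public static void main(String[] args) throws Exception {
        // Connect as a SPECTATOR; a router reads cluster state but hosts no partitions.
        HelixManager manager = HelixManagerFactory.getZKHelixManager(
            "JobCluster", "router-1", InstanceType.SPECTATOR, "localhost:2181");
        manager.connect();

        // RoutingTableProvider keeps an up-to-date partition -> state -> instances view.
        RoutingTableProvider routingTable = new RoutingTableProvider();
        manager.addExternalViewChangeListener(routingTable);

        // Map a job to a partition by taking a mod of the job id.
        long jobId = 42L;
        int numPartitions = 8;
        String partition = "JobResource_" + (jobId % numPartitions);

        // Requests for old jobs go to the draining node; new jobs go to the UP node.
        List<InstanceConfig> draining = routingTable.getInstances("JobResource", partition, "DRAIN");
        List<InstanceConfig> serving = routingTable.getInstances("JobResource", partition, "UP");
        System.out.println("drain hosts: " + draining + "; up hosts: " + serving);
      }
    }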