[ 
https://issues.apache.org/jira/browse/OOZIE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931400#comment-13931400
 ] 

Srikanth Sundarrajan commented on OOZIE-1533:
---------------------------------------------

[~rohini], unless all coord actions are done, the status transit service shouldn't 
be updating the coord job, correct? Perhaps we should allow updates to the coord 
job only via three routes (1. user action, 2. when all coord actions are 
in a completed state, 3. materialization) to prevent StatusTransitService from 
playing god.
{quote}
One problem that needs to be addressed before this was that there are lot of 
places in code where coord job is updated
{quote}

Regarding CoordActionInputCheckXCommand, you bring up a really important 
concern, but throttling it through a coord job lock would generally bring 
down throughput, so it might be useful to keep it free of this lock. We should 
look at options to perform bulk input checks to improve the scalability of 
this operation without hurting the NN / DB.
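
As a rough illustration of what a bulk input check could look like, the sketch below groups the required input paths by parent directory and issues one listing per directory instead of one existence check per path. This is not Oozie code: the class BulkInputCheckSketch, the method findMissing and the listDirectory callback are hypothetical names, and the callback merely stands in for whatever listing call would be used (for example Hadoop's FileSystem.listStatus). Whether grouping by directory is the right batching strategy is itself an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch, not part of Oozie. */
final class BulkInputCheckSketch {

    /**
     * Returns the required paths that are missing, calling listDirectory
     * once per distinct parent directory rather than once per path.
     * listDirectory is assumed to return the full paths of a directory's
     * children, or an empty set if the directory does not exist.
     */
    static List<String> findMissing(List<String> requiredPaths,
                                    Function<String, Set<String>> listDirectory) {
        // Group the required paths by their parent directory.
        Map<String, List<String>> byParent = new LinkedHashMap<>();
        for (String path : requiredPaths) {
            int slash = path.lastIndexOf('/');
            String parent = slash <= 0 ? "/" : path.substring(0, slash);
            byParent.computeIfAbsent(parent, k -> new ArrayList<>()).add(path);
        }

        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byParent.entrySet()) {
            // One listing call covers every required path in this directory.
            Set<String> present = listDirectory.apply(entry.getKey());
            for (String path : entry.getValue()) {
                if (!present.contains(path)) {
                    missing.add(path);
                }
            }
        }
        return missing;
    }
}
```

With N required paths spread over D directories this makes D listing calls instead of N existence checks, which only helps when many inputs of the pending actions share a directory.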

In practice I found that most commands check the coord status in 
verifyPrecondition(), so the odds of a coord action running while the coord 
is in killed state due to a user interrupt are negligible; however, the 
possibility does exist.
{quote}
Another thing is interrupt commands like coord kill, etc will not be processed 
earlier if the lock is changed to the action id.
{quote}
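
To make the trade-off concrete, here is a minimal sketch of action-level locking combined with a precondition check on the parent job status. It is not taken from the attached patch or from Oozie's lock service: ActionLockSketch, JobStatus and runForAction are hypothetical names, and an in-process ReentrantLock per action id stands in for whatever locking mechanism is actually used.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch, not part of Oozie. */
final class ActionLockSketch {

    /** Simplified stand-in for the coord job status. */
    enum JobStatus { RUNNING, KILLED }

    // One lock per action id, so commands for different actions of the
    // same coord job do not contend with each other.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /**
     * Runs work under the lock for actionId, but only if the parent job is
     * still RUNNING when the lock is held. Returns false if the work was skipped.
     */
    boolean runForAction(String actionId, Supplier<JobStatus> jobStatus, Runnable work) {
        ReentrantLock lock = locks.computeIfAbsent(actionId, id -> new ReentrantLock());
        lock.lock();
        try {
            // Precondition check: re-read the job status after acquiring the lock.
            if (jobStatus.get() != JobStatus.RUNNING) {
                return false;
            }
            work.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

Because the job status is read without holding a job-level lock, a kill can still land between the check and the work, which is the small remaining window mentioned above.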

> Coordinator action materialization is too slow due to coarse job level locks
> ----------------------------------------------------------------------------
>
>                 Key: OOZIE-1533
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1533
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Srikanth Sundarrajan
>            Assignee: Srikanth Sundarrajan
>              Labels: locking
>         Attachments: OOZIE-1533.patch
>
>
> Coord job level lock introduces high contention. Instead introduce coord 
> action level locking whenever appropriate



--
This message was sent by Atlassian JIRA
(v6.2#6252)
