[ https://issues.apache.org/jira/browse/OOZIE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931400#comment-13931400 ]

Srikanth Sundarrajan commented on OOZIE-1533:
---------------------------------------------

[~rohini], unless all coord actions are done, StatusTransitService shouldn't be updating the coord job, correct? Perhaps we should allow updates to the coord job only via three routes (1. user action, 2. when all coord actions are in a completed state, 3. materialization) to prevent StatusTransitService from playing god.

{quote}
One problem that needs to be addressed before this was that there are lot of places in code where coord job is updated
{quote}

Regarding CoordActionInputCheckXCommand, you bring up a really important concern, but throttling it through a coord lock seems to bring down throughput in general, and it might be useful to keep it free of this lock. We should look at options for performing bulk input checks to improve the scalability of this operation without hurting the NN / DB.

In practice I found that most commands check the coord status in verifyPrecondition(), so the odds of a coord action running while the coord is in killed state due to a user interrupt are negligible. However, the possibility does exist.

{quote}
Another thing is interrupt commands like coord kill, etc will not be processed earlier if the lock is changed to the action id.
{quote}

> Coordinator action materialization is too slow due to coarse job level locks
> ----------------------------------------------------------------------------
>
>                 Key: OOZIE-1533
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1533
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Srikanth Sundarrajan
>            Assignee: Srikanth Sundarrajan
>              Labels: locking
>         Attachments: OOZIE-1533.patch
>
>
> Coord job level lock introduces high contention. Instead introduce coord
> action level locking whenever appropriate

--
This message was sent by Atlassian JIRA
(v6.2#6252)