[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
------------------------------
    Description: 
Fork actions are submitted in parallel, which enqueues a 
ForkedActionStartXCommand for each action. Meanwhile, RecoveryService checks 
pending actions and may enqueue an ActionStartXCommand for the same action. If 
a ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the 
same action is already in the queue, the ForkedActionStartXCommand is lost. 
The thread that submits the actions in parallel blocks in 
CallableQueueService.blockingWait(), waiting for the ForkedActionStartXCommand 
to finish, but that command has been lost, which causes a deadlock.
{code:java}
         Thread 1                   Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |  .....  |
+----------------------------+          +---------+
|           ......           |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+ 
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name, 
the ForkedActionStartXCommand is lost, and the submitting thread blocks in 
CallableQueueService.blockingWait(). {code}
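
The race described above can be reproduced in miniature. The following is a 
minimal sketch, not Oozie code: UniqueQueue, forkedCommandCompletes, and the 
key string are hypothetical names invented for illustration, and the sketch 
only assumes that the queue silently drops a command whose key is already 
queued. It uses a timed wait so that it terminates; with an unbounded wait the 
submitting thread would hang.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LostCommandSketch {

    /** Hypothetical queue that drops any command whose key is already queued. */
    static final class UniqueQueue {
        private final Set<String> uniqueKeys = ConcurrentHashMap.newKeySet();
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        /** Returns false, and never runs the task, when the key is already queued. */
        boolean enqueue(String key, Runnable task) {
            if (!uniqueKeys.add(key)) {
                return false; // duplicate key: the task is silently dropped
            }
            executor.submit(() -> {
                try {
                    task.run();
                } finally {
                    uniqueKeys.remove(key); // key is free again once the task ends
                }
            });
            return true;
        }

        void shutdown() {
            executor.shutdownNow();
        }
    }

    /** Returns true only if the forked command ran before the timeout expired. */
    static boolean forkedCommandCompletes(long timeoutMillis) {
        UniqueQueue queue = new UniqueQueue();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch forkedDone = new CountDownLatch(1);
        String key = "start:wf-1@branch-a"; // hypothetical key shared by both commands

        try {
            // "Thread 2": the recovery path enqueues first and stays queued.
            queue.enqueue(key, () -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // "Thread 1": the fork path enqueues under the same key and is dropped.
            queue.enqueue(key, forkedDone::countDown);

            // Let the recovery command finish; the forked one still never runs.
            release.countDown();

            // Stand-in for the blocking wait, bounded so the sketch terminates.
            return forkedDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            queue.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "forked command completed: false"
        System.out.println("forked command completed: " + forkedCommandCompletes(200));
    }
}
```

In this sketch the waiter depends on a completion signal that only the dropped 
command could have sent, which is the same shape as the deadlock reported 
here.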


> Fork actions parallel submit: because ForkedActionStartXCommand and 
> ActionStartXCommand have the same name, ForkedActionStartXCommand can be 
> lost, causing a deadlock
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3717
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3717
>             Project: Oozie
>          Issue Type: Bug
>          Components: action
>    Affects Versions: 5.2.1
>            Reporter: chenhaodan
>            Assignee: chenhaodan
>            Priority: Major
>             Fix For: trunk
>
>
> Fork actions are submitted in parallel, which enqueues a 
> ForkedActionStartXCommand for each action. Meanwhile, RecoveryService checks 
> pending actions and may enqueue an ActionStartXCommand for the same action. 
> If a ForkedActionStartXCommand is enqueued while an ActionStartXCommand for 
> the same action is already in the queue, the ForkedActionStartXCommand is 
> lost. The thread that submits the actions in parallel blocks in 
> CallableQueueService.blockingWait(), waiting for the 
> ForkedActionStartXCommand to finish, but that command has been lost, which 
> causes a deadlock.
> {code:java}
>          Thread 1                   Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |  .....  |
> +----------------------------+          +---------+
> |           ......           |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+ 
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same 
> name, the ForkedActionStartXCommand is lost, and the submitting thread 
> blocks in CallableQueueService.blockingWait(). {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
