[ 
https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-13826:
-----------------------------------
    Attachment: HADOOP-13826.002.patch

Attaching a proof-of-concept of my proposed solution. It still needs some 
polish and has the major drawback of depending on classes in 
com.amazonaws.services.s3.transfer.internal. It also has the major drawback of 
not working. It does tolerate more concurrent renames, but there doesn't appear 
to be a simple division between 'control tasks' and 'sub tasks': things 
considered a control task can spawn other control tasks. In my testing the 
control task pool filled up while the subtask pool was still empty, and it 
deadlocked. I don't think tasks ever spawn other tasks of the same type, so 
I'm going to try adding another tier for the other control tasks. 
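
To illustrate the tiering idea, here is a minimal sketch using plain 
java.util.concurrent rather than the SDK's TransferManager. The class and 
method names (TieredPools, runAll) are made up for this example and are not 
part of the patch. Each control task blocks until its subtasks finish, and 
the subtasks run in their own pool, so blocked control tasks can never occupy 
the threads that the subtasks need. This only holds while no task waits on 
another task in its own tier, which is exactly the assumption that broke 
above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TieredPools {
    // Hypothetical helper: runs `controlTasks` control tasks, each of which
    // submits `subTasksPerControl` subtasks and blocks until they complete.
    // Returns the total number of subtasks that completed.
    static int runAll(int controlTasks, int subTasksPerControl,
                      int controlThreads, int subThreads) throws Exception {
        // One pool per tier: control tasks only ever wait on the other pool.
        ExecutorService controlPool = Executors.newFixedThreadPool(controlThreads);
        ExecutorService subPool = Executors.newFixedThreadPool(subThreads);
        try {
            List<Future<Integer>> controls = new ArrayList<>();
            for (int i = 0; i < controlTasks; i++) {
                controls.add(controlPool.submit(() -> {
                    List<Future<Integer>> subs = new ArrayList<>();
                    for (int j = 0; j < subTasksPerControl; j++) {
                        // Subtasks never submit or wait on further work.
                        subs.add(subPool.submit(() -> 1));
                    }
                    int done = 0;
                    for (Future<Integer> f : subs) {
                        done += f.get(); // blocks a control thread only
                    }
                    return done;
                }));
            }
            int total = 0;
            for (Future<Integer> f : controls) {
                total += f.get();
            }
            return total;
        } finally {
            controlPool.shutdown();
            subPool.shutdown();
        }
    }
}
```

If both tiers shared one fixed pool of two threads, two control tasks could 
hold both threads while their subtasks sat in the queue forever. A control 
task that spawns another control task reintroduces the same problem within 
the control tier, hence the extra tier.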

Once the other obstacles are out of the way, there's also the question of how 
this gets configured. It's no longer a single pool of resources, yet it's 
still configured as one. Maybe we have a rule of thumb that 20% of the threads 
are for control tasks and the rest are for subtasks, or something along those 
lines?
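
As a rough sketch of that rule of thumb, the split could be derived from the 
single configured thread count. PoolSplit and split are hypothetical names, 
and totalThreads stands in for whatever single setting is configured today. 
The clamping is my own assumption, so that neither tier ever ends up with 
zero threads.

```java
public class PoolSplit {
    // Hypothetical helper: divides one thread budget between the two tiers.
    // Returns {controlThreads, subtaskThreads}.
    static int[] split(int totalThreads, double controlFraction) {
        if (totalThreads < 2) {
            throw new IllegalArgumentException("need at least one thread per tier");
        }
        int control = (int) Math.round(totalThreads * controlFraction);
        // Clamp so that both tiers get at least one thread.
        control = Math.max(1, Math.min(control, totalThreads - 1));
        return new int[] {control, totalThreads - control};
    }
}
```

For example, split(10, 0.2) gives 2 control threads and 8 subtask threads. A 
third tier would need a second fraction, which is part of why the 
configuration question is still open.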

> [s3a] Deadlock possible using Amazon S3 SDK
> -------------------------------------------
>
>                 Key: HADOOP-13826
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13826
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Sean Mackrory
>         Attachments: HADOOP-13826.001.patch, HADOOP-13826.002.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The 
> TransferManager javadocs 
> (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
>  explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread 
> pool with a bounded work queue as control tasks may submit subtasks that 
> can't complete until all sub tasks complete. Using an incorrectly configured 
> thread pool may cause a deadlock (I.E. the work queue is filled with control 
> tasks that can't finish until subtasks complete but subtasks can't execute 
> because the queue is filled).{quote}


