[ https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709695#comment-15709695 ]
Thomas Demoor commented on HADOOP-13826:
----------------------------------------

I think [~mackrorysd]'s implementation is heading in the right direction. Some questions / suggestions:
* The {{controlTypes}} do not have a large memory or bandwidth impact, as they carry little payload. Consequently, I think we can allow a lot of active threads here, and the waiting room can be unbounded. I hope this would fix the issues [~mackrorysd] is still encountering. (In contrast to my earlier thinking above, I don't think the number of active threads needs to be shared between the two types; it seems unlikely that {{controlTypes}} will use significant resources.)
* The {{subTaskTypes}} have the potential to overwhelm memory and bandwidth and should thus be run from the bounded threadpool. We need to take care that all relevant classes are captured here.
* I am not 100% sure that what I propose here would eliminate all deadlocks. I do not (yet) fully understand the deadlock scenario from the discussion above. If you have more insight, please help me out.

> S3A Deadlock in multipart copy due to thread pool limits.
> ---------------------------------------------------------
>
>                 Key: HADOOP-13826
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13826
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.7.3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Critical
>         Attachments: HADOOP-13826.001.patch, HADOOP-13826.002.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The
> TransferManager javadocs
> (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
> explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread
> pool with a bounded work queue as control tasks may submit subtasks that
> can't complete until all sub tasks complete.
> Using an incorrectly configured
> thread pool may cause a deadlock (I.E. the work queue is filled with control
> tasks that can't finish until subtasks complete but subtasks can't execute
> because the queue is filled).{quote}
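
For illustration only, the split proposed in the comment (control tasks on an effectively unbounded pool, subtasks on a bounded one) might look roughly like the Java sketch below. It is not the patch attached to this issue and does not use the AWS SDK: {{SplitPoolSketch}}, {{runCopies}} and {{waitFor}} are hypothetical names, and a counter increment stands in for a real part copy. If both task types instead shared one pool of two threads, two control tasks could occupy both threads while their subtasks sat in the queue forever, which is the deadlock the javadoc describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from S3A or the AWS SDK.
final class SplitPoolSketch {

    /**
     * Starts `copies` control tasks. Each one fans out `partsPerCopy` subtasks
     * and blocks until they finish. Returns the number of subtasks that ran.
     */
    static int runCopies(int copies, int partsPerCopy, int subTaskThreads) {
        // Control tasks carry little payload: let the pool grow a thread per task.
        ExecutorService controlPool = Executors.newCachedThreadPool();
        // Subtasks move data: cap how many run at once (extra ones wait in the queue).
        ExecutorService subTaskPool = Executors.newFixedThreadPool(subTaskThreads);
        AtomicInteger partsDone = new AtomicInteger();
        try {
            List<Future<?>> controls = new ArrayList<>();
            for (int c = 0; c < copies; c++) {
                controls.add(controlPool.submit(() -> {
                    List<Future<?>> parts = new ArrayList<>();
                    for (int p = 0; p < partsPerCopy; p++) {
                        // Stand-in for copying one part of a multipart copy.
                        parts.add(subTaskPool.submit(() -> { partsDone.incrementAndGet(); }));
                    }
                    // The control task blocks here, but never on a subtask thread.
                    for (Future<?> part : parts) {
                        waitFor(part);
                    }
                }));
            }
            for (Future<?> control : controls) {
                waitFor(control);
            }
            return partsDone.get();
        } finally {
            controlPool.shutdown();
            subTaskPool.shutdown();
        }
    }

    // Waits for a task and converts checked exceptions to unchecked ones.
    private static void waitFor(Future<?> task) {
        try {
            task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Calling {{SplitPoolSketch.runCopies(8, 4, 2)}} runs eight blocking control tasks against only two subtask threads and still completes, because a waiting control task never holds a thread that a subtask needs. Whether this covers every deadlock path in the real connector is the open question raised in the comment.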