[jira] [Commented] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher
[ https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237782#comment-17237782 ]

Zhihua Deng commented on MAPREDUCE-7307:
----------------------------------------

Thanks very much for the help and review.

> Potential thread leak in LocatedFileStatusFetcher
> -------------------------------------------------
>
>                 Key: MAPREDUCE-7307
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7307
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>             Fix For: 3.3.1
>
>
> We see that when using LocatedFileStatusFetcher to get file info in
> parallel, if the listStatus thread is interrupted, the executor service in
> LocatedFileStatusFetcher is left unclosed. The thread stack looks like this:
> {noformat}
> "GetFileInfo #63" #125 daemon prio=5 os_prio=0 tid=0x7f6198106800 nid=0x881 waiting on condition [0x7f60d9fde000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x82e810a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This happens because, if condition.await() throws InterruptedException, the
> executor service's `shutdownNow` method is never called. The
> resource-releasing call should be moved into a finally block.
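The fix described in the issue can be illustrated with a minimal sketch. This is not the LocatedFileStatusFetcher code or the committed patch; the class `FetcherShutdownSketch` and its fields and methods are hypothetical names chosen for illustration. It only shows the general pattern: the wait on the condition sits in a try block and `shutdownNow()` sits in the finally block, so the pool is stopped even when the waiting thread is interrupted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a fetcher that owns a thread pool.
public class FetcherShutdownSketch {
  private final ExecutorService exec = Executors.newFixedThreadPool(2);
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition done = lock.newCondition();
  private boolean finished = false;

  // Called by worker code once all results are in.
  public void markFinished() {
    lock.lock();
    try {
      finished = true;
      done.signalAll();
    } finally {
      lock.unlock();
    }
  }

  // Blocks until markFinished() is called. The pool is shut down on
  // every exit path, including an InterruptedException from await().
  public void awaitCompletion() throws InterruptedException {
    lock.lock();
    try {
      while (!finished) {
        done.await(); // may throw InterruptedException
      }
    } finally {
      lock.unlock();
      exec.shutdownNow(); // runs on normal return and on interrupt
    }
  }

  public boolean isPoolShutdown() {
    return exec.isShutdown();
  }
}
```

With the `shutdownNow()` call placed after the wait instead of in the finally block, an interrupt during `await()` would skip it and leave the pool's worker threads parked, which matches the stack trace quoted above.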
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config
[ https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237689#comment-17237689 ]

Wangda Tan commented on MAPREDUCE-7309:
---------------------------------------

Thanks [~pbacsko], the latest patch looks better than v2. Can you check the checkstyle and junit results and see whether they are related to the patch?

> Improve performance of reading resource request for mapper/reducers from config
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7309
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 3.1.0, 3.2.0, 3.3.0
>            Reporter: Wangda Tan
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: MAPREDUCE-7309-003.patch, MAPREDUCE-7309.001.patch, MAPREDUCE-7309.002.patch
>
>
> This is an issue that could affect all releases that include YARN-6927.
> Basically, we run a regex match repeatedly when we read the mapper/reducer
> resource request from config files. With a large config file and a large
> number of splits, this can take a long time.
> We saw the AM take hours to parse the config with 200k+ splits and a
> large config file (hundreds of KB).
> We should properly cache pre-configured resource requests.
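The caching idea in the description can be sketched roughly as follows. This is not taken from any of the attached patches; `ResourceRequestCacheSketch`, the plain `Map<String, String>` standing in for the job configuration, and the `mapreduce.<type>.resource.` key prefix are all assumptions made for illustration. The point is only that the regex scan over the configuration runs once per task type and later lookups are served from a map, rather than rescanning for every split.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical cache of per-task-type resource requests.
public class ResourceRequestCacheSketch {
  private final Map<String, String> conf; // stand-in for the job config
  private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();
  private final AtomicInteger scans = new AtomicInteger();

  public ResourceRequestCacheSketch(Map<String, String> conf) {
    this.conf = conf;
  }

  // Returns the resource settings for "map" or "reduce"; the config is
  // scanned only on the first call for each task type.
  public Map<String, String> resourcesFor(String taskType) {
    return cache.computeIfAbsent(taskType, this::scan);
  }

  // Number of full config scans performed so far.
  public int scanCount() {
    return scans.get();
  }

  // One full regex pass over the config (the expensive step).
  private Map<String, String> scan(String taskType) {
    scans.incrementAndGet();
    // Assumed key layout: mapreduce.<taskType>.resource.<name>
    Pattern p = Pattern.compile(
        "^" + Pattern.quote("mapreduce." + taskType + ".resource.") + "(.+)$");
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      Matcher m = p.matcher(e.getKey());
      if (m.matches()) {
        out.put(m.group(1), e.getValue());
      }
    }
    return Collections.unmodifiableMap(out);
  }
}
```

A cache like this assumes the relevant config entries do not change after the first lookup, which seems to be what "pre-configured resource requests" implies.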
[jira] [Assigned] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config
[ https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan reassigned MAPREDUCE-7309:
------------------------------------

    Assignee: Peter Bacsko  (was: Wangda Tan)
[jira] [Resolved] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher
[ https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved MAPREDUCE-7307.
--------------------------------------
    Fix Version/s: 3.3.1
       Resolution: Fixed
[jira] [Commented] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest
[ https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237462#comment-17237462 ]

Hadoop QA commented on MAPREDUCE-7308:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 42s{color} | | {color:red} Docker failed to build yetus/hadoop:9560f252cf1. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | MAPREDUCE-7308 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13015860/MAPREDUCE-7308-MR-6749.001.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-MAPREDUCE-Build/38/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.

> Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7308
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7308
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-7308-MR-6749.001.patch
>
>
> In RMContainerReuseRequestor, whenever containerAssigned is called, it checks
> whether the allocated container can be reused. This always returns false
> because the map is cleared on makeRemoteRequest. I think the container can be
> removed from the containersToReuse map once it's used.
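The behaviour proposed in the description could look roughly like the sketch below. It is not the RMContainerReuseRequestor code or the attached patch; only the names `containersToReuse`, `containerAssigned`, and `makeRemoteRequest` come from the issue text, while the class `ContainerReuseSketch`, the `offerForReuse` method, and the use of plain strings for container IDs and hosts are assumptions for illustration. The sketch removes an entry at the moment the container is reused, so nothing needs to be cleared wholesale when a remote request is made.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the reuse bookkeeping described in the issue.
public class ContainerReuseSketch {
  // Container ID -> host, for containers that may be reused.
  private final Map<String, String> containersToReuse = new ConcurrentHashMap<>();

  // Records a finished container as a reuse candidate.
  public void offerForReuse(String containerId, String host) {
    containersToReuse.put(containerId, host);
  }

  // Stand-in for the heartbeat/allocate call. In this sketch it leaves
  // containersToReuse untouched; clearing it here is what the issue
  // reports as the cause of reuse never happening.
  public void makeRemoteRequest() {
    // no containersToReuse.clear()
  }

  // Returns true if the container was a reuse candidate, and removes it
  // so the same container is not handed out twice.
  public boolean containerAssigned(String containerId) {
    return containersToReuse.remove(containerId) != null;
  }
}
```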
[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest
[ https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated MAPREDUCE-7308:
--------------------------------
    Attachment: MAPREDUCE-7308-MR-6749.001.patch
[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest
[ https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated MAPREDUCE-7308:
--------------------------------
    Status: Patch Available  (was: Open)
[jira] [Assigned] (MAPREDUCE-6784) JobImpl state changes for containers reuse
[ https://issues.apache.org/jira/browse/MAPREDUCE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T reassigned MAPREDUCE-6784:
-----------------------------------

    Assignee: Bilwa S T  (was: Devaraj Kavali)

> JobImpl state changes for containers reuse
> ------------------------------------------
>
>                 Key: MAPREDUCE-6784
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6784
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>            Reporter: Devaraj Kavali
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-6784-v0.patch
>
>
> Add JobImpl state changes for supporting reusing of containers.
[jira] [Commented] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher
[ https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237443#comment-17237443 ]

Steve Loughran commented on MAPREDUCE-7307:
-------------------------------------------

merged to trunk, just doing backport (including move to unshaded guava) & retest of the new test case
[jira] [Commented] (MAPREDUCE-6786) TaskAttemptImpl state changes for containers reuse
[ https://issues.apache.org/jira/browse/MAPREDUCE-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237293#comment-17237293 ]

Bilwa S T commented on MAPREDUCE-6786:
--------------------------------------

Hi [~devaraj] [~brahmareddy]

Can you please review this patch?

> TaskAttemptImpl state changes for containers reuse
> --------------------------------------------------
>
>                 Key: MAPREDUCE-6786
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6786
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>            Reporter: Devaraj Kavali
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-6786-MR-6749.001.patch, MAPREDUCE-6786-v0.patch, MAPREDUCE-6786.001.patch, MAPREDUCE-6786.002.patch
>
>
> Update TaskAttemptImpl to support the reuse of containers.
[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest
[ https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated MAPREDUCE-7308:
--------------------------------
    Description: In RMContainerReuseRequestor, whenever containerAssigned is called, it checks whether the allocated container can be reused. This always returns false because the map is cleared on makeRemoteRequest. I think the container can be removed from the containersToReuse map once it's used.  (was: In RMContainerReuseRequestor, whenever containerAssigned is called, it checks whether the allocated container can be reused. This always returns false because the map is cleared on makeRemoteRequest. I think there is no need to clear the map, as the container will be removed from the containersToReuse map once it's used.)