[jira] [Commented] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher

2020-11-23 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237782#comment-17237782
 ] 

Zhihua Deng commented on MAPREDUCE-7307:


Thanks very much for the help and review.

> Potential thread leak in LocatedFileStatusFetcher
> -
>
> Key: MAPREDUCE-7307
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7307
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
> Fix For: 3.3.1
>
>
> We see that when LocatedFileStatusFetcher is used to fetch file information 
> in parallel and the listStatus thread is interrupted, the executor service in 
> LocatedFileStatusFetcher is never shut down. The leaked thread's stack looks like this:
> {noformat}
> "GetFileInfo #63" #125 daemon prio=5 os_prio=0 tid=0x7f6198106800 nid=0x881 waiting on condition [0x7f60d9fde000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x82e810a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This happens because, if condition.await() throws InterruptedException, 
> `shutdownNow` is never called on the executor service. The resource-releasing 
> call should be moved into a finally block.
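
The pattern described in the report can be sketched roughly as follows. This is a minimal illustration, not the Hadoop class or the committed patch: `FetcherSketch`, `awaitResults`, `markFinished`, and `isPoolShutdown` are hypothetical names, and the pool size is arbitrary. The only point it shows is that placing `shutdownNow()` in a `finally` block releases the pool threads even when `await()` is interrupted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the fetcher; NOT the Hadoop class or the committed patch.
class FetcherSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private boolean finished = false; // guarded by lock

    FetcherSketch() {
        // Run one no-op task so a pool worker thread exists and then idles
        // waiting for work, as in the reported stack trace.
        pool.execute(() -> { });
    }

    // Blocks until markFinished() is called. The pool is shut down on every
    // exit path, including an InterruptedException thrown by await().
    void awaitResults() throws InterruptedException {
        try {
            lock.lock();
            try {
                while (!finished) {
                    done.await(); // may throw InterruptedException
                }
            } finally {
                lock.unlock();
            }
        } finally {
            pool.shutdownNow(); // runs on normal return and on interruption
        }
    }

    void markFinished() {
        lock.lock();
        try {
            finished = true;
            done.signalAll();
        } finally {
            lock.unlock();
        }
    }

    boolean isPoolShutdown() {
        return pool.isShutdown();
    }

    public static void main(String[] args) {
        FetcherSketch fetcher = new FetcherSketch();
        Thread.currentThread().interrupt(); // simulate an interrupted caller
        try {
            fetcher.awaitResults();
        } catch (InterruptedException e) {
            System.out.println("interrupted; pool shut down: " + fetcher.isPoolShutdown());
        }
    }
}
```

Without the outer `finally`, the interrupted call would propagate the exception and leave the worker threads parked in `LinkedBlockingQueue.take()`.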



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config

2020-11-23 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237689#comment-17237689
 ] 

Wangda Tan commented on MAPREDUCE-7309:
---

Thanks [~pbacsko], the latest patch looks better than v2. Can you check the 
checkstyle and junit results and see whether they are related to the patch?

> Improve performance of reading resource request for mapper/reducers from 
> config
> ---
>
> Key: MAPREDUCE-7309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7309
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 3.0.0, 3.1.0, 3.2.0, 3.3.0
>Reporter: Wangda Tan
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: MAPREDUCE-7309-003.patch, MAPREDUCE-7309.001.patch, 
> MAPREDUCE-7309.002.patch
>
>
> This issue could affect all releases that include YARN-6927. 
> Basically, we run a regex match repeatedly when we read mapper/reducer resource 
> requests from config files. With a large config file and a large number 
> of splits, this can take a long time.  
> We saw the AM take hours to parse the config with 200k+ splits and a 
> large config file (hundreds of KB). 
> We should properly cache pre-configured resource requests.
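
One way to picture the proposed caching is sketched below, under loud assumptions: a plain `Map<String, String>` stands in for the job configuration, `ResourceRequestCacheSketch`, `resourcesFor`, and `scanCount` are hypothetical names, and the property keys in the usage example are illustrative only. It is not the attached patch. The idea is simply to run the regex scan over the config once per prefix and serve every later lookup from a cache.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; a plain Map stands in for the job configuration.
class ResourceRequestCacheSketch {
    private final Map<String, String> conf;
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger scans = new AtomicInteger();

    ResourceRequestCacheSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    // Returns the resource settings under the given key prefix. The full
    // regex scan of the config runs only on the first call per prefix.
    Map<String, String> resourcesFor(String prefix) {
        return cache.computeIfAbsent(prefix, this::scan);
    }

    // The expensive part: one regex match per config entry.
    private Map<String, String> scan(String prefix) {
        scans.incrementAndGet();
        Pattern pattern = Pattern.compile("^" + Pattern.quote(prefix) + "(.+)$");
        Map<String, String> found = new HashMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            Matcher matcher = pattern.matcher(entry.getKey());
            if (matcher.matches()) {
                found.put(matcher.group(1), entry.getValue());
            }
        }
        return Collections.unmodifiableMap(found);
    }

    // Number of full config scans performed so far.
    int scanCount() {
        return scans.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.map.resource.memory-mb", "2048"); // illustrative keys
        conf.put("mapreduce.map.resource.vcores", "2");
        ResourceRequestCacheSketch cache = new ResourceRequestCacheSketch(conf);
        for (int split = 0; split < 200_000; split++) {
            cache.resourcesFor("mapreduce.map.resource.");
        }
        System.out.println("config scans: " + cache.scanCount());
    }
}
```

With the cache in place the cost per split is a single map lookup, so the total work no longer grows with the product of split count and config size.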






[jira] [Assigned] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config

2020-11-23 Thread Wangda Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned MAPREDUCE-7309:
-

Assignee: Peter Bacsko  (was: Wangda Tan)







[jira] [Resolved] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher

2020-11-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved MAPREDUCE-7307.
---
Fix Version/s: 3.3.1
   Resolution: Fixed







[jira] [Commented] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest

2020-11-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237462#comment-17237462
 ] 

Hadoop QA commented on MAPREDUCE-7308:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} |  | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 42s{color} |  | {color:red} Docker failed to build yetus/hadoop:9560f252cf1. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | MAPREDUCE-7308 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13015860/MAPREDUCE-7308-MR-6749.001.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-MAPREDUCE-Build/38/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> Containers never get reused as containersToReuse map gets cleared on 
> makeRemoteRequest
> --
>
> Key: MAPREDUCE-7308
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7308
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: MAPREDUCE-7308-MR-6749.001.patch
>
>
> In RMContainerReuseRequestor, whenever containerAssigned is called, it checks 
> whether the allocated container can be reused. This always returns false because the map is 
> cleared on makeRemoteRequest. I think a container can be removed from the 
> containersToReuse map once it's used.
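
The behaviour proposed above might look roughly like the sketch below. It does not reproduce the real RMContainerReuseRequestor: `ReuseTrackerSketch`, `offerForReuse`, and `tryReuse` are hypothetical names, containers are represented by plain strings, and `makeRemoteRequest` here is an empty placeholder that only marks where the map used to be cleared. The point is that an entry leaves the map when it is consumed, not on every request cycle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the real RMContainerReuseRequestor.
class ReuseTrackerSketch {
    // containerId -> host, for containers that finished a task and may be reused
    private final Map<String, String> containersToReuse = new ConcurrentHashMap<>();

    void offerForReuse(String containerId, String host) {
        containersToReuse.put(containerId, host);
    }

    // Placeholder for the periodic request cycle. It deliberately does NOT
    // clear containersToReuse, so pending entries survive until they are used.
    void makeRemoteRequest() {
        // request logic omitted in this sketch
    }

    // Returns true if the container was offered for reuse, and removes the
    // entry so the same container is not handed out twice.
    boolean tryReuse(String containerId) {
        return containersToReuse.remove(containerId) != null;
    }

    public static void main(String[] args) {
        ReuseTrackerSketch tracker = new ReuseTrackerSketch();
        tracker.offerForReuse("container_01", "host-a");
        tracker.makeRemoteRequest();
        System.out.println("first attempt:  " + tracker.tryReuse("container_01"));
        System.out.println("second attempt: " + tracker.tryReuse("container_01"));
    }
}
```

Removing the entry inside `tryReuse` keeps the check and the removal in one step, which avoids a window where two callers could both see the same container as reusable.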






[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest

2020-11-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated MAPREDUCE-7308:
-
Attachment: MAPREDUCE-7308-MR-6749.001.patch







[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest

2020-11-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated MAPREDUCE-7308:
-
Status: Patch Available  (was: Open)







[jira] [Assigned] (MAPREDUCE-6784) JobImpl state changes for containers reuse

2020-11-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned MAPREDUCE-6784:


Assignee: Bilwa S T  (was: Devaraj Kavali)

> JobImpl state changes for containers reuse
> --
>
> Key: MAPREDUCE-6784
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6784
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: applicationmaster, mrv2
>Reporter: Devaraj Kavali
>Assignee: Bilwa S T
>Priority: Major
> Attachments: MAPREDUCE-6784-v0.patch
>
>
> Add JobImpl state changes to support container reuse.






[jira] [Commented] (MAPREDUCE-7307) Potential thread leak in LocatedFileStatusFetcher

2020-11-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237443#comment-17237443
 ] 

Steve Loughran commented on MAPREDUCE-7307:
---

Merged to trunk; now doing the backport (including the move to unshaded Guava) 
and retesting the new test case.

> Potential thread leak in LocatedFileStatusFetcher
> -
>
> Key: MAPREDUCE-7307
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7307
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major






[jira] [Commented] (MAPREDUCE-6786) TaskAttemptImpl state changes for containers reuse

2020-11-23 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237293#comment-17237293
 ] 

Bilwa S T commented on MAPREDUCE-6786:
--

Hi [~devaraj] [~brahmareddy]

Can you please review this patch?

> TaskAttemptImpl state changes for containers reuse
> --
>
> Key: MAPREDUCE-6786
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6786
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: applicationmaster, mrv2
>Reporter: Devaraj Kavali
>Assignee: Bilwa S T
>Priority: Major
> Attachments: MAPREDUCE-6786-MR-6749.001.patch, 
> MAPREDUCE-6786-v0.patch, MAPREDUCE-6786.001.patch, MAPREDUCE-6786.002.patch
>
>
> Update TaskAttemptImpl to support the reuse of containers.






[jira] [Updated] (MAPREDUCE-7308) Containers never get reused as containersToReuse map gets cleared on makeRemoteRequest

2020-11-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated MAPREDUCE-7308:
-
Description: In RMContainerReuseRequestor, whenever containerAssigned is 
called, it checks whether the allocated container can be reused. This always returns 
false because the map is cleared on makeRemoteRequest. I think a container can 
be removed from the containersToReuse map once it's used.  (was: In 
RMContainerReuseRequestor, whenever containerAssigned is called, it checks whether the 
allocated container can be reused. This always returns false because the map is 
cleared on makeRemoteRequest. I think there is no need to clear the map, 
as a container will be removed from the containersToReuse map once it's used.)

> Containers never get reused as containersToReuse map gets cleared on 
> makeRemoteRequest
> --
>
> Key: MAPREDUCE-7308
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7308
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major


