[jira] [Commented] (MAPREDUCE-7369) MapReduce tasks timing out when spends more time on MultipleOutputs#close
[ https://issues.apache.org/jira/browse/MAPREDUCE-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450668#comment-17450668 ]

Prabhu Joseph commented on MAPREDUCE-7369:
------------------------------------------

bq. Have you thought about also parallelising the close so that the different outputs can be closed simultaneously? That will improve the speed.

Have reported [MapReduce-7370|https://issues.apache.org/jira/browse/MAPREDUCE-7370] to handle the same. Thanks.

> MapReduce tasks timing out when spends more time on MultipleOutputs#close
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7369
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7369
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.3.1
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>
> MapReduce tasks time out when they spend a long time in MultipleOutputs#close.
> MultipleOutputs#close takes longer when there are multiple files to be
> closed and closing a stream has high latency.
> {code}
> 2021-11-01 02:45:08,312 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1634949471086_61268_m_001115_0: AttemptID:attempt_1634949471086_61268_m_001115_0 Timed out after 300 secs
> {code}
> The MapReduce task timeout can be increased, but it is hard to choose the
> right value. The timeout can be disabled by setting it to 0, but then hanging
> tasks might never be killed.
> The tasks send a ping every 3 seconds, but the ApplicationMaster does not
> honor these pings. It expects status information, which is not sent
> during MultipleOutputs#close. This jira adds a config that makes the
> ApplicationMaster count the ping from a task as part of its Task Liveliness
> Check.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
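
The liveliness check described in MAPREDUCE-7369 can be pictured roughly as follows. This is a minimal sketch, not the ApplicationMaster's actual code: the class `LivenessSketch`, its methods, and the `pingCountsAsProgress` flag are hypothetical names chosen for illustration. Only the 300 second timeout, the 3 second ping interval, and the meaning of a timeout of 0 come from the report.

```java
/**
 * Illustrative model of a task liveliness check. All names here are
 * hypothetical; this is not the Hadoop ApplicationMaster implementation.
 */
final class LivenessSketch {
  private final long timeoutMs;               // 0 disables the timeout
  private final boolean pingCountsAsProgress; // hypothetical switch
  private long lastSeenMs;

  LivenessSketch(long timeoutMs, boolean pingCountsAsProgress, long startMs) {
    this.timeoutMs = timeoutMs;
    this.pingCountsAsProgress = pingCountsAsProgress;
    this.lastSeenMs = startMs;
  }

  /** A full status update always refreshes the liveliness timestamp. */
  void onStatusUpdate(long nowMs) {
    lastSeenMs = nowMs;
  }

  /** A bare ping refreshes the timestamp only when the switch is on. */
  void onPing(long nowMs) {
    if (pingCountsAsProgress) {
      lastSeenMs = nowMs;
    }
  }

  /** True when the timeout is enabled and nothing counted as a sign of life in time. */
  boolean isTimedOut(long nowMs) {
    return timeoutMs > 0 && nowMs - lastSeenMs > timeoutMs;
  }

  public static void main(String[] args) {
    for (boolean flag : new boolean[] {false, true}) {
      LivenessSketch task = new LivenessSketch(300_000L, flag, 0L);
      // Simulate a long close(): a ping every 3 s and no status updates.
      for (long t = 3_000L; t <= 303_000L; t += 3_000L) {
        task.onPing(t);
      }
      System.out.println("pings honored=" + flag
          + " timedOut=" + task.isTimedOut(303_000L));
    }
  }
}
```

With the switch off, a task that only pings while it is busy closing streams crosses the timeout. With it on, the same pings keep the task alive, while a task that stops pinging altogether is still detected.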
[jira] [Updated] (MAPREDUCE-7370) Parallelize MultipleOutputs#close call
[ https://issues.apache.org/jira/browse/MAPREDUCE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated MAPREDUCE-7370:
-------------------------------------
    Description:
This call takes longer when there are a lot of files to close and closing has high latency. Parallelize the MultipleOutputs#close call to improve the speed.

{code}
public void close() throws IOException {
  for (RecordWriter writer : recordWriters.values()) {
    writer.close(null);
  }
}
{code}

Idea is from [~ste...@apache.org]

  was:
This call takes longer when there are a lot of files to close and closing has high latency. Parallelize the MultipleOutputs#close call to improve the speed.

{code}
public void close() throws IOException {
  for (RecordWriter writer : recordWriters.values()) {
    writer.close(null);
  }
}
{code}


> Parallelize MultipleOutputs#close call
> --------------------------------------
>
>                 Key: MAPREDUCE-7370
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7370
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Ravuri Sushma sree
>            Priority: Major
>
> This call takes longer when there are a lot of files to close and closing
> has high latency. Parallelize the MultipleOutputs#close call to improve the
> speed.
> {code}
> public void close() throws IOException {
>   for (RecordWriter writer : recordWriters.values()) {
>     writer.close(null);
>   }
> }
> {code}
> Idea is from [~ste...@apache.org]
[jira] [Created] (MAPREDUCE-7370) Parallelize MultipleOutputs#close call
Prabhu Joseph created MAPREDUCE-7370:
----------------------------------------

             Summary: Parallelize MultipleOutputs#close call
                 Key: MAPREDUCE-7370
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7370
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: 3.3.0
            Reporter: Prabhu Joseph
            Assignee: Ravuri Sushma sree


This call takes longer when there are a lot of files to close and closing has high latency. Parallelize the MultipleOutputs#close call to improve the speed.

{code}
public void close() throws IOException {
  for (RecordWriter writer : recordWriters.values()) {
    writer.close(null);
  }
}
{code}
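
One way the sequential loop quoted in MAPREDUCE-7370 could be parallelized is sketched below. It is only an illustration under stated assumptions, not the patch attached to the issue: `ParallelCloseSketch`, `ClosableWriter`, `closeAll`, and the `maxThreads` parameter are hypothetical names, and `ClosableWriter` stands in for Hadoop's `RecordWriter` so that the example compiles with the JDK alone. The sketch submits one close per writer to a bounded thread pool, waits for all of them, and reports failures only after every close has been attempted.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelCloseSketch {

  /** Hypothetical stand-in for RecordWriter#close; not the Hadoop type. */
  interface ClosableWriter {
    void close() throws IOException;
  }

  /**
   * Closes every writer on a bounded thread pool and waits for all of them.
   * The first IOException is rethrown; later ones are attached as suppressed.
   */
  static void closeAll(Collection<? extends ClosableWriter> writers, int maxThreads)
      throws IOException {
    if (writers.isEmpty()) {
      return;
    }
    int threads = Math.max(1, Math.min(maxThreads, writers.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (ClosableWriter writer : writers) {
        tasks.add(() -> {
          writer.close();
          return null;
        });
      }
      IOException failure = null;
      // invokeAll blocks until every task has either finished or failed.
      for (Future<Void> result : pool.invokeAll(tasks)) {
        try {
          result.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          IOException io = cause instanceof IOException
              ? (IOException) cause
              : new IOException(cause);
          if (failure == null) {
            failure = io;
          } else {
            failure.addSuppressed(io);
          }
        }
      }
      if (failure != null) {
        throw failure;
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt and surface it through the IOException contract.
      Thread.currentThread().interrupt();
      throw new InterruptedIOException("interrupted while closing writers");
    } finally {
      pool.shutdown();
    }
  }
}
```

Bounding the pool keeps a task with many outputs from opening an unbounded number of threads. Collecting exceptions instead of failing fast means one slow or broken stream does not leave the remaining writers unclosed.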