[ https://issues.apache.org/jira/browse/MAPREDUCE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611949#comment-17611949 ]
ASF GitHub Bot commented on MAPREDUCE-7370:
-------------------------------------------

aajisaka commented on code in PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#discussion_r985124236


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java:
##########
@@ -345,6 +356,11 @@ public static boolean getCountersEnabled(JobContext job) {
     return job.getConfiguration().getBoolean(COUNTERS_ENABLED, false);
   }
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter<?, ?>> recordWriters) {

Review Comment:
   Would you make it package-private?


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -527,9 +557,42 @@ public void collect(Object key, Object value) throws IOException {
    * @throws java.io.IOException thrown if any of the MultipleOutput files
    * could not be closed properly.
    */
-  public void close() throws IOException {
+  public void close() throws IOException, InterruptedException {

Review Comment:
   `InterruptedException` is not thrown in this method and should be removed. This class is annotated `@Public`, and the change may cause a compile error in existing callers.

   Also, we can remove the code below from the test code.

   ```
   try {
     mos.close();
   } catch (InterruptedException e) {
     throw new RuntimeException(e);
   }
   ```


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/MultipleOutputs.java:
##########
@@ -381,6 +406,11 @@ public static boolean getCountersEnabled(JobConf conf) {
   private Map<String, RecordWriter> recordWriters;
   private boolean countersEnabled;
+  @VisibleForTesting
+  public synchronized void setRecordWriters(Map<String, RecordWriter> recordWriters) {

Review Comment:
   Would you make this method package-private?

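
For illustration only, here is a minimal sketch of how a parallel close can keep the `throws IOException` signature that the review comment on `close()` asks to preserve. It is not the code from PR #4248: `ParallelCloser`, `closeAll`, and the `threads` parameter are hypothetical names, and plain `java.io.Closeable` stands in for Hadoop's `RecordWriter` so that the example compiles without Hadoop on the classpath. An interrupt while waiting is converted to `java.io.InterruptedIOException` (a subclass of `IOException`), so callers never see a new checked exception.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not part of Hadoop): closes resources in parallel
// while exposing only IOException to callers.
final class ParallelCloser {
  private ParallelCloser() {}

  static void closeAll(Collection<? extends Closeable> resources, int threads)
      throws IOException {
    if (resources.isEmpty()) {
      return;
    }
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
    try {
      // Submit one close task per resource.
      List<Future<Void>> pending = new ArrayList<>();
      for (Closeable resource : resources) {
        Callable<Void> task = () -> {
          resource.close();
          return null;
        };
        pending.add(pool.submit(task));
      }
      // Wait for every task; remember the first failure, suppress the rest.
      IOException first = null;
      for (Future<Void> future : pending) {
        try {
          future.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          IOException ioe = (cause instanceof IOException)
              ? (IOException) cause : new IOException(cause);
          if (first == null) {
            first = ioe;
          } else if (ioe != first) {
            first.addSuppressed(ioe);
          }
        } catch (InterruptedException e) {
          // Restore the interrupt flag and report it as an IOException subtype.
          Thread.currentThread().interrupt();
          InterruptedIOException iioe =
              new InterruptedIOException("Interrupted while closing resources");
          iioe.initCause(e);
          throw iioe;
        }
      }
      if (first != null) {
        throw first;
      }
    } finally {
      // All tasks are finished or abandoned at this point.
      pool.shutdownNow();
    }
  }
}
```

With a helper of this shape, a caller would write `ParallelCloser.closeAll(writers, 4);` inside a method that declares only `throws IOException`, which is why the `try`/`catch (InterruptedException e)` wrapper in the test becomes unnecessary.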
> Parallelize MultipleOutputs#close call
> --------------------------------------
>
> Key: MAPREDUCE-7370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7370
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Ashutosh Gupta
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> This call takes longer when there are a lot of files to close and each close
> has high latency. Parallelize the MultipleOutputs#close call to improve the
> speed.
> {code}
> public void close() throws IOException {
>   for (RecordWriter writer : recordWriters.values()) {
>     writer.close(null);
>   }
> }
> {code}
> The idea is from [~ste...@apache.org]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org