[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583611#comment-17583611
 ] 

ASF GitHub Bot commented on MAPREDUCE-7370:
-------------------------------------------

ashutoshcipher commented on PR #4248:
URL: https://github.com/apache/hadoop/pull/4248#issuecomment-1224052518

   Thanks @steveloughran for checking.
   
   > Inconsistent synchronization of org.apache.hadoop.mapred.lib.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 412]
   
   This warning comes from the following method, which is visible only for 
testing and is not meant to be used by production code.
   
   ```
     @VisibleForTesting
     public void setRecordWriters(Map<String, RecordWriter> recordWriters) {
       this.recordWriters = recordWriters;
     }
   ```
   
   > Inconsistent synchronization of org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.recordWriters; locked 66% of time. Unsynchronized access at MultipleOutputs.java:[line 360]
   
   This warning is also caused by the equivalent method, which is visible only 
for testing and is not meant to be used by production code. I missed marking it 
as `VisibleForTesting`, which I will do in the next commit.
   
   ```
     public void setRecordWriters(Map<String, RecordWriter<?, ?>> recordWriters) {
       this.recordWriters = recordWriters;
     }
   ```
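
For reference, here is a minimal sketch of another way such a warning could be avoided, assuming the other accesses to the field hold the instance lock: the test-only setter takes the same lock. This is not what the PR does. `WritersHolderSketch`, `getOrCreate` and `size` are hypothetical stand-ins, not Hadoop APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a class like MultipleOutputs; only the
// locking pattern is of interest here.
final class WritersHolderSketch<W> {
  private Map<String, W> recordWriters = new HashMap<>();

  // Normal access path: reads and updates the map under the instance lock.
  synchronized W getOrCreate(String name, Supplier<W> factory) {
    return recordWriters.computeIfAbsent(name, k -> factory.get());
  }

  // Test-only setter. Taking the same lock keeps every access to the
  // field consistently synchronized.
  synchronized void setRecordWriters(Map<String, W> recordWriters) {
    this.recordWriters = recordWriters;
  }

  synchronized int size() {
    return recordWriters.size();
  }
}
```

Whether the extra lock is worth it for a method that only tests call is a judgment call; annotating the setter and excluding the warning is the other option.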
   
   Let me know if I need to address anything else.
   

> Parallelize MultipleOutputs#close call
> --------------------------------------
>
>                 Key: MAPREDUCE-7370
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7370
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: groot
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> This call takes a long time when there are many files to close and each close 
> has high latency. Parallelize the MultipleOutputs#close call to improve the 
> speed.
> {code}
>   public void close() throws IOException {
>     for (RecordWriter writer : recordWriters.values()) {
>       writer.close(null);
>     }
>   }
> {code}
> The idea is from [~ste...@apache.org]
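
The parallel close described in the issue could look roughly like the sketch below. It is not the code from PR #4248. It uses `java.io.Closeable` as a stand-in for Hadoop's `RecordWriter` (whose `close` takes a `Reporter` or `TaskAttemptContext` argument), and `ParallelCloseSketch`, `closeAll` and the `threads` parameter are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: closes many writers concurrently instead of one by one.
final class ParallelCloseSketch {

  // Submits one close task per writer to a bounded pool, waits for all of
  // them, then rethrows the first failure with later ones as suppressed.
  static void closeAll(Collection<? extends Closeable> writers, int threads)
      throws IOException {
    if (writers.isEmpty()) {
      return;
    }
    int poolSize = Math.max(1, Math.min(threads, writers.size()));
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (Closeable writer : writers) {
        // Returning null makes the lambda a Callable, so close() may throw.
        pending.add(pool.submit(() -> {
          writer.close();
          return null;
        }));
      }
      IOException first = null;
      for (Future<?> future : pending) {
        try {
          future.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          IOException io = cause instanceof IOException
              ? (IOException) cause : new IOException(cause);
          if (first == null) {
            first = io;
          } else {
            first.addSuppressed(io);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while closing writers", e);
        }
      }
      if (first != null) {
        throw first;
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

Waiting on every future before rethrowing means one failing writer does not stop the others from being closed, which matches the behavior one would usually want from a bulk close.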



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
