[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628855#comment-13628855
 ] 

Gelesh commented on MAPREDUCE-4974:
-----------------------------------

[~jira.shegalov], thanks for sharing your thoughts.

I have tested the patch with a JUnit run of TestLineRecordReader, but as of now 
TestLineRecordReader does not include a test case for compressed input. That is a 
place we need to cross-check, but I hope the code will hold good, because the 
modification in this area is minimal.
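
For illustration only, the shape of a compressed-input round-trip check could 
look like the sketch below. It is not part of TestLineRecordReader: it uses 
java.util.zip in place of Hadoop's codecs, and the class and method names 
(CompressedLineRoundTrip, gzip, readLines) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for a compressed-input test; uses java.util.zip, not Hadoop codecs.
class CompressedLineRoundTrip {

    // Gzip the given text into an in-memory byte array.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress the bytes and split them into lines, as a line reader would see them.
    static List<String> readLines(byte[] compressed) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The lines read back should match the lines that were compressed.
        System.out.println(readLines(gzip("first\nsecond\nthird\n")));
    }
}
```

A real test would write the file with a Hadoop codec and read it back through 
LineRecordReader; the sketch only shows the compress, read, compare pattern.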

The aim was to improve performance by removing the null check, but this may 
cause an NPE in any code built upon the existing behaviour, as discussed above 
in [~snihalani]'s comments.
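
The trade-off can be sketched roughly as follows. This is a simplified, 
hypothetical stand-in (EagerKeyValueReader, with a long[] and a StringBuilder 
in place of LongWritable and Text), not Hadoop's LineRecordReader. It assumes 
key and value are allocated once in initialize(), so nextKeyValue() carries no 
null check, and any caller that skips initialize() hits an NPE.

```java
// Hypothetical, simplified stand-in for a record reader; not Hadoop's LineRecordReader.
class EagerKeyValueReader {
    private final String[] lines;
    private int index;
    private long[] key;            // stand-in for LongWritable, null until initialize()
    private StringBuilder value;   // stand-in for Text, null until initialize()

    EagerKeyValueReader(String[] lines) {
        this.lines = lines;
    }

    // Allocate key and value exactly once, so nextKeyValue() needs no null checks.
    void initialize() {
        key = new long[1];
        value = new StringBuilder();
    }

    // No per-record null check: throws NullPointerException if initialize() was skipped.
    boolean nextKeyValue() {
        if (index >= lines.length) {
            return false;
        }
        key[0] = index;
        value.setLength(0);
        value.append(lines[index]);
        index++;
        return true;
    }

    long getCurrentKey() {
        return key[0];
    }

    String getCurrentValue() {
        return value.toString();
    }

    public static void main(String[] args) {
        EagerKeyValueReader reader = new EagerKeyValueReader(new String[] {"a", "b"});
        reader.initialize();
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```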

The patch was limited to:
1) removing the null assignments for the key & value
2) limiting CompressionCodecFactory and the codec to method-local scope
3) removing lines 170-173

     if (newSize == 0) {
       break;
     }
    This is an unnecessary == 0 check inside a loop. The code to handle this 
case is already there outside the loop, so the check inside the loop seems to 
add no value.

4) in order to achieve point 2, a private boolean isCompressedInput variable 
was introduced instead of the private boolean isCompressedInput() method.
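
Points 2 to 4 above can be sketched roughly as below. This is a simplified, 
hypothetical stand-in (ScopedCodecReader, findCodec, readLine over an int 
array), not the actual patch. It assumes maxLineLength is positive, so a zero 
newSize still leaves the loop through the "newSize < maxLineLength" branch.

```java
// Hypothetical, simplified stand-in; names mirror the comment, not Hadoop's source.
class ScopedCodecReader {
    // Point 4: a field replaces the isCompressedInput() method, because the
    // codec is no longer kept as a field that such a method could inspect.
    private boolean isCompressedInput;
    private final int maxLineLength;
    private int[] lineSizes;   // stand-in for the sizes a real readLine() would return
    private int next;
    private long pos;

    ScopedCodecReader(int maxLineLength) {
        if (maxLineLength <= 0) {
            throw new IllegalArgumentException("maxLineLength must be positive");
        }
        this.maxLineLength = maxLineLength;
    }

    // Stand-in for a codec lookup by file name: null means "not compressed".
    static String findCodec(String fileName) {
        return fileName.endsWith(".gz") ? "gzip" : null;
    }

    void initialize(String fileName, int[] sizes) {
        // Point 2: the codec is a local variable and does not outlive initialize().
        String codec = findCodec(fileName);
        isCompressedInput = (codec != null);
        lineSizes = sizes;
        next = 0;
        pos = 0;
    }

    // Returns the next line size, or 0 once the input is exhausted.
    private int readLine() {
        return next < lineSizes.length ? lineSizes[next++] : 0;
    }

    // Point 3: no "newSize == 0" break inside the loop.
    boolean nextKeyValue() {
        int newSize;
        while (true) {
            newSize = readLine();
            pos += newSize;
            if (newSize < maxLineLength) {
                break;   // also taken when newSize == 0, since maxLineLength > 0
            }
            // Line too long: skip it and try the next one.
        }
        return newSize != 0;   // the remaining check, outside the loop
    }

    boolean isCompressed() {
        return isCompressedInput;
    }

    long getPos() {
        return pos;
    }
}
```

With the zero check removed, an end-of-input read adds 0 to pos and exits 
through the existing length comparison, so the check after the loop alone 
decides the return value.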
                
> Optimising the LineRecordReader initialize() method
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-4974
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, mrv2, performance
>    Affects Versions: 2.0.2-alpha, 0.23.5
>         Environment: Hadoop Linux
>            Reporter: Arun A K
>            Assignee: Gelesh
>              Labels: patch, performance
>             Fix For: 0.23.7, 2.0.5-beta
>
>         Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch, 
> MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I found there is scope for optimizing the code in initialize() if we 
> instantiate compressionCodecs & codec only when the input is compressed.
> Meanwhile, Gelesh George Omathil suggested that we could avoid the null check 
> of key & value. This would save time, since the null check is done for every 
> next key/value generation. The intention is to instantiate only once and 
> avoid NPE as well. Hopefully both can be met if we initialize key & value in 
> the initialize() method. We have both worked on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
