[
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628855#comment-13628855
]
Gelesh commented on MAPREDUCE-4974:
-----------------------------------
[~jira.shegalov], thanks for sharing your thoughts.
I have tested with a JUnit run of TestLineRecordReader, but as of now it has no
test case for compressed input. That is a place we need to cross-check, but I
hope the code will hold good, because the modification in this area is minimal.
The aim was to improve performance by removing the null check, but any code
built on the existing behaviour may become incompatible and hit an NPE, as
discussed above (see [~snihalani]'s comments).
The patch was limited to:
1) removing the null assignments for the key & value
2) limiting CompressionCodecFactory and the codec to method-local scope
3) removing lines 170-173:
if (newSize == 0) {
break;
}
This ==0 check inside the loop is unnecessary, because the code that handles
this case is already there outside the loop, so the check inside adds no value.
4) in order to achieve point 2, a
private boolean isCompressedInput
variable was introduced instead of the
private boolean isCompressedInput();
method.
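To make points 1, 2 and 4 concrete, here is a minimal sketch of the pattern. It is not the patch and it does not use the Hadoop classes: SketchLineReader, Offset, lookupCodec() and compressedForDemo() are hypothetical stand-ins, the "codec" is just a string chosen from the file extension, and the offset arithmetic assumes single-character "\n" line terminators.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for LineRecordReader; not the Hadoop class. */
class SketchLineReader {
    /** Mutable key holder, standing in for LongWritable (assumption). */
    static final class Offset { long value; }

    private BufferedReader in;
    // Point 4: a field set once in initialize() replaces the isCompressedInput() method.
    private boolean isCompressedInput;
    // Point 1: key and value are created once in initialize() and then reused,
    // so nextKeyValue() needs no null check and never assigns null to them.
    private Offset key;
    private StringBuilder value;
    private long pos;

    void initialize(String fileName, String contents) {
        // Point 2: the codec lookup result lives only in this method.
        String codec = lookupCodec(fileName);
        isCompressedInput = (codec != null);
        in = new BufferedReader(new StringReader(contents));
        key = new Offset();
        value = new StringBuilder();
        pos = 0;
    }

    /** Hypothetical lookup by file extension; returns null for plain input. */
    static String lookupCodec(String fileName) {
        if (fileName.endsWith(".gz")) return "gzip";
        if (fileName.endsWith(".bz2")) return "bzip2";
        return null;
    }

    boolean nextKeyValue() {
        String line;
        try {
            line = in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (line == null) {
            return false;               // end of input; key and value stay non-null
        }
        key.value = pos;                // reuse the same key object
        value.setLength(0);             // reuse the same value buffer
        value.append(line);
        pos += line.length() + 1;       // assumes a one-character '\n' terminator
        return true;
    }

    Offset getCurrentKey() { return key; }
    String getCurrentValue() { return value.toString(); }
    /** Accessor added only so the sketch can be inspected from outside. */
    boolean compressedForDemo() { return isCompressedInput; }

    public static void main(String[] args) {
        SketchLineReader reader = new SketchLineReader();
        reader.initialize("input.txt", "ab\ncde\n");
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey().value + "\t" + reader.getCurrentValue());
        }
    }
}
```

In this sketch, calling nextKeyValue() before initialize() throws a NullPointerException, which is the same kind of incompatibility risk raised in the comments above.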
> Optimising the LineRecordReader initialize() method
> ---------------------------------------------------
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv1, mrv2, performance
> Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
> Reporter: Arun A K
> Assignee: Gelesh
> Labels: patch, performance
> Fix For: 0.23.7, 2.0.5-beta
>
> Attachments: MAPREDUCE-4974.2.patch, MAPREDUCE-4974.3.patch,
> MAPREDUCE-4974.4.patch, MAPREDUCE-4974.5.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I found there is scope for optimizing the code in initialize() if we
> have compressionCodecs & codec instantiated only if it is a compressed input.
> Meanwhile, Gelesh George Omathil suggested avoiding the null check of
> key & value. This would save time, since the null check is done for every
> next key/value generation. The intention is to instantiate only once and avoid
> an NPE as well. Hopefully both can be met if we initialize key & value in the
> initialize() method. We both have worked on it.