[ 
https://issues.apache.org/jira/browse/TEZ-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1767:
----------------------------------
    Attachment: TEZ-1767.5.patch
                TEZ-1767.WIP.5.patch

>>>>>
compareKeyWithNextTopKey - I think this needs to check for top() being null (or 
equivalently size() ==0). Maybe it's inherently handled by passing null as the 
current reader.
>>>>>
Right - this is already taken care of by passing null as the current reader.


>>>>
isSameKey - should this be throwing an UnsupportedOperationException where it 
isn't supported?
>>>>
Done.  Changed in DefaultSorter, PipelinedSorter, etc.
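
For illustration only, here is a minimal sketch of the pattern, not the code from the patch. The names SameKeySketch, KeyIterator, TrackingIterator and PlainIterator are hypothetical; only isSameKey and UnsupportedOperationException come from the discussion above. An iterator that tracks the previous key answers isSameKey(), while one that does not track it fails loudly instead of returning a misleading false.

```java
import java.util.Arrays;
import java.util.List;

class SameKeySketch {

  /** Hypothetical iterator interface; only isSameKey() is named in the review. */
  interface KeyIterator {
    boolean next();
    String key();
    boolean isSameKey();
  }

  /** Remembers its position, so it can compare the current key with the previous one. */
  static class TrackingIterator implements KeyIterator {
    private final List<String> keys;
    private int pos = -1;

    TrackingIterator(List<String> keys) { this.keys = keys; }

    public boolean next() { return ++pos < keys.size(); }

    public String key() { return keys.get(pos); }

    public boolean isSameKey() {
      // The first key has no predecessor, so it is never a "same key".
      return pos > 0 && keys.get(pos).equals(keys.get(pos - 1));
    }
  }

  /** Does not support same-key detection; throws rather than guessing. */
  static class PlainIterator implements KeyIterator {
    private final List<String> keys;
    private int pos = -1;

    PlainIterator(List<String> keys) { this.keys = keys; }

    public boolean next() { return ++pos < keys.size(); }

    public String key() { return keys.get(pos); }

    public boolean isSameKey() {
      throw new UnsupportedOperationException(
          "isSameKey is not supported by PlainIterator");
    }
  }

  public static void main(String[] args) {
    KeyIterator it = new TrackingIterator(Arrays.asList("a", "a", "b"));
    while (it.next()) {
      System.out.println(it.key() + " sameKey=" + it.isSameKey());
    }
    try {
      new PlainIterator(Arrays.asList("a")).isSameKey();
    } catch (UnsupportedOperationException e) {
      System.out.println("unsupported: " + e.getMessage());
    }
  }
}
```

Throwing makes an unsupported call visible at the call site, whereas silently returning false could hide a caller that wrongly relies on same-key detection.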

>>>>>
Is this too verbose, even if only in DEBUG mode? This could be logged every N 
records, and once at the end of the merge instead.
>>>>>
This is printed once per writeFile call, not inside the while loop, so it 
wouldn't be too verbose in debug mode.  Please let me know if it still needs 
to be trimmed.


>>>>>
        Logger should be TestTezMerger instead of TestIFile
        LocalDir should be under the workDir as well, so that it gets cleaned 
up when the test completes.
>>>>
Done

>>>>
        Lots of System.out - which should be removed / changed to LOG messages
>>>>
Changed to LOG messages.  Mainly added them for debugging.


>>>>
It'll be good to have a test which explicitly exercises SAME_KEY checks / corner 
cases, rather than leaving it to random()
>>>>
Done.
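
As a rough idea of what explicit corner cases can look like, here is a self-contained sketch. It is not the test added in the patch: RleKeySketch, encode, decode and the string SAME_KEY marker are all hypothetical stand-ins for the real run-length encoding of repeated keys. The inputs in main are fixed, so each same-key case is hit deterministically instead of depending on random data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RleKeySketch {

  // Hypothetical marker. A real format would use an out-of-band value,
  // since a string sentinel could collide with a genuine key.
  static final String SAME_KEY = "<SAME_KEY>";

  /** Replaces each key that equals its predecessor with the SAME_KEY marker. */
  static List<String> encode(List<String> sortedKeys) {
    List<String> out = new ArrayList<>();
    String prev = null;
    for (String k : sortedKeys) {
      out.add(k.equals(prev) ? SAME_KEY : k);
      prev = k;
    }
    return out;
  }

  /** Expands SAME_KEY markers back into the key that preceded them. */
  static List<String> decode(List<String> encoded) {
    List<String> out = new ArrayList<>();
    String prev = null;
    for (String k : encoded) {
      if (SAME_KEY.equals(k)) {
        if (prev == null) {
          throw new IllegalStateException("SAME_KEY marker with no preceding key");
        }
        out.add(prev);
      } else {
        out.add(k);
        prev = k;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Explicit corner cases: empty input, single key, all keys equal,
    // duplicates at the start, duplicates at the end, no duplicates.
    List<List<String>> cases = Arrays.asList(
        Collections.<String>emptyList(),
        Arrays.asList("a"),
        Arrays.asList("a", "a", "a"),
        Arrays.asList("a", "a", "b"),
        Arrays.asList("a", "b", "b"),
        Arrays.asList("a", "b", "c"));
    for (List<String> keys : cases) {
      List<String> roundTrip = decode(encode(keys));
      if (!roundTrip.equals(keys)) {
        throw new AssertionError("round trip failed for " + keys);
      }
    }
    System.out.println("all corner cases round-trip");
  }
}
```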


> Enable RLE in reducer side merge codepath
> -----------------------------------------
>
>                 Key: TEZ-1767
>                 URL: https://issues.apache.org/jira/browse/TEZ-1767
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1767.1.patch, TEZ-1767.2.patch, TEZ-1767.3.patch, 
> TEZ-1767.4.patch, TEZ-1767.5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
