[ 
https://issues.apache.org/jira/browse/TEZ-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268331#comment-14268331
 ] 

Siddharth Seth commented on TEZ-1913:
-------------------------------------

Questions/comments on the patch.
- The ValuesIterator is used by the Combiners as well. I'm not sure the result 
of a merge (a RawKVIterator which supports isSameKey) is the only iterator which 
will be used in these cases. From the PipelineSorter, there were some 
RawKVIterators which don't implement the method.
- EmptyIterator.isSameKey isn't implemented - don't think we'll ever hit this, 
but the Merger can return an instance of this. Should probably change this to 
return false.
- Add a test in TestValuesIterator to validate that same keys work.
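
To make the points above concrete, here is a rough sketch of the behaviour being asked for. It is not the Tez code: RawIter, EmptyIter, ListIter, NoSameKeyIter, sameKeyOrFalse and countGroups are hypothetical names invented for illustration, and plain String keys stand in for serialized keys. It assumes that an iterator without isSameKey support throws UnsupportedOperationException, and that "false" is a safe answer because it only forces a normal deserialize-and-compare.

```java
import java.util.Arrays;
import java.util.List;

public class SameKeySketch {
    /** Hypothetical stand-in for a raw key/value iterator; not the Tez interface. */
    interface RawIter {
        boolean next();
        String key();        // stands in for deserializing the current key
        boolean isSameKey(); // true if the current key equals the previous one
    }

    /** Empty iterator: isSameKey returns false instead of being unimplemented. */
    static final class EmptyIter implements RawIter {
        public boolean next() { return false; }
        public String key() { throw new IllegalStateException("no current key"); }
        public boolean isSameKey() { return false; }
    }

    /** List-backed iterator that counts how often a key is "deserialized". */
    static class ListIter implements RawIter {
        private final List<String> keys;
        private int pos = -1;
        int deserializations = 0;
        ListIter(List<String> keys) { this.keys = keys; }
        public boolean next() { return ++pos < keys.size(); }
        public String key() { deserializations++; return keys.get(pos); }
        public boolean isSameKey() {
            return pos > 0 && keys.get(pos).equals(keys.get(pos - 1));
        }
    }

    /** Iterator without isSameKey support (assumed to throw). */
    static final class NoSameKeyIter extends ListIter {
        NoSameKeyIter(List<String> keys) { super(keys); }
        @Override public boolean isSameKey() { throw new UnsupportedOperationException(); }
    }

    /** Falls back to "not known to be the same" when the method is unsupported. */
    static boolean sameKeyOrFalse(RawIter it) {
        try {
            return it.isSameKey();
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    /** Counts key groups, deserializing a key only when it may have changed. */
    static int countGroups(RawIter it) {
        int groups = 0;
        String current = null;
        while (it.next()) {
            if (current != null && sameKeyOrFalse(it)) {
                continue; // same key as before: skip deserialization
            }
            String k = it.key();
            if (!k.equals(current)) {
                groups++;
                current = k;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "a", "b", "b", "b", "c");
        ListIter fast = new ListIter(keys);
        NoSameKeyIter slow = new NoSameKeyIter(keys);
        System.out.println(countGroups(fast) + " groups, " + fast.deserializations + " deserializations");
        System.out.println(countGroups(slow) + " groups, " + slow.deserializations + " deserializations");
        System.out.println(countGroups(new EmptyIter()) + " groups for the empty iterator");
    }
}
```

A test along these lines would check that both iterators yield the same groups, and that only the one supporting isSameKey avoids the extra deserializations.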

> Reduce deserialize cost in ValuesIterator
> -----------------------------------------
>
>                 Key: TEZ-1913
>                 URL: https://issues.apache.org/jira/browse/TEZ-1913
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: perfomance
>         Attachments: TEZ-1913.1.patch
>
>
> When TezRawKeyValueIterator->isSameKey() is added, it should be possible to 
> reduce the number of deserializations in ValuesIterator->readNextKey().
> Creating this ticket to track the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
