[ https://issues.apache.org/jira/browse/TEZ-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268331#comment-14268331 ]
Siddharth Seth commented on TEZ-1913:
-------------------------------------
Questions/comments on the patch.
- The ValuesIterator is used by the Combiners as well. I'm not sure the result of
a merge (a RawKVIterator which supports isSameKey) is the only iterator which will
be used in these cases. In the PipelineSorter, there were some RawKVIterators
which don't implement the method.
- EmptyIterator.isSameKey isn't implemented - don't think we'll ever hit this,
but the Merger can return an instance of this. Should probably change this to
return false.
- Add a test in TestValuesIterator to validate that same keys work.
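
To make the points above concrete, here is a minimal sketch of the idea, not the
patch itself. RawIter, EmptyRawIter, ListRawIter and KeyReader are hypothetical
stand-ins rather than the actual Tez classes, and a String plays the role of the
serialized key. It assumes every iterator can answer isSameKey(); iterators that
cannot (the PipelineSorter case) would need a fallback that this sketch does not
cover.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a raw key/value iterator; not the real Tez interface. */
interface RawIter {
    boolean next();        // advance to the next record, false when exhausted
    String rawKey();       // stand-in for the serialized key bytes
    boolean isSameKey();   // true if the current key equals the previous one
}

/** Empty iterator: isSameKey() returns false instead of being unimplemented. */
final class EmptyRawIter implements RawIter {
    public boolean next() { return false; }
    public String rawKey() { throw new IllegalStateException("no records"); }
    public boolean isSameKey() { return false; }
}

/** Iterator over an already-sorted array of keys, as a merge result would be. */
final class ListRawIter implements RawIter {
    private final String[] keys;
    private int pos = -1;

    ListRawIter(String[] keys) { this.keys = keys; }

    public boolean next() { return ++pos < keys.length; }
    public String rawKey() { return keys[pos]; }
    public boolean isSameKey() { return pos > 0 && keys[pos].equals(keys[pos - 1]); }
}

/** Deserializes a key only when the iterator reports that the key changed. */
final class KeyReader {
    int deserializations = 0;

    List<String> readKeys(RawIter it) {
        List<String> groups = new ArrayList<>();
        while (it.next()) {
            if (!it.isSameKey()) {
                groups.add(deserialize(it.rawKey()));
            }
        }
        return groups;
    }

    // Stand-in for the costly step; here it only counts invocations.
    private String deserialize(String raw) {
        deserializations++;
        return raw;
    }
}

final class SameKeyDemo {
    public static void main(String[] args) {
        KeyReader reader = new KeyReader();
        List<String> groups =
            reader.readKeys(new ListRawIter(new String[] {"a", "a", "b", "b", "b", "c"}));
        System.out.println("groups=" + groups + " deserializations=" + reader.deserializations);
    }
}
```

With six records over three distinct keys, the sketch deserializes three times
instead of six. A test along these lines in TestValuesIterator could feed runs of
repeated keys and check that the grouping is unchanged.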
> Reduce deserialize cost in ValuesIterator
> -----------------------------------------
>
> Key: TEZ-1913
> URL: https://issues.apache.org/jira/browse/TEZ-1913
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Labels: perfomance
> Attachments: TEZ-1913.1.patch
>
>
> When TezRawKeyValueIterator->isSameKey() is added, it should be possible to
> reduce the number of deserializations in ValuesIterator->readNextKey().
> Creating this ticket to track the issue.