[ 
https://issues.apache.org/jira/browse/TEZ-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956459#comment-14956459
 ] 

Siddharth Seth edited comment on TEZ-1692 at 10/14/15 8:02 AM:
---------------------------------------------------------------

Test failures are unrelated.

bq. Memory wise I think we should be at parity with the latest patch. Likely 
also with cpu. But like I said, this can only be measured.
Trying to measure patches for memory/cpu/perf without any solid basis will end 
up wasting time for everyone. This patch adds wrappers around existing code. 
Since there are no specific suspicions of a memory/cpu increase, I don't think a 
perf test is required.
That said, I got some numbers as part of testing TEZ-2879, and there's no 
noticeable difference in runtime.

If there are no other concerns, I'll commit the patch and get TEZ-2879 moving.

A test with random splits is the simplest way to measure something like this - 
I have created TEZ-2892 for a grouping micro benchmark. I'm sure there are other 
cases for such tests as well.


was (Author: sseth):
Test failures are unrelated.

bq. Memory wise I think we should be at parity with the latest patch. Likely 
also with cpu. But like I said, this can only be measured.
Trying to measure patches for memory/cpu/perf without any solid basis will end 
up wasting time for everyone. This patch adds wrappers around existing code. 
Since there are no specific suspicions of a memory/cpu increase, I don't think a 
perf test is required.
That said, I got some numbers as part of testing TEZ-2892, and there's no 
noticeable difference in runtime.

If there are no other concerns, I'll commit the patch and get 2892 moving.

A test with random splits is the simplest way to measure something like this - 
I have created TEZ-2892 for a grouping micro benchmark. I'm sure there are other 
cases for such tests as well.

> Reduce code duplication between TezMapredSplitsGrouper and 
> TezMapreduceSplitsGrouper
> ------------------------------------------------------------------------------------
>
>                 Key: TEZ-1692
>                 URL: https://issues.apache.org/jira/browse/TEZ-1692
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-1692.1.txt, TEZ-1692.2.txt, TEZ-1692.3.txt
>
>
> The two are almost identical - with lots of repeated logic. The main 
> difference being the mapred / mapreduce InputSplit being grouped.



