[ 
https://issues.apache.org/jira/browse/TEZ-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438516#comment-15438516
 ] 

Rajesh Balamohan edited comment on TEZ-3317 at 8/26/16 9:06 AM:
----------------------------------------------------------------

Thanks for sharing the patch. This can't be directly checked with an app like 
Hive, as Hive uses its own processor.

A couple of minor comments:
1. Can you please remove the {{e.printStackTrace()}} statements?
2. In OrderedGroupedKVInput, is it possible to move {{shuffledInputs, 
shuffledBytes}} to avoid the lookup call, as getProgress could be in the hot 
path?
3. In OrderedGroupedKVInput, progress is split into two halves (0.5f each). Was 
this added to account for skew?
4. SimpleProcessor: wouldn't a 50ms refresh time for scheduling be too 
aggressive? Is it possible to increase it? Same in SleepProcessor (but there it 
might be OK, as it is in examples).
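
To make comments 2 and 3 concrete, here is a minimal sketch of what is being asked. It is not taken from the patch: the class name {{ProgressSketch}}, its methods, and the choice of "shuffle" and "merge" as the two halves are assumptions made for illustration only. It reads comment 2 as "keep the counter in a field so getProgress does not need a lookup on every call", and shows the 0.5f weighting from comment 3.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch only; this is not the Tez OrderedGroupedKVInput code.
 * All names here are invented for illustration.
 */
public class ProgressSketch {
    // Number of upstream inputs expected to be shuffled.
    private final int totalInputs;

    // Held as a field so getProgress() reads it directly instead of
    // resolving a counter by name on every call (the "hot path" concern).
    private final AtomicInteger shuffledInputs = new AtomicInteger();

    // Progress of the second phase, in [0, 1]. Assumed here to be a merge
    // phase; the original comment does not say what the two halves are.
    private volatile float mergeProgress;

    public ProgressSketch(int totalInputs) {
        if (totalInputs <= 0) {
            throw new IllegalArgumentException("totalInputs must be positive");
        }
        this.totalInputs = totalInputs;
    }

    /** Called once per input that has finished shuffling. */
    public void inputShuffled() {
        shuffledInputs.incrementAndGet();
    }

    /** Records second-phase progress, clamped to [0, 1]. */
    public void setMergeProgress(float progress) {
        mergeProgress = Math.max(0f, Math.min(1f, progress));
    }

    /** Each phase contributes at most half of the reported progress. */
    public float getProgress() {
        float shuffle = Math.min(1f, (float) shuffledInputs.get() / totalInputs);
        return 0.5f * shuffle + 0.5f * mergeProgress;
    }
}
```

With four inputs, two shuffled, and no second-phase progress, this reports 0.25. Whether an even split is right when the phases are skewed is exactly the open question in comment 3.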


was (Author: rajesh.balamohan):
Thanks for sharing the patch. I tried with a couple of Hive jobs and did not 
find much increase in CPU usage or overall response time (for overhead). I will 
try with a job with a complex DAG soon. 

A couple of minor comments:
1. Can you please remove the {{e.printStackTrace()}} statements?
2. In OrderedGroupedKVInput, is it possible to move {{shuffledInputs, 
shuffledBytes}} to avoid the lookup call, as getProgress could be in the hot 
path?
3. In OrderedGroupedKVInput, progress is split into two halves (0.5f each). Was 
this added to account for skew?
4. SimpleProcessor: wouldn't a 50ms refresh time for scheduling be too 
aggressive? Is it possible to increase it? Same in SleepProcessor (but there it 
might be OK, as it is in examples).

> Speculative execution starts too early due to 0 progress
> --------------------------------------------------------
>
>                 Key: TEZ-3317
>                 URL: https://issues.apache.org/jira/browse/TEZ-3317
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jonathan Eagles
>            Assignee: Kuhu Shukla
>         Attachments: TEZ-3317.001.patch, TEZ-3317.002.patch, 
> TEZ-3317.003.patch, TEZ-3317.004.patch, TEZ-3317.005.patch, TEZ-3317.006.patch
>
>
> Don't know at this point if this is a Tez or a PigProcessor issue. There is 
> some setProgress chain that is keeping task progress from being correctly 
> reported. Task status is always zero, so as soon as the first task finishes, 
> tasks up to the speculation limit are always launched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
