[
https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-2029:
--------------------------------
Fix Version/s: (was: 0.8.1)
Assignee: Richard Ding
I do not believe this is a P1 issue, so I don't think it belongs on the 0.8
branch. Even for 0.9 I do not see it as a blocker. If we can find a quick
reproducible case, we will fix it in 0.9; otherwise we will delay it until we
can reproduce it. Also, this could be an issue with Hadoop.
> Inconsistency in Pig Stats reports
> -----------------------------------
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.1, 0.9.0
> Reporter: Viraj Bhat
> Assignee: Richard Ding
> Fix For: 0.9.0
>
>
> I have a Pig script that reports varying stats for the same M/R job (same
> inputs). Sometimes PigStats reports all the stats (Maps, Reduces,
> MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime, and
> AvgReduceTime) for the M/R job as 0; sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. In Run 1, job_201103091134_556600
> has 0 in all the columns, whereas in Run 2, Hadoop job
> job_201104272229_75693 has some valid values.
> The actual JobTracker link shows that the stats are non-empty. This points
> to a bug in the interaction of the PigStats module with the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MaxReduceTime MinReduceTime AvgReduceTime Alias Feature Outputs
> job_201103091134_556458 160 100 552 191 368 1257 371 392 IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P DISTINCT,MULTI_QUERY
> job_201103091134_556600 0 0 0 0 0 0 0 0 UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,,
> job_201103091134_556601 7 100 17 8 14 200 15 27 CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
> job_201103091134_556602 0 0 0 0 0 0 0 0 CNJOIN3,GNJOIN3,sampleNJOIN3 GROUP_BY,COMBINER
> job_201103091134_556603 0 0 0 0 0 0 0 0 CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
> job_201103091134_556604 2 100 13 7 10 34 13 31 CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
> job_201103091134_556644 0 0 0 0 0 0 0 0 ONJOIN15 SAMPLER
> job_201103091134_556645 0 0 0 0 0 0 0 0 ONJOIN25 SAMPLER
> job_201103091134_556646 0 0 0 0 0 0 0 0 ONJOIN3 SAMPLER
> job_201103091134_556654 0 0 0 0 0 0 0 0 ONJOIN19 SAMPLER
> job_201103091134_556662 0 0 0 0 0 0 0 0 ONJOIN19 ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MaxReduceTime MinReduceTime AvgReduceTime Alias Feature Outputs
> job_201104272229_75503 159 100 484 192 353 396 308 321 IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P DISTINCT,MULTI_QUERY
> job_201104272229_75693 18 0 31 14 24 0 0 UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,
> job_201104272229_75694 7 100 34 13 22 46 20 25 CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
> job_201104272229_75695 125 100 19 11 15 32 18 26 CNJOIN3,GNJOIN3,sampleNJOIN3 GROUP_BY,COMBINER
> job_201104272229_75698 1 100 12 12 12 13 9 11 CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
> job_201104272229_75702 2 100 21 5 13 35 22 26 CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
> job_201104272229_75724 1 1 4 4 4 11 11 11 ONJOIN15 SAMPLER
> job_201104272229_75725 0 0 0 0 0 0 0 ONJOIN25 SAMPLER
> job_201104272229_75726 6 1 8 6 8 24 24 24 ONJOIN3 SAMPLER
> job_201104272229_75729 0 0 0 0 0 0 0 ONJOIN19 SAMPLER
> job_201104272229_75752 1 100 5 5 5 12 9 11 ONJOIN19 ORDER_BY,COMBINER
> ..
> {quote}
> Viraj
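
When hunting for a reproducible case, it may help to flag the jobs whose
stats are all zero in a captured stderr log. The following is a minimal Python
sketch and is not part of Pig: the function names `parse_job_stats` and
`all_zero_jobs` are hypothetical, and it assumes rows look like the Job Stats
tables quoted above (a `job_...` ID followed by integer columns, possibly
wrapped across lines and prefixed with `>`).

```python
import re

# Assumption: job IDs look like job_<timestamp>_<sequence>, as in the logs above.
JOB_ID = re.compile(r"^job_\d+_\d+$")


def parse_job_stats(text):
    """Return {job_id: [int, ...]} with the numeric columns of each job row."""
    # Drop mail-quote markers so that wrapped rows read as one token stream.
    tokens = [t for t in text.split() if t != ">"]
    stats = {}
    current = None
    for tok in tokens:
        if JOB_ID.match(tok):
            current = tok
            stats[current] = []
        elif current is not None and tok.isdigit():
            stats[current].append(int(tok))
        else:
            # The first non-numeric token (the alias) ends the numeric columns.
            current = None
    return stats


def all_zero_jobs(stats):
    """Return the job IDs whose numeric columns are all 0."""
    return [job for job, values in stats.items() if values and not any(values)]


# Two rows copied from Run 1 of the report, including the mail wrapping.
sample = """
> job_201103091134_556600 0 0 0 0 0 0
> 0 0 UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,,
> job_201103091134_556601 7 100 17 8 14 200
> 15 27 CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
"""

print(all_zero_jobs(parse_job_stats(sample)))
# prints ['job_201103091134_556600']
```

An all-zero row is only a candidate for the bug. Run 2 also contains all-zero
rows, so each flagged job still has to be compared against what the
JobTracker shows for it.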
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira