Sergey created PIG-3416:
---------------------------
Summary: Looks like MultiStore stucks
Key: PIG-3416
URL: https://issues.apache.org/jira/browse/PIG-3416
Project: Pig
Issue Type: Bug
Components: data
Affects Versions: 0.11
Reporter: Sergey
Hi, I've met strange problem. Maybe it's related to data. BUt I'm not sure. I'm
working with derivative in avro format so all "bad data" should be caught on
early stages.
My pig script worked to 2 days each hour (invoked using oozie coordinator).
Now it stucks. It always have one reducer which shows progess = 67.55%
I see in TT log that it does merge, sort, then starts reduce.
I do use custom UDF in my pig script.
I've added counters trying to debug the situation.My UDF works with bags.
Counter says that UDF worked fine because "Reduce input groups" = "invocation
times of UDF".
I even see counters of output:
{code}
Map-Reduce Framework
Combine input records 0
Combine output records 0
Reduce input groups 31 019
Reduce shuffle bytes 65 071 957
Reduce input records 96 727
Reduce output records 0
Spilled Records 0
CPU time spent (ms) 48 870
Physical memory (bytes) snapshot 643 358 720
Virtual memory (bytes) snapshot 3 821 920 256
Total committed heap usage (bytes) 1 057 357 824
MarkEndPointsForCurrentHour
callTimes 31 019
totalExecutionTime 13 690
MultiStoreCounters
Output records in _0_08 26 862
Output records in _1_09 7 383
{code}
Counters say that all data passed through my UDF and even some output has been
written.
But reducer (always only 1 of 54 total reducers) stucks for 1 hour an then
killed by JT because of timeout. All other 53 reducers finished in 7 minutes.
How can I debug MultiStore?
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira