Sorry to spam, but I also forgot to mention versions of stuff. This fails in the same way on both:
Hadoop 2.4 / HBase 0.98 / Pig 0.15.1 and Hadoop 2.7 / HBase 1.1 / Pig 0.15.1

William Watson
Lead Software Engineer

On Thu, Mar 10, 2016 at 11:16 AM, Billy Watson <williamrwat...@gmail.com> wrote:

> Bah, I forgot to paste the pig script like an idiot:
>
> table1 = LOAD 'hbase://table1'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>         '',
>         '-loadKey -noWAL=true -minTimestamp=1451624400000 -maxTimestamp=1454302800000')
>     AS (uid:chararray);
>
> table2 = LOAD 'hbase://table2'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>         '',
>         '-loadKey -noWAL=true -regex=\\\\|ago=156\\\\| -minTimestamp=1451624400000 -maxTimestamp=1454302800000')
>     AS (uid:chararray);
>
> user_segment_with_event = JOIN table1 BY uid, table2 BY uid USING 'merge';
>
> -- fails with TableSplitComparable cannot be cast to TableSplit
> -- http://ip-10-0-1-180.ec2.internal:19888/jobhistory/logs/ip-10-0-1-14.ec2.internal:45454/container_e10_1457365475473_0248_01_000029/attempt_1457365475473_0248_r_000000_0/hadoop/syslog/?start=0
>
> ones = FOREACH user_segment_with_event GENERATE (int) 1 AS one:int;
> c = GROUP ones ALL;
> c = FOREACH c GENERATE COUNT(ones);
> dump c;
>
> William Watson
> Lead Software Engineer
>
> On Thu, Mar 10, 2016 at 11:11 AM, Billy Watson <williamrwat...@gmail.com> wrote:
>
>> Thanks to a bug fix put in by a colleague of mine, merge joins work for
>> tables loaded into Pig via HBaseStorage. In our test environment, and in
>> Pig's own test environment, I'm able to do all sorts of fairly complex
>> data merging without issue.
>>
>> However, when I use that same code on larger data sets in a production
>> environment, the merge join fails. If I run it on the exact same tables
>> on the same cluster after trimming the data down to just a few rows, the
>> merge join works fine.
>>
>> Here is the most basic version of the pig script I've been able to get.
>> I've been taking out pieces and parts trying to narrow it down, but it
>> still fails:
>>
>>
>> If I change the count portion to a limit 5 or something, I'm able to
>> dump the relation.
>>
>> The merge join finishes all of its mappers, but when it gets to the
>> reduce step and starts doing a sort (don't ask me why it's even doing a
>> sort on pre-sorted data), it throws the following error:
>>
>> 2016-03-09 19:36:01,738 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: Error while doing final merge
>>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:160)
>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.hbase.TableSplitComparable cannot be cast to org.apache.hadoop.hbase.mapreduce.TableSplit
>>     at org.apache.pig.backend.hadoop.hbase.TableSplitComparable.compareTo(TableSplitComparable.java:26)
>>     at org.apache.pig.data.DataType.compare(DataType.java:566)
>>     at org.apache.pig.data.DataType.compare(DataType.java:464)
>>     at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareDatum(BinInterSedes.java:1106)
>>     at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:1082)
>>     at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:787)
>>     at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:728)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTupleSortComparator.compare(PigTupleSortComparator.java:100)
>>     at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:587)
>>     at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:128)
>>     at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:55)
>>     at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:678)
>>     at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:596)
>>     at org.apache.hadoop.mapred.Merger.merge(Merger.java:131)
>>     at org.apache.hadoop.mapred.Merger.merge(Merger.java:115)
>>     at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:722)
>>     at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.close(MergeManagerImpl.java:370)
>>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:158)
>>
>> If I switch the order of the two relations in the merge join, I get a
>> different error, which looks more promising, but I still don't know
>> what to do about it:
>>
>> 2016-03-09 19:55:24,789 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: c: Local Rearrange[tuple]{chararray}(false) - scope-334 Operator Key: scope-334): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while executing ForEach at [c[62,4]]
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:316)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:291)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:279)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error while executing ForEach at [c[62,4]]
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>>     ... 12 more
>> Caused by: java.lang.NullPointerException
>>     at org.apache.pig.impl.builtin.DefaultIndexableLoader.seekNear(DefaultIndexableLoader.java:190)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.seekInRightStream(POMergeJoin.java:542)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNextTuple(POMergeJoin.java:299)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPreCombinerLocalRearrange.getNextTuple(POPreCombinerLocalRearrange.java:126)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:252)
>>
>> Again, I've tried replicating the exact scenario (and more complicated
>> ones) in local environments, and I can't get it to fail. I think it's
>> related to YARN/MapReduce, but I can't figure out why that would matter
>> or what it's really doing.
>>
>> I'm trying to set up the e2e (end-to-end) tests in the Pig repo, but I'm
>> not having any luck there, either. If I can't get a test failure, I'm
>> afraid I won't be able to fix the bug.
>>
>> Can anyone help point me in the right direction on next debugging steps
>> or what might be wrong?
>>
>> William Watson
>> Lead Software Engineer
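In case it helps whoever digs in: I think the first trace has the classic shape of a compareTo() that unconditionally casts its argument. Here's a toy standalone Java sketch of that failure shape. The class names are invented for illustration; this is not the actual Pig or HBase source, just the pattern the ClassCastException suggests.

```java
// Toy illustration only: these classes are invented and are NOT the real
// Pig/HBase sources. The failure shape: a compareTo() that unconditionally
// casts its argument throws ClassCastException as soon as the sort layer
// hands it a different type, which matches "TableSplitComparable cannot be
// cast to TableSplit".
public class CastDemo {

    // Stands in for the concrete split type the cast expects.
    static class FakeSplit implements Comparable<Object> {
        final int start;
        FakeSplit(int start) { this.start = start; }
        @Override
        public int compareTo(Object other) {
            // Unchecked cast: fine when 'other' really is a FakeSplit...
            return Integer.compare(start, ((FakeSplit) other).start);
        }
    }

    // Stands in for the wrapper type the sort layer actually passes around.
    static class FakeSplitWrapper implements Comparable<Object> {
        final FakeSplit wrapped;
        FakeSplitWrapper(FakeSplit wrapped) { this.wrapped = wrapped; }
        @Override
        public int compareTo(Object other) {
            // ...but when two wrappers are compared, 'other' is a wrapper,
            // so the inner cast to FakeSplit blows up.
            return wrapped.compareTo(other);
        }
    }

    // Returns "ok" or the name of the exception the comparison throws.
    static String tryCompare(Comparable<Object> a, Object b) {
        try {
            a.compareTo(b);
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompare(new FakeSplit(1), new FakeSplit(2)));   // ok
        System.out.println(tryCompare(
            new FakeSplitWrapper(new FakeSplit(1)),
            new FakeSplitWrapper(new FakeSplit(2))));                         // ClassCastException
    }
}
```

If that is what's happening, the fix direction would presumably be an instanceof check (or an unwrap) before the cast, but I haven't confirmed that's where the production code path differs from my local runs.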
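And a similarly toy sketch of the second failure shape: a seek into a sorted index where the lookup comes back empty and gets dereferenced anyway. Again, every name here is made up; this is not the actual DefaultIndexableLoader logic, just the pattern that NullPointerException from seekNear() suggests to me.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: invented names, not the real DefaultIndexableLoader.
// A merge join's right side seeks "near" a key in a sorted index of splits.
// If the index has no entry at or before the key (empty index, or a key that
// precedes every entry), floorEntry() returns null and a naive dereference
// throws NullPointerException -- the shape of the second stack trace.
public class SeekNearDemo {
    // key -> byte offset of the split that starts at that key
    static final TreeMap<String, Long> index = new TreeMap<>();

    // Naive seek: assumes a floor entry always exists.
    static long seekNearNaive(String key) {
        return index.floorEntry(key).getValue(); // NPE when no floor entry
    }

    // Defensive seek: fall back to the start of the stream.
    static long seekNearSafe(String key) {
        Map.Entry<String, Long> e = index.floorEntry(key);
        return e == null ? 0L : e.getValue();
    }

    public static void main(String[] args) {
        index.put("m", 100L);
        System.out.println(seekNearSafe("z")); // floor entry "m" found -> 100
        System.out.println(seekNearSafe("a")); // no floor entry -> fallback 0
        try {
            seekNearNaive("a");
        } catch (NullPointerException e) {
            System.out.println("NPE from naive seek");
        }
    }
}
```

That would also fit the symptom that trimmed-down tables work: with tiny data the key distribution (and hence the index contents) is completely different, so the empty-lookup case may simply never come up locally.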