Vincent, is the behavior random or the same each time?
A couple of things to narrow it down:

- attach the entire console output from the Pig run when this happened.
- only load start_sessions and end_sessions and store them.
- load the data stored in the previous step and run the same Pig commands on it.

Consider filing a JIRA; it might be a better place to go into more detail.

-Raghu

On Wed, Jul 27, 2011 at 5:38 AM, Vincent Barat <[email protected]> wrote:
> More info on this issue:
>
> 1- I use PIG 0.8.1 and HBase 0.90.3 and Hadoop 0.20-append
> 2- The issue can be reproduced with PIG trunk too
>
> The script:
>
> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp')
>     AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid')
>     AS (sid:chararray, end:long, locid:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
> sessions = LIMIT sessions 100;
> dump sessions;
> <output 1>
> dump sessions;
> <output 2>
>
> The issue:
>
> <output 1> is empty
> <output 2> is 100 lines
>
> I can reproduce the issue systematically.
>
> Please advise: this issue prevents me from moving to HBase 0.90.3 in
> production, as I need to upgrade to PIG 0.8.1 at the same time!
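
For reference, the two isolation steps suggested in the reply could look roughly like the sketch below. It reuses the LOAD statements from the quoted script unchanged; the output paths under /tmp/pig_debug/ are placeholders I made up, and PigStorage with its default tab delimiter is only an assumption about a convenient intermediate format. Run the two parts as separate scripts.

```pig
-- Part 1: read both HBase tables and store them as plain files,
-- so HBaseStorage is exercised without the JOIN/FILTER/LIMIT.
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid')
    AS (sid:chararray, end:long, locid:chararray);

-- Hypothetical output paths; pick any location that does not exist yet.
STORE start_sessions INTO '/tmp/pig_debug/start_sessions' USING PigStorage();
STORE end_sessions INTO '/tmp/pig_debug/end_sessions' USING PigStorage();

-- Part 2 (separate script): reload the stored files and run the same
-- commands, so HBase is no longer involved.
start_sessions = LOAD '/tmp/pig_debug/start_sessions' USING PigStorage()
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD '/tmp/pig_debug/end_sessions' USING PigStorage()
    AS (sid:chararray, end:long, locid:chararray);

sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
sessions = LIMIT sessions 100;

-- Dump twice, as in the original report, and compare the two outputs.
dump sessions;
dump sessions;
```

If the first dump is still empty in part 2, the problem is probably not specific to HBaseStorage; if both dumps return rows, the HBase loader becomes the more likely suspect.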
