Yes: if I remove the FILTER or the JOIN clause, the loading of data
works fine and consistently.
I will do more testing, but yes, I suspect the HBase loader is not
working correctly in my case...
The same query works perfectly with HBase 0.20.6 and PIG 0.6.1.
On 27/07/11 19:43, Thejas Nair wrote:
I looked at the query plan for the query using explain, and it
looks correct.
As you said, this is a simple use case, so I would be very surprised
if there is an optimizer bug here.
I suspect that something is wrong in loading the data from HBase.
Are you able to get a simple load-store script working consistently?
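For instance, a minimal load-store check might look like the sketch
below. It reuses the start-session table and columns from the script
quoted further down; the output path '/tmp/start_sessions_check' is a
hypothetical placeholder, and PigStorage is just one convenient choice
of store function.

```pig
-- Sketch only: load one HBase table and store it unchanged,
-- with no JOIN, FILTER or LIMIT involved.
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);

-- Hypothetical output path; use a fresh directory for each run.
STORE start_sessions INTO '/tmp/start_sessions_check' USING PigStorage('\t');
```

Running it twice, with a different output directory each time, and
comparing the row counts would show whether the load alone is
consistent.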
Thanks,
Thejas
On 7/27/11 7:31 AM, Vincent Barat wrote:
I built the pig trunk with hbase 0.90.3 client lib (ant
-Dhbase.version=0.90.3) and the issue is still here.
It makes me think of an issue in the optimizer... Anyway, the fact is
that my request is not complex, so I wonder how such an issue could get
through the PIG test suite!
Any help?
On 27/07/11 14:38, Vincent Barat wrote:
More info on this issue:
1- I use PIG 0.8.1 and HBase 0.90.3 and Hadoop 0.20-append
2- The issue can be reproduced with PIG trunk too
The script:
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'meta:sid meta:timestamp meta:locid')
    AS (sid:chararray, end:long, locid:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
sessions = LIMIT sessions 100;
dump sessions;
<output 1>
dump sessions;
<output 2>
The issue:
<output 1> is empty
<output 2> is 100 lines
I can reproduce the issue systematically.
Please advise: this issue prevents me from moving to HBase 0.90.3 in
production, as I need to upgrade to PIG 0.8.1 at the same time!