Hi Shahab,

It would be great if someone could delete this email from the PIG group. I am aware of the mistake and posted the issue to the HIVE group almost immediately.
Regards,
Gourav

On Mon, Jun 10, 2013 at 5:28 PM, Shahab Yunus <[email protected]> wrote:

> Gourav, this is not a HIVE mailing list. It is PIG's.
>
> Regards,
> Shahab
>
>
> On Mon, Jun 10, 2013 at 10:39 AM, Gourav Sengupta
> <[email protected]> wrote:
>
> > Hi,
> >
> > On running the following query I am getting multiple records with same
> > value of F1
> >
> > SELECT F1, COUNT(*)
> > FROM
> > (
> > SELECT F1, F2, COUNT(*)
> > FROM TABLE1
> > GROUP BY F1, F2
> > ) a
> > GROUP BY F1;
> >
> > As per what I understand there are multiple number of records based on
> > number of reducers.
> >
> > Replicating the test scenario:
> > STEP1: get the dataset as available in
> > http://snap.stanford.edu/data/amazon0302.html
> >
> > STEP2: Open the file and delete the heading
> >
> > STEP3: hadoop fs -mkdir /test
> >
> > STEP4: hadoop fs -put amazon0302.txt /test
> >
> > STEP5: create external table test (f1 int, f2 int) row format delimited
> > fields terminated by '\t' lines terminated by '\n' stored as textfile
> > location '/test';
> >
> > STEP6: create table test1 location '/test1' as select left_table.* from
> > (select * from test where f1<10000) left_table join (select * from test
> > where f1 < 10000) right_table;
> >
> > STEP7: hadoop fs -mkdir /test2
> >
> > STEP8: create table test2 location '/test2' as select f1, count(*) from
> > (select f1, f2, count(*) from test1 group by f1, f2) a group by f1;
> >
> > STEP9: select * from test2 where f1 = 9887;
> >
> > ENVIRONMENT:
> > HADOOP 2.0.4
> > HIVE 0.11
> >
> > Please do let me know whether I am doing anything wrong.
> >
> >
> > Thanks and Regards,
> > Gourav Sengupta
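
For reference, here is a minimal sketch of what the nested GROUP BY in STEP8 should return under standard SQL semantics: exactly one row per f1. It runs the same query shape through Python's built-in sqlite3 module rather than Hive, and the sample rows are made up for illustration, so it only shows the expected output shape and does not explain why Hive 0.11 produces duplicates.

```python
import sqlite3

# Hypothetical sample rows (f1, f2); the repeats mimic the duplicated
# rows that the unconstrained join in STEP6 would produce.
rows = [(1, 10), (1, 10), (1, 20), (2, 30), (2, 30), (3, 40)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test1 (f1 INTEGER, f2 INTEGER)")
conn.executemany("INSERT INTO test1 VALUES (?, ?)", rows)

# Same shape as the STEP8 query; the aliases c and n are added
# here only to make the result columns readable.
result = conn.execute(
    """
    SELECT f1, COUNT(*) AS n
    FROM (
        SELECT f1, f2, COUNT(*) AS c
        FROM test1
        GROUP BY f1, f2
    ) a
    GROUP BY f1
    ORDER BY f1
    """
).fetchall()
conn.close()

# Each f1 should appear exactly once, with n = number of distinct f2 values.
f1_values = [f1 for f1, _ in result]
print(result)
```

If the table written in STEP8 contains more than one row for a given f1 (for example f1 = 9887 in STEP9), the output differs from this expected shape.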
