Hello,

I am new to Hadoop and Pig and have been reading whatever I could lay my
hands on. If I need to sort a dataset using Pig, is the ORDER statement
alone sufficient?

For example, here is what I came up with to sort a dataset of users by
their login count:

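-- Load the input; each line holds a single username
records = LOAD 'input/sample.txt' AS (username:chararray);

-- Collect all records that share the same username
grpd = GROUP records BY username;

-- Count the records in each group
cntd = FOREACH grpd GENERATE
          group, COUNT(records) AS cnt;

-- Sort by count (ascending is the default)
srtd = ORDER cntd BY cnt;

STORE srtd INTO 'output';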

Is this sufficient to sort a dataset, or is there something else that needs
to be done? I read about the partition and combine steps used for sorting
when I was reading about MapReduce, which is why I am confused.

Any help is greatly appreciated.

Thanks
VJ
