Use ORDER if the data set is not too big. Or write an MR job with a single reducer. You can even try using the default mapper and reducer if there is no problem with the input format.
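
A rough sketch of the ORDER approach, in case it helps. The paths, the schema and the alias names are made up for illustration, and PARALLEL 1 is my assumption for how to get one reducer (and so one part file) out of the sort. The result goes to a new directory because STORE will not write into an existing one.

```pig
-- Hypothetical paths and schema; adjust to your data.
old_data = LOAD '/data/current' USING PigStorage('\t') AS (id:chararray, value:chararray);
new_data = LOAD '/data/incoming' USING PigStorage('\t') AS (id:chararray, value:chararray);

-- Combine the existing records with the newly arrived ones.
merged = UNION old_data, new_data;

-- PARALLEL 1 asks for a single reducer on the sort, so the output
-- directory should end up with a single part file.
sorted = ORDER merged BY id PARALLEL 1;

-- STORE fails if the target already exists, so write to a fresh directory.
STORE sorted INTO '/data/merged_new' USING PigStorage('\t');
```

Once the job has finished you could move the new directory into place of the old one, for example with hadoop fs -mv.
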
2013/7/18 Bhavesh Shah <[email protected]>

> Thanks Serega and Pradeep for your quick replies.
>
> Serega, as I am new to PIG, I didn't understand "Pig Script with one
> reduce action". Do you mean to write reduce action in Pig Latin or in some
> other language?
>
> - Bhavesh.
>
> > Date: Thu, 18 Jul 2013 16:03:54 +0400
> > Subject: Re: Want to add data in same file in Apache PIG?
> > From: [email protected]
> > To: [email protected]
> >
> > *merge* and sort them to only
> > one file on *local fs*. <src> is kept.
> > Are you sure that you want to merge several HDFS files into one LOCAL file?
> > Local file would be in your local file system.
> > The simplest way is to use union in pig and union existing files in HDFS with
> > new one generated by pig script.
> > The other way is to write pig script with one reduce action
> >
> > 2013/7/18 Bhavesh Shah <[email protected]>
> >
> > > Thanks for reply. :)
> > >
> > > I just came across one command -getmerge
> > >
> > > -getmerge <src> <localdst>: Get all the files in the directories that
> > > match the source file pattern and merge and sort them to only
> > > one file on local fs. <src> is kept.
> > >
> > > I am thinking if I STORE the data in some other file say TMP_Name
> > > and later if I use this command to dump the data in the required file.
> > >
> > > Is it possible to merge the data using this command in PIG? If yes, then
> > > is it good way to achieve my goal?
> > >
> > > Please let me know.
> > >
> > > Many Thanks,
> > > Bhavesh Shah
> > >
> > > > Date: Thu, 18 Jul 2013 15:49:47 +0400
> > > > Subject: Re: Want to add data in same file in Apache PIG?
> > > > From: [email protected]
> > > > To: [email protected]
> > > > CC: [email protected]
> > > >
> > > > it's not possible. It's HDFS.
> > > >
> > > > 2013/7/18 Bhavesh Shah <[email protected]>
> > > >
> > > > > Hello,
> > > > >
> > > > > Actually I have a use case in which I will receive the data from some
> > > > > source and I have to dump it in the same file after every regular interval
> > > > > and use that file for further operation. I tried to search on it, but I
> > > > > didn't see anything related to this.
> > > > >
> > > > > I am using STORE function, but STORE function always creates a new file with
> > > > > the specified name and gives an error if the specified file already exists.
> > > > > How should I store the data in the same file? Is it possible in Pig or is
> > > > > there some workaround for it?
> > > > > Please suggest me some solution over this.
> > > > >
> > > > > Thanks,
> > > > > Bhavesh Shah
