Sorry for re-forwarding this email. I was not sure whether it actually got through, since I only just received the confirmation of my membership on the mailing list.

Thanks,
Sunil.
On Tue, Apr 24, 2012 at 7:12 PM, Sunil S Nandihalli <sunil.nandiha...@gmail.com> wrote:

> Hi Everybody,
> I am a newbie to Hadoop. I have about 40K .tgz files, each approximately
> 3 MB. I would like to process them as if they were a single large file
> formed by
> "cat list-of-files | gnuparallel 'tar -Oxvf {} | sed 1d' > output.txt"
> How can I achieve this using hadoop-streaming or some other similar
> library?
>
> Thanks,
> Sunil.
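
For reference, one way the pipeline quoted above might be expressed as a hadoop-streaming style mapper is sketched below in Python. It is only a sketch under stated assumptions: the mapper's standard input is assumed to carry one .tgz path per line (the equivalent of list-of-files), and those paths are assumed to be readable from wherever the mapper runs. The function names (archive_lines, drop_first, main) are hypothetical and do not come from Hadoop.

```python
import sys
import tarfile


def archive_lines(path):
    """Yield each line (as bytes) of every regular file in one .tgz, in archive order."""
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-regular entries
            handle = archive.extractfile(member)
            for raw in handle:
                yield raw


def drop_first(lines):
    """Skip the first line produced for one archive, similar to `sed 1d`."""
    iterator = iter(lines)
    next(iterator, None)  # discard the first line if there is one
    return iterator


def main(stdin=None, stdout=None):
    """Read archive paths (assumed: one per line) and write the remaining lines."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout.buffer if stdout is None else stdout
    for line in stdin:
        path = line.strip()
        if not path:
            continue  # ignore blank input lines
        for raw in drop_first(archive_lines(path)):
            stdout.write(raw)
```

To use this as a script, main() would be called at the bottom of the file, and the script could then be passed to hadoop-streaming through its -mapper option with the list of paths as the -input. How the archives themselves become visible to each mapper (for example, a shared mount versus copying them out of HDFS first) is left open here.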