bzip2 support as of 0.19
(see JIRA HADOOP-3646),
and have splitting support working (see JIRA HADOOP-4012) as a patch.
Getting HADOOP-4012 committed has been painful,
but it seems close.
-John Heidemann
into multiple maps.
Thanks,
Ryan
Work is in progress to support splitting of .bz2 files.
See http://issues.apache.org/jira/browse/HADOOP-4012
I don't believe splitting of .tgz files is possible: something
compressed with gzip can only be uncompressed from the beginning.
-John Heidemann
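
A minimal sketch of the gzip point above, assuming Python's standard gzip and zlib modules rather than anything in Hadoop. The helper can_decompress_from and the sample records are made up for illustration. It starts one decoder at byte 0 of a gzip stream and another at the midpoint, and reports whether each reaches the end of the stream.

```python
import gzip
import zlib


def can_decompress_from(blob: bytes, offset: int) -> bool:
    """Return True if a gzip decoder started at `offset` decodes to end of stream."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        decoder.decompress(blob[offset:])
    except zlib.error:
        # No valid gzip header at this offset, or the stream is corrupt from here.
        return False
    return decoder.eof


# Hypothetical input: 10,000 short text records in one gzip stream.
payload = b"".join(b"record %d\n" % i for i in range(10000))
blob = gzip.compress(payload)

from_start = can_decompress_from(blob, 0)
from_middle = can_decompress_from(blob, len(blob) // 2)
print("from start:", from_start)
print("from middle:", from_middle)
```

A map task handed only the second half of such a file would be in the position of the second call.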
that actually does it?
Or are there instructions for poking around on the compute nodes' local
disks to assemble it by hand? Or better suggestions?
It would be a real boon for people developing map and reduce user code.
Thanks for any pointers.
-John Heidemann
On Thu, 07 Aug 2008 19:42:05 +0200, Leon Mergen wrote:
Hello John,
On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann wrote:
I have a large Hadoop streaming job that generally works fine,
but a few (2-4) of the ~3000 maps and reduces have problems.
To make matters worse
On Wed, 20 Feb 2008 12:10:09 PST, Ajay Anand wrote:
The registration page for the Hadoop summit is now up:
http://developer.yahoo.com/hadoop/summit/
...
Agenda:
Ajay, when we talked about the summit on the phone, you were considering
having a poster session. I don't see that listed. Should I