Re: Are .bz2 extensions supported in Hadoop 18.3

2009-06-24 Thread John Heidemann
bzip2 support as of 0.19 (see JIRA HADOOP-3646), and have splitting support working (see JIRA HADOOP-4012) as a patch. Getting HADOOP-4012 committed has been painful, but it seems close. -John Heidemann

Re: Hadoop and .tgz files

2008-12-02 Thread John Heidemann
into multiple maps. Thanks, Ryan Work is in progress to support splitting of .bz2 files. See http://issues.apache.org/jira/browse/HADOOP-4012 I don't believe splitting of .tgz files is possible; something compressed with gzip can only be uncompressed from the beginning. -John Heidemann
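
As a rough illustration of the gzip point above (not part of the original thread), the Python sketch below compresses some made-up records and then tries to decompress starting from the middle of the stream. The helper name `recovers_data` and the sample data are hypothetical. A gzip member wraps a single DEFLATE stream whose back-references point into earlier output, so a reader dropped at an arbitrary byte offset has neither a header nor the history it needs.

```python
import gzip
import zlib


def recovers_data(blob: bytes, offset: int, original: bytes) -> bool:
    """Return True if decompressing blob[offset:] yields a non-empty tail of original.

    Hypothetical helper for illustration only.
    """
    try:
        out = gzip.decompress(blob[offset:])
    except (OSError, EOFError, zlib.error):
        # No gzip header at this offset, or the stream is unreadable from here.
        return False
    return len(out) > 0 and original.endswith(out)


# Made-up sample input: many short text records, as a map task might read.
original = b"".join(b"record %d\n" % i for i in range(20000))
blob = gzip.compress(original)

# Reading from the start of the stream recovers the data.
print("from start: ", recovers_data(blob, 0, original))

# Reading from the midpoint, as a second split would have to, does not.
print("from middle:", recovers_data(blob, len(blob) // 2, original))
```

This only shows the naive case of seeking to a byte offset in a gzip file. It says nothing about how the .bz2 splitting work tracked in HADOOP-4012 is implemented.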

extracting input to a task from a (streaming) job?

2008-08-07 Thread John Heidemann
that actually does it? Or are there instructions for poking around on the compute nodes' local disks to assemble it by hand? Or better suggestions? It would be a real boon for people developing map and reduce user code. Thanks for any pointers. -John Heidemann

Re: extracting input to a task from a (streaming) job?

2008-08-07 Thread John Heidemann
On Thu, 07 Aug 2008 19:42:05 +0200, Leon Mergen wrote: Hello John, On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann [EMAIL PROTECTED] wrote: I have a large Hadoop streaming job that generally works fine, but a few (2-4) of the ~3000 maps and reduces have problems. To make matters worse

Re: Hadoop summit / workshop at Yahoo!

2008-02-21 Thread John Heidemann
On Wed, 20 Feb 2008 12:10:09 PST, Ajay Anand wrote: The registration page for the Hadoop summit is now up: http://developer.yahoo.com/hadoop/summit/ ... Agenda: Ajay, when we talked about the summit on the phone, you were considering having a poster session. I don't see that listed. Should I