DataXceiver WRITE_BLOCK: Premature EOF from inputStream: Using Avro Multiple Outputs

2015-07-04 Thread ed
Hello, we are running a job that makes use of Avro Multiple Outputs (Avro 1.7.5). When there are lots of output files the job fails, and I believe the following error is the cause: hc1hdfs2p.thecarlylegroup.local:50010:DataXceiverServer: java.io.IOException: Xceiver count
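The "Xceiver count" error means the DataNode hit its ceiling on concurrent block-transfer threads, which a job holding many AvroMultipleOutputs files open at once can exhaust. A hedged hdfs-site.xml sketch of the usual remedy, assuming the 2.x-era config key (older releases spell it dfs.datanode.max.xcievers; 8192 is an illustrative value, not a recommendation from the thread):

    <!-- hdfs-site.xml on each DataNode; restart DataNodes after changing. -->
    <property>
      <name>dfs.datanode.max.transfer.threads</name>
      <value>8192</value>
    </property>

Reducing how many files each task holds open at once (e.g. fewer named outputs per task) attacks the same problem from the client side.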

Job fails while re-attempting the task in multiple outputs case

2013-12-30 Thread AnilKumar B
Hi, I am using multiple outputs in our job. So whenever any reduce task fails, all its subsequent task attempts fail with a file-exists exception. The output file name should also include the task attempt, right? But it's only appending the task ID. Is this a bug, or is something wrong with my

Re: Job fails while re-attempting the task in multiple outputs case

2013-12-30 Thread Harsh J
On Mon, Dec 30, 2013 at 4:22 PM, AnilKumar B akumarb2...@gmail.com wrote: Hi, I am using multiple outputs in our job. So whenever any reduce task fails, all its subsequent task attempts fail with a file-exists exception. The output file name should also include the task attempt, right? But it's only appending

Re: Job fails while re-attempting the task in multiple outputs case

2013-12-30 Thread AnilKumar B
-to_hdfs_files_directly_from_map.2Freduce_tasks.3F On Mon, Dec 30, 2013 at 4:22 PM, AnilKumar B akumarb2...@gmail.com wrote: Hi, I am using multiple outputs in our job. So whenever any reduce task fails, all its subsequent task attempts fail with a file-exists exception. The output file

Re: Job fails while re-attempting the task in multiple outputs case

2013-12-30 Thread Jiayu Ji
.2Freduce_tasks.3F On Mon, Dec 30, 2013 at 4:22 PM, AnilKumar B akumarb2...@gmail.com wrote: Hi, I am using multiple outputs in our job. So whenever any reduce task fails, all its subsequent task attempts fail with a file-exists exception. The output file name should also include
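The FAQ entry Harsh links to comes down to this: write side files under the task attempt's work output directory rather than the job output directory, so each attempt gets private scratch space and the OutputCommitter promotes files only when the attempt succeeds. A minimal sketch under that reading (class and file names are illustrative, not from the thread):

    // Hypothetical reducer writing a side file per key into the
    // attempt-private work directory, so a re-attempt never collides
    // with files left behind by a failed attempt.
    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SideFileReducer extends Reducer<Text, Text, Text, Text> {
      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        // Per-attempt scratch dir; the OutputCommitter promotes it to the
        // real output directory only if this attempt succeeds.
        Path workDir = FileOutputFormat.getWorkOutputPath(context);
        Path sideFile = new Path(workDir, "side-" + key.toString());
        FileSystem fs = sideFile.getFileSystem(context.getConfiguration());
        try (FSDataOutputStream out = fs.create(sideFile, false)) {
          for (Text v : values) {
            out.writeBytes(v.toString() + "\n");
          }
        }
      }
    }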

Re: Real Multiple Outputs for Hadoop -- is this implementation correct?

2013-09-13 Thread Harsh J
this functionality, but otherwise I was happy writing plain M/R jars. I wrote up the implementation here: https://github.com/paulhoule/infovore/wiki/Real-Multiple-Outputs-in-Hadoop And this works hand-in-hand with an abstraction layer that supports unit testing w/ Mockito https://github.com

Real Multiple Outputs for Hadoop -- is this implementation correct?

2013-09-13 Thread Paul Houle
if I want this functionality, but otherwise I was happy writing plain M/R jars. I wrote up the implementation here: https://github.com/paulhoule/infovore/wiki/Real-Multiple-Outputs-in-Hadoop And this works hand-in-hand with an abstraction layer that supports unit testing w/ Mockito https

Re: Multiple outputs

2013-03-18 Thread Harsh J
MultipleOutputs is the way to go :) On Tue, Mar 12, 2013 at 12:48 PM, Fatih Haltas fatih.hal...@nyu.edu wrote: Hi Everyone, I would like to have 2 different outputs (containing different columns of the same input text file). When I googled a bit, I found the MultipleOutputs classes; is this the common

Multiple outputs

2013-03-12 Thread Fatih Haltas
Hi Everyone, I would like to have 2 different outputs (containing different columns of the same input text file). When I googled a bit, I found the MultipleOutputs classes; is this the common way of doing it, or is there a way to create a context-iterable kind of thing / a context array / is it possible to
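A minimal sketch of the MultipleOutputs route Harsh endorses, splitting each line's columns across two named outputs; the names "left" and "right" and the tab-separated column layout are assumptions for illustration:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class ColumnSplitMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {
      private MultipleOutputs<NullWritable, Text> mos;

      @Override
      protected void setup(Context context) {
        mos = new MultipleOutputs<NullWritable, Text>(context);
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        String[] cols = value.toString().split("\t");
        // Columns 0-1 go to one named output, columns 2-3 to the other.
        mos.write("left", NullWritable.get(), new Text(cols[0] + "\t" + cols[1]));
        mos.write("right", NullWritable.get(), new Text(cols[2] + "\t" + cols[3]));
      }

      @Override
      protected void cleanup(Context context)
          throws IOException, InterruptedException {
        mos.close(); // flushes both named outputs
      }
    }

Each name must be registered in the driver before submission, e.g. MultipleOutputs.addNamedOutput(job, "left", TextOutputFormat.class, NullWritable.class, Text.class); the records then land in files named left-m-NNNNN and right-m-NNNNN alongside the regular part files.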

Multiple inputs and Multiple outputs give IOException

2012-02-07 Thread Tarjei Huse
Hi, I'm writing an MR job that takes a set of SequenceFiles, extracts a new key, outputs the key + value to the reducer, and then the reducer writes the value to a set of sequence files based on the key. This job works perfectly if I run it with one SequenceFile, but fails if I run it with more
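A hedged sketch of the driver shape described here, with several SequenceFile inputs wired through MultipleInputs into one job (assuming a 2.x-style driver); the identity Mapper and Reducer stand in for the real key-extracting classes, and paths and types are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class MultiInDriver {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "multi-in-multi-out");
        job.setJarByClass(MultiInDriver.class);

        // One line per input set; each path may use its own mapper class.
        MultipleInputs.addInputPath(job, new Path(args[0]),
            SequenceFileInputFormat.class, Mapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]),
            SequenceFileInputFormat.class, Mapper.class);

        job.setReducerClass(Reducer.class); // substitute the keyed writer
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(Text.class);   // must match the real types
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }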

Re: Multiple inputs and Multiple outputs give IOException

2012-02-07 Thread Harsh J
Are your DataNodes all up? Do regular jobs like Hadoop's wordcount example program work fine? On Tue, Feb 7, 2012 at 7:36 PM, Tarjei Huse tar...@scanmine.com wrote: Hi, I'm writing an MR job that takes a set of SequenceFiles, extracts a new key, outputs the key + value to the reducer, and then

Re: Multiple inputs and Multiple outputs give IOException

2012-02-07 Thread Tarjei Huse
On 02/07/2012 03:14 PM, Harsh J wrote: Are your DataNodes all up? Do regular jobs like Hadoop's wordcount example program work fine? Yes. Various other jobs work and I've used Sqoop to export the files I try to import. I've tried reformatting the FS and resetting the whole install multiple

Re: multiple outputs

2010-06-08 Thread Torsten Curdt
Can MultipleOutputs also be used inside a mapper? So basically I pipe data into different reducers from the mapper. Of course I could do two separate jobs, but that would be very inefficient as I would have to go/read through all the data twice. cheers -- Torsten On Tue, Jun 8, 2010 at 06:22,

Re: multiple outputs

2010-06-08 Thread Amareshwari Sri Ramadasu
Yes. They can be used inside a mapper also. See org.apache.hadoop.mapred.lib.TestMultipleOutputs.java or org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs for some sample code. Thanks Amareshwari On 6/9/10 5:57 AM, Torsten Curdt tcu...@vafer.org wrote: Can MultipleOutputs

Re: multiple outputs

2010-06-07 Thread Amareshwari Sri Ramadasu
MultipleOutputs was ported to the new API through http://issues.apache.org/jira/browse/MAPREDUCE-370 See the discussion on the JIRA and the javadoc/testcase for examples of how to use it. Thanks Amareshwari On 6/7/10 8:08 PM, Torsten Curdt tcu...@apache.org wrote: I need to emit to different output
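A minimal driver sketch for the new-API MultipleOutputs that MAPREDUCE-370 added; the named output "extra", the paths, and the identity mapper are illustrative stand-ins:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class NamedOutputDriver {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "named-outputs");
        job.setJarByClass(NamedOutputDriver.class);
        // Substitute a mapper (or reducer) that calls mos.write("extra", ...).
        job.setMapperClass(Mapper.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Register the side channel before submission; writes to "extra"
        // land in files named extra-m-NNNNN (extra-r-NNNNN from a reducer).
        MultipleOutputs.addNamedOutput(job, "extra",
            TextOutputFormat.class, NullWritable.class, Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }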

Map and reduce: Running multiple jobs and multiple outputs

2009-09-02 Thread ll_oz_ll
Hi, I'm a new Hadoop user and have been having difficulty adapting map and reduce to my needs. This is what I want to do: 1. Run multiple jobs chained to each other; the output of one map and reduce is the input to the one after it. 2. Get multiple outputs from each map and reduce job (I've
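For point 1, chaining needs no special API: run the first job to an intermediate directory, then hand that directory to the next job as its input. A hedged sketch with identity map/reduce and placeholder paths (assuming a 2.x-style driver); point 2 is what MultipleOutputs, covered in the threads above, addresses:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ChainDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of job 1, input of job 2
        Path output = new Path(args[2]);

        // Stage 1 (identity map/reduce here; set real classes as needed).
        Job first = Job.getInstance(conf, "stage-1");
        FileInputFormat.addInputPath(first, input);
        FileOutputFormat.setOutputPath(first, intermediate);
        if (!first.waitForCompletion(true)) System.exit(1);

        // Stage 2 reads what stage 1 wrote.
        Job second = Job.getInstance(conf, "stage-2");
        FileInputFormat.addInputPath(second, intermediate);
        FileOutputFormat.setOutputPath(second, output);
        System.exit(second.waitForCompletion(true) ? 0 : 1);
      }
    }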

Re: Multiple outputs and getmerge?

2009-04-21 Thread Todd Lipcon
On Mon, Apr 20, 2009 at 1:14 PM, Stuart White stuart.whi...@gmail.com wrote: Is this the best/only way to deal with this? It would be better if Hadoop offered the option of writing different outputs to different output directories, or if getmerge offered the ability to specify a file prefix

RE: Multiple outputs and getmerge?

2009-04-21 Thread Koji Noguchi
From: Stuart White [mailto:stuart.whi...@gmail.com] Sent: Monday, April 20, 2009 1:15 PM To: core-user@hadoop.apache.org Subject: Multiple outputs and getmerge? I've written an MR job with multiple outputs. The normal output goes to files named part-X, and my secondary output records go

Re: Multiple outputs and getmerge?

2009-04-21 Thread Stuart White
On Tue, Apr 21, 2009 at 12:06 PM, Todd Lipcon t...@cloudera.com wrote: Would dfs -cat do what you need? e.g.: ./bin/hdfs dfs -cat /path/to/output/ExceptionDocuments-m-\* > /tmp/exceptions-merged Yes, that would work. Thanks for the suggestion.
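The same merge can be done programmatically; a hedged sketch that globs the prefixed part files and concatenates them (paths follow the thread's example and the class name is made up; pre-3.x clients could use FileUtil.copyMerge instead when no prefix filter is needed):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class PrefixMerge {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path merged = new Path("/tmp/exceptions-merged");
        try (FSDataOutputStream out = fs.create(merged, true)) {
          // Glob only the files carrying the chosen prefix.
          FileStatus[] parts =
              fs.globStatus(new Path("/path/to/output/ExceptionDocuments-m-*"));
          if (parts != null) {
            for (FileStatus stat : parts) {
              try (FSDataInputStream in = fs.open(stat.getPath())) {
                // close=false: the try-with-resources blocks own the streams.
                IOUtils.copyBytes(in, out, conf, false);
              }
            }
          }
        }
      }
    }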

Re: Multiple outputs and getmerge?

2009-04-21 Thread Stuart White
On Tue, Apr 21, 2009 at 1:00 PM, Koji Noguchi knogu...@yahoo-inc.com wrote: I once used MultipleOutputFormat and created
    (mapred.work.output.dir)/type1/part-_
    (mapred.work.output.dir)/type2/part-_
    ...
And JobTracker took care of the renaming to
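A sketch of what Koji describes, under the old (mapred) API: subclass MultipleTextOutputFormat and prepend a per-type subdirectory to each record's leaf file name. The relative path is resolved under the task's work output directory, so the committer still promotes type1/ and type2/ on success; the key-to-type rule below is an invented example:

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

    public class TypedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
      @Override
      protected String generateFileNameForKeyValue(Text key, Text value,
                                                   String name) {
        // Route records into a subdirectory chosen from the key;
        // name is the default leaf, e.g. part-00000.
        String type = key.toString().startsWith("A") ? "type1" : "type2";
        return type + "/" + name;
      }
    }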

RE: Multiple outputs and getmerge?

2009-04-21 Thread Koji Noguchi
@hadoop.apache.org Subject: Re: Multiple outputs and getmerge? On Tue, Apr 21, 2009 at 1:00 PM, Koji Noguchi knogu...@yahoo-inc.com wrote: I once used MultipleOutputFormat and created
    (mapred.work.output.dir)/type1/part-_
    (mapred.work.output.dir)/type2/part-_
    ...
And JobTracker took

Multiple outputs and getmerge?

2009-04-20 Thread Stuart White
I've written an MR job with multiple outputs. The normal output goes to files named part-X, and my secondary output records go to files I've chosen to name ExceptionDocuments (and are therefore named ExceptionDocuments-m-X). I'd like to pull merged copies of these files to my local