We created a JIRA for this and provided a patch; please see
http://issues.apache.org/jira/browse/HADOOP-4614
I hope it'll make it into SVN soon (things have been kind of slow lately).
Are you able to create a reproducible setup for this? I haven't been
able to.
Yes, we did see consistent
On Wednesday 05 November 2008 15:27:34 Karl Anderson wrote:
I am running into a similar issue. It seems to be affected by the
number of simultaneous tasks.
For me, while I generally allow up to 4 mappers per node, in this particular
instance I had only one mapper reading from a single gzipped
Hi,
I'm running a current snapshot (-r709609), doing a simple word count in Python
over streaming. I have a relatively modest setup of 17 nodes.
I'm getting this exception:
java.io.FileNotFoundException:
Yuri Pradkin wrote:
I believe you should set keep.failed.task.files to true -- this way,
given a task id, you can see what input files it has in
~/taskTracker/${taskid}/work (source:
http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#IsolationRunner )
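For example (a sketch, not a command from this thread; the jar path and flags
vary by Hadoop version, and -jobconf was later superseded by -D), the property
can be set on the streaming command line, with IN, OUT, $MAPPER and $REDUCER
as placeholders:

hadoop jar hadoop-streaming.jar -jobconf keep.failed.task.files=true \
  -input IN -output OUT -mapper $MAPPER -reducer $REDUCER

With that set, a failed task's working directory is left on the tasktracker
node so its input files can be inspected.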
I forgot to add: I set
On Thursday 21 August 2008 00:14:56 Gopal Gandhi wrote:
I am using Hadoop streaming and I need to pass arguments to my map/reduce
script. Because a map/reduce script is triggered by Hadoop, like
hadoop -file $MAPPER -mapper $MAPPER -file $REDUCER -reducer $REDUCER ...
How can I pass
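One common approach (a sketch, not an answer quoted from this thread): quote
the script together with its arguments as a single -mapper string, or export
values into the scripts' environment with streaming's -cmdenv option; MY_PARAM
below is a made-up name:

hadoop -file $MAPPER -mapper "$MAPPER arg1 arg2" -cmdenv MY_PARAM=value ...

Inside the script, MY_PARAM then shows up as an ordinary environment variable,
e.g. os.environ['MY_PARAM'] in Python.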
On Thursday 07 August 2008 16:43:10 John Heidemann wrote:
On Thu, 07 Aug 2008 19:42:05 +0200, Leon Mergen wrote:
Hello John,
On Thu, Aug 7, 2008 at 6:30 PM, John Heidemann [EMAIL PROTECTED] wrote:
I have a large Hadoop streaming job that generally works fine,
but a few (2-4) of the ~3000
interface. The null pointer message in the secondary Namenode log
is a harmless bug, but it should be fixed. It would be nice if you could open
a JIRA for it.
Thanks,
Dhruba
-----Original Message-----
From: Yuri Pradkin [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 04, 2008 2:45 PM
To: core-user
On Tuesday 08 April 2008 11:54:35 am Konstantin Shvachko wrote:
If you have anything in mind that can be displayed on the UI, please let us
know. You can also find a JIRA for the issue; it would be good if this
discussion were reflected in it.
Well, I guess we could have an interface to browse the
Here is how we (attempt to) do it:
The reducer (in streaming) writes one file for each different key it receives
as input.
Here's some example code in Perl:
# mapred_output_dir is exported to each task by the streaming framework
my $envdir = $ENV{'mapred_output_dir'};
# strip a leading "file:" scheme; if present, output goes to a local/NFS path
my $fs = ($envdir =~ s/^file://);
if ($fs) {
  # output goes onto NFS
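For comparison, a minimal Python sketch of the same idea -- an illustration
under stated assumptions, not code from this thread: it assumes the usual
streaming reducer contract (key-sorted, tab-separated lines on stdin) plus the
mapred_output_dir variable shown above, and the per-key file naming is made up:

import os
import sys

out_dir = os.environ['mapred_output_dir']
if out_dir.startswith('file:'):  # same "file:" check as in the Perl above
    out_dir = out_dir[len('file:'):]

current_key, out = None, None
for line in sys.stdin:
    key, _, value = line.rstrip('\n').partition('\t')
    if key != current_key:  # keys arrive sorted, so each key is contiguous
        if out is not None:
            out.close()
        # one file per distinct key (naming scheme is illustrative)
        out = open(os.path.join(out_dir, 'key-%s' % key), 'w')
        current_key = key
    out.write(value + '\n')
if out is not None:
    out.close()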
Hi,
I'm running Hadoop (latest snapshot) on several machines, and in our setup the
namenode and secondary namenode are on different systems. I see from the logs
that the secondary namenode regularly checkpoints the filesystem from the
primary namenode.
But when I go to the secondary namenode HTTP
Hi,
I'm relatively new to Hadoop, and I have what I hope is a simple
question:
I don't understand why the key/value assumption is preserved AFTER the
reduce operation; in other words, why is the output of a reducer
expected as key,value instead of arbitrary, possibly binary bytes?
Why can't