Hi,
Most of our map jobs are IO bound. However, for the same node, the IO
throughput during the map phase is only about 20% of its real sequential IO
capability (we measured the sequential IO throughput with iozone).
I think the reason is that while each map task issues sequential IO requests,
several map tasks read from the same disk concurrently, so the disk keeps
seeking between them and the aggregate access pattern becomes effectively
random.
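One hedged illustration of a knob related to this (the key name is the
0.18-era one; the value is only an example): lowering the per-TaskTracker
map slot count reduces how many streams compete for one disk.

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>The maximum number of map tasks that will be run
  simultaneously by a task tracker.</description>
</property>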
Is it possible to do that?
I can access files on HDFS by specifying the URI, as below:
FileSystem fileSys = FileSystem.get(new URI("hdfs://server:9000"), conf);
But I don't know how to do that for a JobConf.
Thanks,
-Songting
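A minimal sketch of one way to do this (assuming the 0.18-era JobConf API;
the class name JobFsExample and the address are illustrative):

import org.apache.hadoop.mapred.JobConf;

public class JobFsExample {
  public static JobConf configure() {
    // JobConf inherits set() from Configuration, so the target
    // filesystem can be chosen per job.
    JobConf conf = new JobConf(JobFsExample.class);
    // fs.default.name is the pre-0.20 key for the default filesystem URI.
    conf.set("fs.default.name", "hdfs://server:9000");
    return conf;
  }
}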
I am reading the performance counters and have the following question:
Map Input Bytes = 6.8G, while HDFS Bytes Read = 10G.
Where do the additional 3.2G come from?
Thanks
-Songting
Is there a way for the Map process to know it has reached the end of its
records? I need to flush some additional data at the end of the Map process,
but I am wondering where I should put that code.
Thanks,
-Songting
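A minimal sketch of one common pattern (assuming the old
org.apache.hadoop.mapred API; the names, key/value types, and the "total"
record are illustrative): keep a reference to the OutputCollector during
map() and emit the extra data from close(), which the framework calls once
after the last input record of the task.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class FlushingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private OutputCollector<Text, IntWritable> out; // saved for close()
  private int count = 0;                          // state to flush later

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    this.out = output; // remember the collector for use in close()
    count++;           // accumulate per-record state
  }

  // Called once after the last record has been passed to map().
  public void close() throws IOException {
    if (out != null) {
      out.collect(new Text("total"), new IntWritable(count));
    }
  }
}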
At 2:43 PM, Songting Chen wrote:
To summarize the slow shuffle issue:
1. I think one problem is that the Reducer starts very late in the process,
slowing the entire job significantly.
Is there a way to let the reducer start earlier?
http://issues.apache.org/jira/browse/HADOOP-3136
We encountered a bottleneck during the shuffle phase. However, there is not
much data to be shuffled across the network at all: less than 10 MB in total
(the combiner aggregated most of the data).
Are there any parameters or anything we can tune to improve the shuffle
performance?
Thanks,
...map outputs in memory must consume less than this threshold before
the reduce can begin.
</description>
</property>

How long did the shuffle take relative to the rest of the job?

Alex

On Fri, Dec 5, 2008 at 11:17 AM, Songting Chen [EMAIL PROTECTED] wrote:
We encountered a bottleneck during the shuffle phase. However ...
To: core-user@hadoop.apache.org
Date: Friday, December 5, 2008, 12:28 PM
How many reduce tasks do you have? Look into increasing
mapred.reduce.parallel.copies from the default of 5 to
something more like
20 or 30.
- Aaron
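For reference, a sketch of the corresponding hadoop-site.xml entry (the
value 20 just reflects the suggestion above, not a documented
recommendation):

<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>20</value>
  <description>The number of parallel transfers run by reduce during
  the copy (shuffle) phase.</description>
</property>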
On Fri, Dec 5, 2008 at 10:00 PM, Songting Chen [EMAIL PROTECTED] wrote:
A little ...
I think one of the issues is that the Reducer starts very late in the
process, slowing the entire job significantly.
Is there a way to let the reducer start earlier?
--- On Fri, 12/5/08, Songting Chen [EMAIL PROTECTED] wrote:
From: Songting Chen [EMAIL PROTECTED]
Subject: Re: slow shuffle
... puzzles me what's behind the scenes. (Note that sorting takes 1 sec.)
Thanks,
-Songting
--- On Fri, 12/5/08, Songting Chen [EMAIL PROTECTED] wrote:
From: Songting Chen [EMAIL PROTECTED]
Subject: Re: slow shuffle
To: core-user@hadoop.apache.org
Date: Friday, December 5, 2008, 1:27 PM
1. The namenode webpage shows:
Upgrades: Upgrade for version -18 has been completed.
Upgrade is not finalized.
2. SequenceFile.Writer failed when trying to create a new file with the
following error: (we have two Hadoop clusters; both have issue 1; one has
issue 2, but the other is ...
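(Assuming the standard HDFS upgrade flow applies here: the "Upgrade is not
finalized" banner normally remains until the upgrade is finalized
explicitly, e.g. with hadoop dfsadmin -finalizeUpgrade.)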
Hi,
I modified the classpath in hadoop-env.sh on the namenode and datanodes
before shutting down the cluster. Then a problem appeared: I could not stop
the Hadoop cluster at all. stop-all.sh reported no datanode/namenode, while
all the Java processes were still running.
So I manually killed the Java processes.
... to go away.
So basically my problem was fixed. I just hope my experience may help find
some potential bugs.
Thanks,
-Songting
--- On Mon, 10/27/08, Songting Chen [EMAIL PROTECTED] wrote:
From: Songting Chen [EMAIL PROTECTED]
Subject: namenode failure
To: core-user@hadoop.apache.org
Date: ...
It seems that I encountered a similar problem:
Zlib and lzo are installed.
Running ant -Dcompile.native=true gave the following error:
[exec] /server/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:
In function ...
To: core-user@hadoop.apache.org
Date: Friday, October 10, 2008, 7:44 AM
On 10/9/08 6:46 PM, Songting Chen [EMAIL PROTECTED] wrote:
Does that mean I have to rebuild the native library?
Also, the LZO installation puts liblzo2.a and liblzo2.la under
/usr/local/lib. There is no liblzo2.so there. Do I need to rename ...
I switched to the lzo-2.02 package. This time liblzo2.so was built, and
now everything works.
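(One hedged note: LZO's autoconf build produces only static libraries by
default, so configuring with --enable-shared is another way to get
liblzo2.so built.)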
Thanks,
-Songting
--- On Fri, 10/10/08, Songting Chen [EMAIL PROTECTED] wrote:
From: Songting Chen [EMAIL PROTECTED]
Subject: Re: How to make LZO work?
To: core-user@hadoop.apache.org
Date: Friday, ...
Hi,
I have installed lzo-2.03 to my Linux box.
But still my code for writing a SequenceFile using LzoCodec returns the
following error:
util.NativeCodeLoader: Loaded the native-hadoop library
java.lang.UnsatisfiedLinkError: Cannot load liblzo2.so!
What needs to be done to make this ...
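A minimal sketch of the kind of writer involved (assuming the 0.18-era API;
the path and key/value types are illustrative, and liblzo2.so must be
findable via java.library.path for the native codec to load):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.LzoCodec;

public class LzoSeqFileWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Block-compressed SequenceFile using the (native) LZO codec.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/tmp/test.seq"),
        Text.class, IntWritable.class,
        SequenceFile.CompressionType.BLOCK, new LzoCodec());
    writer.append(new Text("key"), new IntWritable(1));
    writer.close();
  }
}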
From: Arun C Murthy [EMAIL PROTECTED]
Subject: Re: How to make LZO work?
To: core-user@hadoop.apache.org
Date: Thursday, October 9, 2008, 6:35 PM
On Oct 9, 2008, at 5:58 PM, Songting Chen wrote:
Hi,
I have installed lzo-2.03 to my Linux box.
But still my code for writing a SequenceFile using ...
... datanodes, which could introduce significant network transfer cost.
Any ideas about that? Thanks,
-Songting Chen