Lots of Datanode SocketTimeoutException

2012-02-25 Thread Clay Chiang
Hi All,

We have an HDFS cluster with ~200 nodes, and for some reason it is
divided into 4 MR clusters that share the same HDFS.
Recently we have been seeing a lot of SocketTimeoutExceptions in the datanode
logs, such as:

2012-02-24 11:57:51,882 WARN  datanode.DataNode (DataXceiver.java:readBlock(236)) - DatanodeRegistration(.):Got exception while serving blk_-5205544551109548677_55590565 to /xx.xx.xx.xx:
java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/... remote=/...]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:214)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:114)

   This has actually been happening for months, but only recently have there
been lots of timeouts while waiting for the channel to be ready for *write*
(none for read), and the remote hosts all come from one of the MR clusters.
Monitoring that MR cluster in Ganglia shows no constant heavy load. I tested
the network by scp'ing a large file between hosts, and I don't think it's a
network problem.

I did some googling for this problem and found
https://issues.apache.org/jira/browse/HDFS-770 , which may be related but is
not resolved.

On some datanodes we also found "Out of socket memory" in dmesg. (Does it
hurt? Does it need some kernel tuning? uname -a:   Linux ...
2.6.18-194.el5 #1 SMP Fri Apr 2 14:58:14 EDT 2010 x86_64 x86_64 x86_64
GNU/Linux)

 If anybody has any idea about this, please help :) Thanks in advance.

-- 
Kindest Regards,
Clay Chiang


Re: MapReduce tuning

2012-02-25 Thread Jie Li
Hello Mohit,

I am looking at some hadoop tuning parameters like io.sort.mb,
 mapred.child.java.opts, etc.

- My question was where to look for the current settings


The default settings as well as the documentation can be found in the Hadoop
source directory:

src/mapred/mapred-default.xml
src/core/core-default.xml
src/hdfs/hdfs-default.xml
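
To see the values a job will actually pick up on your cluster, one quick option (a minimal sketch, not from this thread; the class name is just a placeholder) is to instantiate a JobConf, which layers the *-default.xml files with any *-site.xml overrides found on the classpath, and print the parameters you care about:

    import org.apache.hadoop.mapred.JobConf;

    public class ShowCurrentSettings {
      public static void main(String[] args) {
        // JobConf loads core-default.xml/mapred-default.xml plus the
        // core-site.xml/mapred-site.xml overrides found on the classpath
        JobConf conf = new JobConf();
        System.out.println("io.sort.mb = " + conf.get("io.sort.mb"));
        System.out.println("mapred.child.java.opts = " + conf.get("mapred.child.java.opts"));
      }
    }

The values finally used by a submitted job also show up in its job.xml, which is linked from the JobTracker web UI.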


 - Are these settings configured cluster wide or per job?


Some settings are configured cluster-wide, e.g. the number of map/reduce
slots per node, while others are configured per job, e.g. io.sort.mb. It
depends on the functionality of the specific parameter.
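
As a concrete illustration (a minimal sketch with placeholder values, not from this thread), a per-job parameter is simply set on the job's configuration at submission time, while a cluster-wide parameter has to be changed in the site files on the nodes:

    import org.apache.hadoop.mapred.JobConf;

    public class PerJobTuning {
      public static JobConf buildConf() {
        JobConf conf = new JobConf();
        // per-job overrides: affect only this submission
        conf.setInt("io.sort.mb", 200);                     // map-side sort buffer, in MB
        conf.set("mapred.child.java.opts", "-Xmx1024m");    // heap for each task JVM
        // cluster-wide knobs such as mapred.tasktracker.map.tasks.maximum
        // (map slots per node) are read from each TaskTracker's mapred-site.xml,
        // so setting them here has no effect
        return conf;
      }
    }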


 - What's the best way to look at the reasons for slow performance?


Well, I want to introduce Starfish to you. Starfish is a self-tuning system
built on Hadoop to provide good performance automatically, without any need
for users to understand and manipulate the many tuning knobs in Hadoop.

With Starfish, you can analyze the performance of your Hadoop job at a
fine-grained level, e.g. the time for map processing, spilling, merging,
shuffling, sorting, and reduce processing, so you can understand which
part is the performance bottleneck.

You can also ask what-if questions, e.g. "What if I double io.sort.mb?",
and Starfish will predict the new behaviour of the job, so you can better
understand how these parameters work.  In addition, you can simply let
Starfish find the optimal configuration for you to achieve the best
performance.

You are welcome to join our Google Group to discuss Starfish further; any
feedback will be appreciated. If you run into any problems, please don't
hesitate to let us know. The Group address is
http://groups.google.com/group/hadoop-starfish.

Thanks,
Jie
-
Starfish Group, Duke University
Starfish Homepage: www.cs.duke.edu/starfish/
Starfish Google Group: http://groups.google.com/group/hadoop-starfish


Re: Experience with Hadoop in production

2012-02-25 Thread Jie Li
Hi Pavel,

It seems your team has spent some time on performance and tuning issues. Just
wondering whether an automatic Hadoop tuning tool like Starfish would be of
interest to you. We'd like to exchange tuning experience with you.

Thanks,
Jie

Starfish Group, Duke University
Starfish Homepage: www.cs.duke.edu/starfish/
Starfish Google Group: http://groups.google.com/group/hadoop-starfish

On Thu, Feb 23, 2012 at 1:17 PM, Pavel Frolov pfro...@gmail.com wrote:

 Hi,

 We are going into 24x7 production soon and we are considering whether we
 need vendor support or not.  We use a free vendor distribution of Cluster
 Provisioning + Hadoop + HBase and looked at their Enterprise version but it
 is very expensive for the value it provides (additional functionality +
 support), given that we’ve already ironed out many of our performance and
 tuning issues on our own and with generous help from the community (e.g.
 all of you).

 So I wanted to run this by the community to see if anybody can share
 their experience of running a Hadoop cluster (50+ nodes, with Apache
 releases or vendor distributions) in production with in-house support
 only, and how difficult it was, how many people were involved, etc.

 Regards,
 Pavel



Re: How to estimate hadoop?

2012-02-25 Thread Jie Li
Hi Jinyan,

I'd like to introduce to you our system, Starfish, which can be used to analyze
and estimate Hadoop performance and memory usage.

With Starfish, you can analyze the performance of your Hadoop job at a
fine-grained level, e.g. the time for map processing, spilling, merging,
shuffling, sorting, and reduce processing, so you can understand which
part is the performance bottleneck.

You can also ask what-if questions, e.g. "What if I double io.sort.mb?",
and Starfish will predict the new behaviour of the job, so you can better
understand how these parameters work and estimate the running time of the new job.

In addition, you can simply let Starfish find the optimal
configuration for you to achieve the best performance.

You are welcome to join our Google Group to discuss Starfish further; any
feedback will be appreciated. If you run into any problems, please don't
hesitate to let us know. The Group address is
http://groups.google.com/group/hadoop-starfish.

Thanks,
Jie

Starfish Group, Duke University
Starfish Homepage: www.cs.duke.edu/starfish/
Starfish Google Group: http://groups.google.com/group/hadoop-starfish


On Wed, Feb 15, 2012 at 11:26 PM, Srinivas Surasani vas...@gmail.com wrote:

 Hey,

 It completely depends on your data sizes and processing. You can have
 anything from a one-node cluster to thousands of nodes (or more), depending
 on your needs. The following link may help you.

 http://wiki.apache.org/hadoop/HardwareBenchmarks

 Regards,


 On Wed, Feb 15, 2012 at 10:17 PM, Jinyan Xu xxjjyy2...@gmail.com wrote:
  Hi all,
 
  I want to use the hadoop system, but I need overall system info about
  hadoop, for example
  system performance, memory used, CPU utilization and so on.  So does anyone
  have a system estimate for hadoop? Which tool can do this?
 
  yours
  rock



 --
 -- Srinivas
 srini...@cloudwick.com




Re: MapReduce tuning

2012-02-25 Thread sriramsrao
Use a search engine to find the Hadoop best practices blog by Arun Murthy.

Sriram

On Feb 24, 2012, at 10:36 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 I am looking at some hadoop tuning parameters like io.sort.mb,
 mapred.child.java.opts, etc.
 
 - My question was where to look for the current settings
 - Are these settings configured cluster wide or per job?
 - What's the best way to look at the reasons for slow performance?


dfs.block.size

2012-02-25 Thread Mohit Anchlia
If I want to change the block size, can I use Configuration in the
mapreduce job and set it when writing to the sequence file, or does it need
to be a cluster-wide setting in the .xml files?

Also, is there a way to check the block size of a given file?
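
For reference, here is a minimal sketch (with a placeholder path, and assuming the standard FileSystem API): dfs.block.size is read on the client side, so it can be overridden through the Configuration used by the writer rather than cluster-wide, and the block size of an existing file can be read back from its FileStatus:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // files created through this conf get 128 MB blocks
        conf.setLong("dfs.block.size", 128L * 1024 * 1024);
        FileSystem fs = FileSystem.get(conf);

        // any writer built with 'conf' (e.g. SequenceFile.createWriter) uses the override
        Path p = new Path("/tmp/example.seq");   // placeholder path

        // checking the block size of an existing file
        long blockSize = fs.getFileStatus(p).getBlockSize();
        System.out.println(p + " block size = " + blockSize + " bytes");
      }
    }

On the command line, hadoop fsck <path> -files -blocks also lists the blocks of a given file.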


Re: LZO with sequenceFile

2012-02-25 Thread Shi Yu
Yes, it is supported by Hadoop sequence files, which are splittable
by default. If you have installed and configured LZO correctly,
use these:

   org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setCompressOutput(job, true);
   org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setOutputCompressorClass(job, com.hadoop.compression.lzo.LzoCodec.class);
   org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);
   job.setOutputFormatClass(org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.class);
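
As a quick check of the output (a minimal sketch with a placeholder path, not from the original reply), reading the file back needs no LZO-specific code, because SequenceFile.Reader picks the codec and key/value classes up from the file header; the hadoop-lzo jar and native libraries still have to be on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class ReadLzoSeqFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/output/part-r-00000");   // placeholder path
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        // key/value classes come from the file header
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
        while (reader.next(key, value)) {
          System.out.println(key + "\t" + value);
        }
        reader.close();
      }
    }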


Shi


Re: LZO with sequenceFile

2012-02-25 Thread Mohit Anchlia
Thanks. Does that mean LZO is not installed by default? How can I install it?

On Sat, Feb 25, 2012 at 6:27 PM, Shi Yu sh...@uchicago.edu wrote:

 Yes, it is supported by Hadoop sequence files, which are splittable
 by default. If you have installed and configured LZO correctly,
 use these:

 org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setCompressOutput(job, true);
 org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setOutputCompressorClass(job, com.hadoop.compression.lzo.LzoCodec.class);
 org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);
 job.setOutputFormatClass(org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.class);


 Shi