RE: More Replication on dfs

2009-04-10 Thread Puri, Aseem

Hi
I also tried the command $ bin/hadoop balancer. But still the
same problem.

Aseem

-Original Message-
From: Puri, Aseem [mailto:aseem.p...@honeywell.com] 
Sent: Friday, April 10, 2009 11:18 AM
To: core-user@hadoop.apache.org
Subject: RE: More Replication on dfs

Hi Alex,

Thanks for sharing your knowledge. So far I have three
machines, and I want to check the behavior of Hadoop, so I want the
replication factor to be 2. I started my Hadoop server with a
replication factor of 3. After that I uploaded 3 files to run the word
count program. But since all my files are stored on one machine and
replicated to the other datanodes, my map reduce program takes input
from one Datanode only. I want my files to be on different data nodes so
I can check the functionality of map reduce properly.

Also, before starting my Hadoop server again with a replication
factor of 2, I formatted all Datanodes and deleted all the old data manually.

Please suggest what I should do now.

Regards,
Aseem Puri 


-Original Message-
From: Mithila Nagendra [mailto:mnage...@asu.edu] 
Sent: Friday, April 10, 2009 10:56 AM
To: core-user@hadoop.apache.org
Subject: Re: More Replication on dfs

To add to the question, how does one decide what the optimal replication
factor for a cluster is? For instance, what would be the appropriate
replication factor for a cluster consisting of 5 nodes?
Mithila

On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard a...@cloudera.com
wrote:

 Did you load any files when replication was set to 3?  If so, you'll
have
 to
 rebalance:


http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer
 

http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer
 

 Note that most people run HDFS with a replication factor of 3.  There
have
 been cases when clusters running with a replication of 2 discovered
new
 bugs, because replication is so often set to 3.  That said, if you can
do
 it, it's probably advisable to run with a replication factor of 3
instead
 of
 2.

 Alex

 On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem aseem.p...@honeywell.com
 wrote:

  Hi
 
 I am a new Hadoop user. I have a small cluster with 3
  Datanodes. In hadoop-site.xml the value of the dfs.replication property is
  2, but it is still replicating data to 3 machines.
 
 
 
  Please tell why is it happening?
 
 
 
  Regards,
 
  Aseem Puri
 
 
 
 
 
 



Re: Can we somehow read from the HDFS without converting it to local?

2009-04-10 Thread Stuart White
Not sure if this is what you're looking for...

http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample
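
The gist of that example, as a minimal sketch (0.19-era FileSystem API; the
path below is hypothetical), is to open the file straight out of HDFS and
read it as a stream, without ever copying it to the local filesystem:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFromHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();       // picks up hadoop-site.xml
    FileSystem fs = FileSystem.get(conf);           // the configured HDFS instance
    FSDataInputStream in = fs.open(new Path("/user/sid/output/part-00000"));
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);                     // process each line in place
    }
    reader.close();
  }
}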


On Thu, Apr 9, 2009 at 10:56 PM, Sid123 itis...@gmail.com wrote:

 I need to reuse the O/P of my DFS file without copying to local. Is there a
 way?
 --
 View this message in context: 
 http://www.nabble.com/Can-we-somehow-read-from-the-HDFS-without-converting-it-to-local--tp22982760p22982760.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




Add second partition to HDFS

2009-04-10 Thread Harshal
Hi,

On a particular datanode I have /dev/sda1 (root partition) which hdfs shows
properly. Now I have added another disk which is appearing as /dev/sdb1,
what changes I need to make in hadoop-site.xml so that the second disk is
also used ?


--Harshal


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi.


 Hypertable (a BigTable implementation) has a good KFS vs. HDFS breakdown: 
 http://code.google.com/p/hypertable/wiki/KFSvsHDFS


From this comparison it seems KFS is quite a bit faster than HDFS for small
data transfers (for SQL commands).

Any idea if the same holds true for small-to-medium (20 MB - 150 MB) files?



 
 
  2) If the chunk server is located on same host as the client, is there
 any
  optimization in read operations?
  For example, Kosmos FS describe the following functionality:
 
  Localhost optimization: One copy of data
  is placed on the chunkserver on the same
  host as the client doing the write
 
  Helps reduce network traffic

 In Hadoop-speak, we're interested in DataNodes (storage nodes) and
 TaskTrackers (compute nodes).  In terms of MapReduce, Hadoop does try and
 schedule tasks such that the data being processed by a given task on a
 given
 machine is also on that machine.  As for loading data onto a DataNode,
 loading data from a DataNode will put a replica on that node.  However, if
 you're loading data from, say, your local machine, Hadoop will choose a
 DataNode at random.


Ah, so if a DataNode stores a file to HDFS, HDFS will try to place a replica
on that same DataNode as well? And then if this DataNode tries to read the
file, HDFS will try to read it from itself first?

Regards.


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi.


 Depends.  What hardware?  How much hardware?  Is the cluster under load?
  What does your I/O load look like?  As a rule of thumb, you'll probably
 expect very close to hardware speed.


Standard Xeon dual cpu, quad core servers, 4 GB RAM.
The DataNodes also do some processing, with usual loads about ~4 (from 8
recommended). The IO load is linear, there are almost no write or read
peaks.

By close to hardware speed, you mean results very near the results I get via
iozone?

Thanks.


Re: Add second partition to HDFS

2009-04-10 Thread Ravi Phulari
Add your second disk name in dfs.data.dir .
Refer - http://hadoop.apache.org/core/docs/r0.19.1/cluster_setup.html

dfs.data.dir  =  Comma separated list of paths on the local filesystem of a 
DataNode where it should store its blocks. If this is a comma-delimited list of 
directories, then data will be stored in all named directories, typically on 
different devices.
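
For example, something like this in hadoop-site.xml (the mount points are
hypothetical; point them at wherever the two disks are mounted):

<property>
  <name>dfs.data.dir</name>
  <value>/disk1/dfs/data,/disk2/dfs/data</value>
</property>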

--
Ravi

On 4/10/09 6:01 AM, Harshal p.hars...@gmail.com wrote:

Hi,

On a particular datanode I have /dev/sda1 (root partition) which hdfs shows
properly. Now I have added another disk which is appearing as /dev/sdb1,
what changes I need to make in hadoop-site.xml so that the second disk is
also used ?


--Harshal


Ravi
--



Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Hi.

I wanted to verify a point about HDFS client operations:

When asking for a file, is all communication done through the NameNode? Or,
after being pointed to the correct DataNode, does HDFS work directly
against it?

Also, the NameNode provides a URL named streamFile which allows any HTTP
client to get the stored files. Any idea how its operations compare in
terms of speed to direct HDFS client access?

Regards.


Thin version of Hadoop jar for client

2009-04-10 Thread Stas Oskin
Hi.

Is there any thin version of the Hadoop jar, specifically for a Java client?

Regards.


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Owen O'Malley
On Thu, Apr 9, 2009 at 9:30 PM, Brian Bockelman bbock...@cse.unl.edu wrote:

 On Apr 9, 2009, at 5:45 PM, Stas Oskin wrote:

 Hi.

 I have 2 questions about HDFS performance:

 1) How fast are the read and write operations over network, in Mbps per
 second?


 Depends.  What hardware?  How much hardware?  Is the cluster under load?
  What does your I/O load look like?  As a rule of thumb, you'll probably
 expect very close to hardware speed.

For comparison, on a 1400 node cluster, I can checksum 100 TB in
around 10 minutes, which means I'm seeing read averages of roughly 166
GB/sec. For writes with replication of 3, I see roughly 40-50 minutes
to write 100TB, so roughly 33 GB/sec average. Of course the peaks are
much higher. Each node has 4 SATA disks, dual quad core, and 8 GB of
ram.
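
(Spelling out the arithmetic: 100 TB is roughly 100,000 GB, so 100,000 GB in
~600 seconds is ~166 GB/sec aggregate for reads, and 100,000 GB in 40-50
minutes is roughly 33-40 GB/sec for the triply-replicated writes.)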

-- Owen


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Thanks for sharing.


 For comparison, on a 1400 node cluster, I can checksum 100 TB in
 around 10 minutes, which means I'm seeing read averages of roughly 166
 GB/sec. For writes with replication of 3, I see roughly 40-50 minutes
 to write 100TB, so roughly 33 GB/sec average. Of course the peaks are
 much higher. Each node has 4 SATA disks, dual quad core, and 8 GB of
 ram.


From your experience, how RAM-hungry is HDFS? Meaning, would an additional
4 GB of RAM (to make it 8 GB as in your case) really change anything?

Regards.


Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
Hi.

I know that there were some hard to find bugs with replication set to 2,
which caused data loss to HDFS users.

Was there any progress with these issues, and were any fixes
introduced?

Regards.


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Owen O'Malley


On Apr 10, 2009, at 9:07 AM, Stas Oskin wrote:

From your experience, how RAM-hungry is HDFS? Meaning, would an additional
4 GB of RAM (to make it 8 GB as in your case) really change anything?


I don't think the 4 to 8GB would matter much for HDFS. For Map/Reduce,  
it is very important.


-- Owen


Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman
Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going  
to be even better.


We run about 300TB @ 2 replicas, and haven't had file loss that was  
Hadoop's fault since about January.


Brian

On Apr 10, 2009, at 11:11 AM, Stas Oskin wrote:


Hi.

I know that there were some hard to find bugs with replication set  
to 2,

which caused data loss to HDFS users.

Was there any progress with these issues, and if there any fixes  
which were

introduced?

Regards.




Re: [Interesting] One reducer randomly hangs on getting 0 mapper output

2009-04-10 Thread Steve Gao
Does anybody have a clue? Thanks a lot.

--- On Thu, 4/9/09, Steve Gao steve@yahoo.com wrote:

From: Steve Gao steve@yahoo.com
Subject: [Interesting] One reducer randomly hangs on getting 0 mapper output
To: core-user@hadoop.apache.org
Date: Thursday, April 9, 2009, 6:04 PM


I have hadoop jobs whose last reducer randomly hangs while getting 0 mapper
output. By randomly I mean the job sometimes works correctly, and sometimes its
last reducer keeps polling for map output but always gets 0 data. It can hang
for up to 100 hours getting 0 data until I kill it. After I kill and re-run it,
it runs correctly. The hung reducer can happen on any machine of my
cluster.

I attach the tail of the problematic reducer's log here. Does anybody have a 
hint what happened?

syslog logs

2009-04-09 21:57:46,445 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Need 15 map output(s)
2009-04-09 21:57:46,446 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0: Got 0 new map-outputs & 0 obsolete 
map-outputs from tasktracker and 0 map-outputs from previous failures
2009-04-09 21:57:46,446 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Got 0 known map output location(s); 
scheduling...
2009-04-09 21:57:46,446 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Scheduled 0 of 0 known outputs (0 slow hosts 
and 0 dup hosts)

2009-04-09 21:57:51,453 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Need 15 map output(s)
2009-04-09 21:57:51,460 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0: Got 0 new map-outputs & 0 obsolete 
map-outputs from tasktracker and 0 map-outputs from previous failures
2009-04-09 21:57:51,460 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Got 0 known map output location(s); 
scheduling...
2009-04-09 21:57:51,460 INFO org.apache.hadoop.mapred.ReduceTask: 
task_200902022141_50382_r_08_0 Scheduled 0 of 0 known outputs (0 slow hosts 
and 0 dup hosts)


... (forever)



Re: More Replication on dfs

2009-04-10 Thread Alex Loddengaard
Mithila,

Most people run with a replication of 3.  3 replicas get you one local
copy, one copy on a different rack, and an additional copy on that same
remote rack.

Quantcast gave a talk at a user group a while ago about physically moving a
data center from one colo to another.  Turning off machines and moving them
increases the probability of those machines going down, so Quantcast upped
their replication to 7, I believe.  Once the move was done, they lowered
their replication back to whatever it was set to previously, which I think
was 3.

So anyway, a replication factor of 3 is totally sufficient, unless you've
come across a particular case when node failure is higher than normal, for
example maybe if you're running super unreliable hardware.

Alex

On Thu, Apr 9, 2009 at 10:26 PM, Mithila Nagendra mnage...@asu.edu wrote:

 To add to the question, how does one decide what is the optimal replication
 factor for a cluster. For instance what would be the appropriate
 replication
 factor for a cluster consisting of 5 nodes.
 Mithila

 On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard a...@cloudera.com
 wrote:

  Did you load any files when replication was set to 3?  If so, you'll have
  to
  rebalance:
 
  
 http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer
  
 
 http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer
  
 
  Note that most people run HDFS with a replication factor of 3.  There
 have
  been cases when clusters running with a replication of 2 discovered new
  bugs, because replication is so often set to 3.  That said, if you can do
  it, it's probably advisable to run with a replication factor of 3 instead
  of
  2.
 
  Alex
 
  On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem aseem.p...@honeywell.com
  wrote:
 
   Hi
  
  I am a new Hadoop user. I have a small cluster with 3
   Datanodes. In hadoop-site.xml values of dfs.replication property is 2
   but then also it is replicating data on 3 machines.
  
  
  
   Please tell why is it happening?
  
  
  
   Regards,
  
   Aseem Puri
  
  
  
  
  
  
 



Re: More Replication on dfs

2009-04-10 Thread Alex Loddengaard
Aseem,

How are you verifying that blocks are not being replicated?  Have you run
fsck?  *bin/hadoop fsck /*

I'd be surprised if replication really wasn't happening.  Can you run fsck
and pay attention to Under-replicated blocks and Mis-replicated blocks?
In fact, can you just copy-paste the output of fsck?
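
If it helps, fsck can also print per-file block and location detail, e.g.
*bin/hadoop fsck / -files -blocks -locations* (flags from memory, so double
check them against the usage output).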

Alex

On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem aseem.p...@honeywell.comwrote:


 Hi
I also tried the command $ bin/hadoop balancer. But still the
 same problem.

 Aseem

 -Original Message-
 From: Puri, Aseem [mailto:aseem.p...@honeywell.com]
 Sent: Friday, April 10, 2009 11:18 AM
 To: core-user@hadoop.apache.org
 Subject: RE: More Replication on dfs

 Hi Alex,

Thanks for sharing your knowledge. Till now I have three
 machines and I have to check the behavior of Hadoop so I want
 replication factor should be 2. I started my Hadoop server with
 replication factor 3. After that I upload 3 files to implement word
 count program. But as my all files are stored on one machine and
 replicated to other datanodes also, so my map reduce program takes input
 from one Datanode only. I want my files to be on different data node so
 to check functionality of map reduce properly.

Also before starting my Hadoop server again with replication
 factor 2 I formatted all Datanodes and deleted all old data manually.

 Please suggest what I should do now.

 Regards,
 Aseem Puri


 -Original Message-
 From: Mithila Nagendra [mailto:mnage...@asu.edu]
 Sent: Friday, April 10, 2009 10:56 AM
 To: core-user@hadoop.apache.org
 Subject: Re: More Replication on dfs

 To add to the question, how does one decide what is the optimal
 replication
 factor for a cluster. For instance what would be the appropriate
 replication
 factor for a cluster consisting of 5 nodes.
 Mithila

 On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard a...@cloudera.com
 wrote:

  Did you load any files when replication was set to 3?  If so, you'll
 have
  to
  rebalance:
 
 
 http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer
  
 
 http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer
  
 
  Note that most people run HDFS with a replication factor of 3.  There
 have
  been cases when clusters running with a replication of 2 discovered
 new
  bugs, because replication is so often set to 3.  That said, if you can
 do
  it, it's probably advisable to run with a replication factor of 3
 instead
  of
  2.
 
  Alex
 
  On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem aseem.p...@honeywell.com
  wrote:
 
   Hi
  
  I am a new Hadoop user. I have a small cluster with 3
   Datanodes. In hadoop-site.xml values of dfs.replication property is
 2
   but then also it is replicating data on 3 machines.
  
  
  
   Please tell why is it happening?
  
  
  
   Regards,
  
   Aseem Puri
  
  
  
  
  
  
 



Re: Add second partition to HDFS

2009-04-10 Thread Alex Loddengaard
Make sure you bounce the datanode daemon once you change the configuration
file as well.

Alex

On Fri, Apr 10, 2009 at 8:23 AM, Ravi Phulari rphul...@yahoo-inc.comwrote:

 Add your second disk name in dfs.data.dir .
 Refer - http://hadoop.apache.org/core/docs/r0.19.1/cluster_setup.html

 dfs.data.dir  =  Comma separated list of paths on the local filesystem of a
 DataNode where it should store its blocks. If this is a comma-delimited list
 of directories, then data will be stored in all named directories, typically
 on different devices.

 --
 Ravi

 On 4/10/09 6:01 AM, Harshal p.hars...@gmail.com wrote:

 Hi,

 On a particular datanode I have /dev/sda1 (root partition) which hdfs shows
 properly. Now I have added another disk which is appearing as /dev/sdb1,
 what changes I need to make in hadoop-site.xml so that the second disk is
 also used ?


 --Harshal


 Ravi
 --




Multithreaded Reducer

2009-04-10 Thread Sagar Naik

Hi,
I would like to implement a Multi-threaded reducer.
As per my understanding, the system does not have one because we expect the
output to be sorted.

However, in my case I don't need the output sorted.

Can you please point me to any other issues, or would it be safe to do so?

-Sagar


Re: Thin version of Hadoop jar for client

2009-04-10 Thread Aaron Kimball
Not currently, sorry.
- Aaron

On Fri, Apr 10, 2009 at 8:35 AM, Stas Oskin stas.os...@gmail.com wrote:

 Hi.

 Is there any thin version of Hadoop jar, specifically for Java client?

 Regards.



Re: More Replication on dfs

2009-04-10 Thread Aaron Kimball
Changing the default replication in hadoop-site.xml does not affect files
already loaded into HDFS. File replication factor is controlled on a
per-file basis.

You need to use the command `hadoop fs -setrep n path...` to set the
replication factor to n for a particular path already present in HDFS. It
can also take a -R for recursive.
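
For example (the path is hypothetical), to bring everything under a directory
down to 2 replicas:

hadoop fs -setrep -R 2 /user/aseem/wordcount-input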

- Aaron

On Fri, Apr 10, 2009 at 10:34 AM, Alex Loddengaard a...@cloudera.comwrote:

 Aseem,

 How are you verifying that blocks are not being replicated?  Have you ran
 fsck?  *bin/hadoop fsck /*

 I'd be surprised if replication really wasn't happening.  Can you run fsck
 and pay attention to Under-replicated blocks and Mis-replicated blocks?
 In fact, can you just copy-paste the output of fsck?

 Alex

 On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem aseem.p...@honeywell.com
 wrote:

 
  Hi
 I also tried the command $ bin/hadoop balancer. But still the
  same problem.
 
  Aseem
 
  -Original Message-
  From: Puri, Aseem [mailto:aseem.p...@honeywell.com]
  Sent: Friday, April 10, 2009 11:18 AM
  To: core-user@hadoop.apache.org
  Subject: RE: More Replication on dfs
 
  Hi Alex,
 
 Thanks for sharing your knowledge. Till now I have three
  machines and I have to check the behavior of Hadoop so I want
  replication factor should be 2. I started my Hadoop server with
  replication factor 3. After that I upload 3 files to implement word
  count program. But as my all files are stored on one machine and
  replicated to other datanodes also, so my map reduce program takes input
  from one Datanode only. I want my files to be on different data node so
  to check functionality of map reduce properly.
 
 Also before starting my Hadoop server again with replication
  factor 2 I formatted all Datanodes and deleted all old data manually.
 
  Please suggest what I should do now.
 
  Regards,
  Aseem Puri
 
 
  -Original Message-
  From: Mithila Nagendra [mailto:mnage...@asu.edu]
  Sent: Friday, April 10, 2009 10:56 AM
  To: core-user@hadoop.apache.org
  Subject: Re: More Replication on dfs
 
  To add to the question, how does one decide what is the optimal
  replication
  factor for a cluster. For instance what would be the appropriate
  replication
  factor for a cluster consisting of 5 nodes.
  Mithila
 
  On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard a...@cloudera.com
  wrote:
 
   Did you load any files when replication was set to 3?  If so, you'll
  have
   to
   rebalance:
  
  
  http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balance
  r
   
  
  http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalanc
  er
   
  
   Note that most people run HDFS with a replication factor of 3.  There
  have
   been cases when clusters running with a replication of 2 discovered
  new
   bugs, because replication is so often set to 3.  That said, if you can
  do
   it, it's probably advisable to run with a replication factor of 3
  instead
   of
   2.
  
   Alex
  
   On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem aseem.p...@honeywell.com
   wrote:
  
Hi
   
   I am a new Hadoop user. I have a small cluster with 3
Datanodes. In hadoop-site.xml values of dfs.replication property is
  2
but then also it is replicating data on 3 machines.
   
   
   
Please tell why is it happening?
   
   
   
Regards,
   
Aseem Puri
   
   
   
   
   
   
  
 



Re: Multithreaded Reducer

2009-04-10 Thread Aaron Kimball
Rather than implementing a multi-threaded reducer, why not simply increase
the number of reducer tasks per machine via
mapred.tasktracker.reduce.tasks.maximum, and increase the total number of
reduce tasks per job via mapred.reduce.tasks to ensure that they're all
filled. This will effectively utilize a higher number of cores.
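
As a rough sketch (the numbers and class name are illustrative):
mapred.tasktracker.reduce.tasks.maximum goes into hadoop-site.xml on each
TaskTracker, and the per-job count is set in the driver, e.g.:

import org.apache.hadoop.mapred.JobConf;

public class MyJobDriver {                     // hypothetical driver class
  public static void configureReduces(JobConf conf) {
    // e.g. 3 nodes x 8 reduce slots per node (tasks.maximum = 8)
    conf.setNumReduceTasks(24);
  }
}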

- Aaron

On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote:

 Hi,
 I would like to implement a Multi-threaded reducer.
 As per my understanding , the system does not have one coz we expect the
 output to be sorted.

 However, in my case I dont need the output sorted.

 Can u pl point to me any other issues or it would be safe to do so

 -Sagar



Re: Multithreaded Reducer

2009-04-10 Thread Sagar Naik


Two things
- multi-threaded is preferred over multi-process. The processing I'm
planning is IO bound, so I can really take advantage of multiple threads
(100 threads).
- Correct me if I'm wrong: the next MR job in the pipeline will have an
increased number of splits to process, as the number of reducer outputs
(from the previous job) has increased. This leads to an increase
in the map-task completion time.



-Sagar

Aaron Kimball wrote:

Rather than implementing a multi-threaded reducer, why not simply increase
the number of reducer tasks per machine via
mapred.tasktracker.reduce.tasks.maximum, and increase the total number of
reduce tasks per job via mapred.reduce.tasks to ensure that they're all
filled. This will effectively utilize a higher number of cores.

- Aaron

On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote:

  

Hi,
I would like to implement a Multi-threaded reducer.
As per my understanding , the system does not have one coz we expect the
output to be sorted.

However, in my case I dont need the output sorted.

Can u pl point to me any other issues or it would be safe to do so

-Sagar




  


Re: Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Thanks, this is what I thought.
Regards.

2009/4/10 Alex Loddengaard a...@cloudera.com

 Data is streamed directly from the data nodes themselves.  The name node is
 only queried for block locations and other meta data.

 Alex

 On Fri, Apr 10, 2009 at 8:33 AM, Stas Oskin stas.os...@gmail.com wrote:

  Hi.
 
  I wanted to verify a point about HDFS client operations:
 
  When asking for file, is the all communication done through the NameNode?
  Or
  after being pointed to correct DataNode, does the HDFS works directly
  against it?
 
  Also, NameNode provides a URL named streamFile which allows any HTTP
  client to get the stored files. Any idea how it's operations compare in
  terms of speed to client HDFS access?
 
  Regards.
 



Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Stas Oskin
Hi.


 Depends on what kind of I/O you do - are you going to be using MapReduce
 and co-locating jobs and data?  If so, it's possible to get close to those
 speeds if you are I/O bound in your job and read right through each chunk.
  If you have multiple disks mounted individually, you'll need the number of
 streams equal to the number of disks.  If you're going to do I/O that's not
 through MapReduce, you'll probably be bound by the network interface.


Btw, this is what I wanted to ask as well:

Is it more efficient to unify the disks into one volume (RAID or LVM), and
then present them as a single space? Or is it better to specify each disk
separately?

Reliability-wise, the latter sounds more correct, as a single/several (up to
3) disks going down won't take the whole node with them. But perhaps there
is a performance penalty?


Re: Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
2009/4/10 Brian Bockelman bbock...@cse.unl.edu

 Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going to be
 even better.

 We run about 300TB @ 2 replicas, and haven't had file loss that was
 Hadoop's fault since about January.

 Brian


And you are running 0.19.1?

Regards.


Re: Multithreaded Reducer

2009-04-10 Thread Aaron Kimball
At that level of parallelism, you're right that the process overhead would
be too high.
- Aaron


On Fri, Apr 10, 2009 at 11:36 AM, Sagar Naik sn...@attributor.com wrote:


 Two things
 - multi-threaded is preferred over multi-processes. The process I m
 planning is IO bound so I can really take advantage of  multi-threads (100
 threads)
 - Correct me if I m wrong. The next MR_JOB in the pipeline will have
  increased number of splits to process as the number of reducer-outputs
 (from prev job) have increased . This leads to increase
  in the map-task completion time.



 -Sagar


 Aaron Kimball wrote:

 Rather than implementing a multi-threaded reducer, why not simply increase
 the number of reducer tasks per machine via
 mapred.tasktracker.reduce.tasks.maximum, and increase the total number of
 reduce tasks per job via mapred.reduce.tasks to ensure that they're all
 filled. This will effectively utilize a higher number of cores.

 - Aaron

 On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com
 wrote:



 Hi,
 I would like to implement a Multi-threaded reducer.
 As per my understanding , the system does not have one coz we expect the
 output to be sorted.

 However, in my case I dont need the output sorted.

 Can u pl point to me any other issues or it would be safe to do so

 -Sagar









Re: Two degrees of replications reliability

2009-04-10 Thread Stas Oskin
Actually, now I remember that you posted some time ago about your University
losing about 300 files.
So the situation has improved since then, I presume?

2009/4/10 Stas Oskin stas.os...@gmail.com

 2009/4/10 Brian Bockelman bbock...@cse.unl.edu

 Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going to
 be even better.

 We run about 300TB @ 2 replicas, and haven't had file loss that was
 Hadoop's fault since about January.

 Brian


 And you running 0.19.1?

 Regards.



Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman


On Apr 10, 2009, at 1:54 PM, Stas Oskin wrote:

Actually, now I remember that you posted some time ago about your University
losing about 300 files.
So the situation has improved since then, I presume?


Yup!  The only files we lose now are due to multiple simultaneous  
hardware loss.  Since January: 11 files to accidentally reformatting 2  
nodes at once, 35 to a night with 2 dead nodes.  Make no mistake -  
HDFS with 2 replicas is *not* an archive-quality file system.  HDFS  
does not replace tape storage for long term storage.


Brian




2009/4/10 Stas Oskin stas.os...@gmail.com


2009/4/10 Brian Bockelman bbock...@cse.unl.edu

Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is  
going to

be even better.

We run about 300TB @ 2 replicas, and haven't had file loss that was
Hadoop's fault since about January.

Brian



And you running 0.19.1?

Regards.





Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman


On Apr 10, 2009, at 1:53 PM, Stas Oskin wrote:


2009/4/10 Brian Bockelman bbock...@cse.unl.edu

Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is  
going to be

even better.

We run about 300TB @ 2 replicas, and haven't had file loss that was
Hadoop's fault since about January.

Brian



And you running 0.19.1?


0.19.1 with a few convenience patches (mostly, they improve logging so  
the local file system researchers can play around with our data  
patterns).


Brian



Re: Two degrees of replications reliability

2009-04-10 Thread Todd Lipcon
On Fri, Apr 10, 2009 at 12:03 PM, Brian Bockelman bbock...@cse.unl.eduwrote:



 0.19.1 with a few convenience patches (mostly, they improve logging so the
 local file system researchers can play around with our data patterns).


Hey Brian,

I'm curious about this. Could you elaborate a bit on what kind of stuff
you're logging? I'm interested in what FS metrics you're looking at and how
you instrumented the code.

-Todd


does hadoop have any way to append to an existing file?

2009-04-10 Thread javateck javateck
Hi,
  Does hadoop have any way to append to an existing file? For example, I
wrote some contents to a file, and later on I want to append some more
contents to the file.

thanks,


Re: Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Philip Zeyliger


 Also, NameNode provides a URL named streamFile which allows any HTTP
 client to get the stored files. Any idea how it's operations compare in
 terms of speed to client HDFS access?


What happens here is that the NameNode redirects you to a smartly chosen
DataNode (one that has some of the file's first 5 blocks, I think), and that
DataNode proxies the file for you.  Specifically, the assembly of the full
file from multiple nodes happens on that DataNode.  If you were using a
DFSClient, it would assemble the file from blocks at the client, and talk to
many data nodes.

-- Philip


Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman


On Apr 10, 2009, at 2:06 PM, Todd Lipcon wrote:

On Fri, Apr 10, 2009 at 12:03 PM, Brian Bockelman bbock...@cse.unl.edu 
wrote:





0.19.1 with a few convenience patches (mostly, they improve logging  
so the
local file system researchers can play around with our data  
patterns).




Hey Brian,

I'm curious about this. Could you elaborate a bit on what kind of  
stuff
you're logging? I'm interested in what FS metrics you're looking at  
and how

you instrumented the code.

-Todd


No clue what they're doing *with* the data, but I know what we've  
applied to HDFS to get the data.  We apply both of these patches:

http://issues.apache.org/jira/browse/HADOOP-5222
https://issues.apache.org/jira/browse/HADOOP-5625

This adds the duration and offset to each read.  Each read is then  
logged through the HDFS audit mechanisms.  We've been pulling the logs  
through the web interface and putting them back into HDFS, then  
processing them (actually, today we've been playing with log  
collection via Chukwa).


There is a student who is looking at our cluster's I/O access  
patterns, and there's a few folks who do work in designing metadata  
caching algorithms that love to see application traces.  Personally,  
I'm interested in hooking the logfiles up to our I/O accounting system  
so I can keep historical records of transfers and compare it to our  
other file systems.


Brian




a script to re-write JIRA subject lines for easier GMail threading

2009-04-10 Thread Philip Zeyliger
If you subscribe to core-dev and use gmail, you might be interested in a
quick script I wrote to log into my e-mail and re-write JIRA subject lines,
to fix the threading in gmail.  This way "Updated", "Commented", and
"Created" messages all appear in the same thread.

The code is at http://github.com/philz/jira-rewrite/tree/master .  I've
pasted in a bit of the documentation and a sample run below.

-- Philip


Re-writes JIRA subject lines to remove JIRA's Updated/Commented/Created
annotations.
JIRA's default subjects break Gmail's threading (which threads by subject
line); making the
subject line uniform unbreaks Gmail's threading.

Specifically, this looks at every message in --source, does a search and
replace
with --regex and --replace (defaults are for JIRA subject lines), and puts
the
modified message in --dest.  The original message is moved to --backup.

# EXAMPLE OUTPUT:
#   $ python rewritejira.py --username @com --source hadoop-jira
--dest hadoop-jira-rewritten --backup hadoop-jira-orig
#   Password:
#   INFO:__main__:Looking at 3 messages.
#   INFO:__main__:[3] Subject: '[jira] Commented: (HADOOP-5649) Enable
ServicePlugins for the\r\n JobTracker' - '(HADOOP-5649) Enable
ServicePlugins for the\r\n JobTracker'
#   INFO:__main__:[2] Subject: '[jira] Commented: (HADOOP-5581) libhdfs does
not get\r\n FileNotFoundException' - '(HADOOP-5581) libhdfs does not
get\r\n FileNotFoundException'
#   INFO:__main__:[1] Subject: '[jira] Commented: (HADOOP-5638) More
improvement on block placement\r\n performance' - '(HADOOP-5638) More
improvement on block placement\r\n performance'
#   INFO:__main__:Rewrote 3 messages.


Re: Does the HDFS client read the data from NameNode, or from DataNode directly?

2009-04-10 Thread Stas Oskin
Hi.


 What happens here is that the NameNode redirects you to a smartly (a data
 node that has some of the file's first 5 blocks, I think) chosen DataNode,
 and that DataNode proxies the file for you.  Specifically, the assembling
 of
 a full file from multiple nodes is happening on that DataNode.  If you were
 using a DFSClient, it would assemble the file from blocks at the client,
 and
 talk to many data nodes.


I see, thanks for the explanation.


Re: Multithreaded Reducer

2009-04-10 Thread jason hadoop
Hi Sagar!

There is no reason for the body of your reduce method to do more than copy
and queue the key value set into an execution pool.

The close method will need to wait until all of the items finish
execution and potentially keep the heartbeat up with the task tracker by
periodically reporting something. Sadly, right now the reporter has to be
grabbed from the reduce method, as configure and close do not get an
instance.

I believe the key and value objects are reused by the framework on the next
call to reduce, so making a copy before queuing them into your thread pool
is important.
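
A rough sketch of that pattern (old org.apache.hadoop.mapred API; the class
name, types, thread count, and the doIoBoundWork placeholder are all
illustrative, and writes to the OutputCollector are serialized here since it
isn't documented as thread-safe):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class ThreadedReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private ExecutorService pool;
  private volatile Reporter reporter;  // grabbed from reduce(), as noted above

  public void configure(JobConf job) {
    // Bounded handoff: CallerRunsPolicy makes reduce() do the work itself when
    // all workers are busy, so the whole input is never buffered in RAM.
    pool = new ThreadPoolExecutor(100, 100, 5, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>(),
        new ThreadPoolExecutor.CallerRunsPolicy());
  }

  public void reduce(Text key, Iterator<Text> values,
      final OutputCollector<Text, Text> output, Reporter r) throws IOException {
    this.reporter = r;
    // Deep-copy the key and values: the framework reuses these objects.
    final Text keyCopy = new Text(key);
    final List<Text> valueCopies = new ArrayList<Text>();
    while (values.hasNext()) {
      valueCopies.add(new Text(values.next()));
    }
    pool.submit(new Runnable() {
      public void run() {
        try {
          Text result = doIoBoundWork(keyCopy, valueCopies);
          synchronized (output) {      // serialize writes to the collector
            output.collect(keyCopy, result);
          }
          reporter.progress();         // keep the TaskTracker heartbeat alive
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      }
    });
  }

  private Text doIoBoundWork(Text key, List<Text> values) throws IOException {
    return new Text(key + ":" + values.size());  // stand-in for the real IO-bound call
  }

  public void close() throws IOException {
    // Wait for every queued item to finish before the task is marked done.
    pool.shutdown();
    try {
      while (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        if (reporter != null) {
          reporter.progress();         // still alive, just waiting on workers
        }
      }
    } catch (InterruptedException e) {
      throw new IOException("Interrupted while waiting for worker threads");
    }
  }
}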


On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote:

 Hi,
 I would like to implement a Multi-threaded reducer.
 As per my understanding , the system does not have one coz we expect the
 output to be sorted.

 However, in my case I dont need the output sorted.

 Can u pl point to me any other issues or it would be safe to do so

 -Sagar




-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422


Re: Multithreaded Reducer

2009-04-10 Thread Todd Lipcon
On Fri, Apr 10, 2009 at 12:31 PM, jason hadoop jason.had...@gmail.comwrote:

 Hi Sagar!

 There is no reason for the body of your reduce method to do more than copy
 and queue the key value set into an execution pool.


Agreed. You probably want to use a either a bounded queue on your execution
pool, or even a SynchronousQueue to do handoff to executor threads.
Otherwise your reducer will churn through all of its inputs at IO rate,
potentially fill up RAM, and report 100% complete way before it's actually
complete. Something like:

BlockingQueue<Runnable> queue = new SynchronousQueue<Runnable>();
threadPool = new ThreadPoolExecutor(WORKER_THREAD_COUNT,
    WORKER_THREAD_COUNT, 5, TimeUnit.SECONDS, queue);


 The close method will need to wait until the all of the items finish
 execution and potentially keep the heartbeat up with the task tracker by
 periodically reporting something. Sadly right now the reporter has to be
 grabbed from the reduce method as configure and close do not get an
 instance.


+1. You probably want to call reporter.progress() after each item is
processed by the worker threads.



 I believe the key and value objects are reused by the framework on the next
 call to reduce, so making a copy before queuing them into your thread pool
 is important.


+1 here too. You will definitely run into issues if you don't make a deep
copy.

-Todd


Re: does hadoop have any way to append to an existing file?

2009-04-10 Thread Todd Lipcon
Hi,

Hadoop's append functionality is somewhat in progress and is not stable in
any released versions quite yet. The relevant ticket is
http://issues.apache.org/jira/browse/HADOOP-1700. Note that, while this JIRA
indicates that it is in 0.19.0, it was rolled back in 0.19.1 (
http://issues.apache.org/jira/browse/HADOOP-5224). It is coming back in
0.20, soon to be released, but you may want to investigate other solutions
to your problem in the meantime.

-Todd

On Fri, Apr 10, 2009 at 12:09 PM, javateck javateck javat...@gmail.comwrote:

 Hi,
  does hadoop have any way to append to an existing file? for example, I
 wrote some contents to a file, and later on I want to append some more
 contents to the file.

 thanks,



Listing the files in HDFS directory

2009-04-10 Thread Stas Oskin
Hi.

I must be completely missing it in the API, but what is the function to list the
directory contents?

Thanks!


Hadoop and Image analysis question

2009-04-10 Thread Sameer Tilak
Hi everyone,
I would like to use Hadoop for analyzing tens of thousands of images.
Ideally each mapper gets a few hundred images to process, and I'll have a few
hundred mappers. However, I want the mapper function to run on the machine
where its images are stored. How can I achieve that? With text data, creating
splits and exploiting locality seems easy.

One option would be for the input to the map function to be a text file where
each line contains the name of an image to be processed. The text file is the
input to the mapper function, so the mapper parses the file and reads the name
of each image file to be processed.
Unfortunately, one drawback of this scheme is that the image file itself
might be stored on a machine different than the one running this mapper
function. Copying the file over the network would be quite inefficient. Any
help on this would be great.


Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Konstantin Shvachko

I just wanted to add to this one other published benchmark
http://developer.yahoo.net/blogs/hadoop/2008/09/scaling_hadoop_to_4000_nodes_a.html
In this example on a very busy cluster of 4000 nodes both read and write 
throughputs
were close to the local disk bandwidth.
This benchmark (called TestDFSIO) uses large sequential writes and reads.
You can run it yourself on your hardware to compare.
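
If it helps, it is typically run from the test jar, along these lines (flags
from memory, so treat them as an assumption and check the jar's usage message;
the file count and size are arbitrary):

bin/hadoop jar hadoop-*-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
bin/hadoop jar hadoop-*-test.jar TestDFSIO -read -nrFiles 10 -fileSize 1000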


Is it more efficient to unify the disks into one volume (RAID or LVM), and
then present them as a single space? Or it's better to specify each disk
separately?


There was a discussion recently on this list about RAID0 vs separate disks.
Please search the archives. Separate disks turn out to perform better.


Reliability-wise, the latter sounds more correct, as a single/several (up to
3) disks going down won't take the whole node with them. But perhaps there
is a performance penalty?


You always have block replicas on other nodes, so one node going down should 
not be a problem.

Thanks,
--Konstantin


Abandoning block messages

2009-04-10 Thread Stas Oskin
Hi.

I'm testing HDFS fault-tolerance by randomly disconnecting and powering
off data nodes to see how HDFS withstands it.

After I disconnected one of the data nodes, the logs started to fill with
the following errors:

09/04/11 02:25:12 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:12 INFO dfs.DFSClient: Abandoning block
blk_-656235730036158252_1379
09/04/11 02:25:12 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:12 INFO dfs.DFSClient: Abandoning block
blk_4525038275659790960_1379
09/04/11 02:25:12 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:12 INFO dfs.DFSClient: Abandoning block
blk_4496668387968639765_1379
09/04/11 02:25:12 INFO dfs.DFSClient: Waiting to find target node:
192.168.253.20:50010
09/04/11 02:25:12 INFO dfs.DFSClient: Waiting to find target node:
192.168.253.20:50010
09/04/11 02:25:12 INFO dfs.DFSClient: Waiting to find target node:
192.168.253.20:50010
09/04/11 02:25:21 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:21 INFO dfs.DFSClient: Abandoning block
blk_7909869387543801936_1379
09/04/11 02:25:21 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:21 INFO dfs.DFSClient: Abandoning block
blk_3548805008548426073_1368
09/04/11 02:25:21 INFO dfs.DFSClient: Waiting to find target node:
192.168.253.20:50010
09/04/11 02:25:25 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:25 INFO dfs.DFSClient: Abandoning block
blk_6143739150500056332_1379
09/04/11 02:25:25 INFO dfs.DFSClient: Waiting to find target node:
192.168.253.20:50010
09/04/11 02:25:30 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink 192.168.253.41:50010
09/04/11 02:25:30 INFO dfs.DFSClient: Abandoning block
blk_5626187863612769755_1379

Any idea what this means, and whether I should be concerned about it?

Regards.


Re: Listing the files in HDFS directory

2009-04-10 Thread Stas Oskin
Hi.
Just wanted to say that FileStatus did the trick.
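
In case it helps anyone searching the archives, a minimal sketch along those
lines (the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHdfsDir {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();       // picks up hadoop-site.xml
    FileSystem fs = FileSystem.get(conf);
    FileStatus[] entries = fs.listStatus(new Path("/user/stas"));  // hypothetical path
    for (FileStatus entry : entries) {
      System.out.println(entry.getPath() + (entry.isDir() ? "  <dir>" : ""));
    }
  }
}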

Regards.

2009/4/11 Stas Oskin stas.os...@gmail.com

 Hi.

 I must be completely missing it in API, but what is the function to list
 the directory contents?

 Thanks!