Re: HDFS read/write speeds, and read optimization

Alex Loddengaard Thu, 09 Apr 2009 21:07:52 -0700

Answers in-line.

Alex

On Thu, Apr 9, 2009 at 3:45 PM, Stas Oskin <[email protected]> wrote:

> Hi.
>
> I have 2 questions about HDFS performance:
>
> 1) How fast are the read and write operations over network, in Mbps per
> second?

Hypertable (a BigTable implementation) has a good KFS vs. HDFS breakdown:
<http://code.google.com/p/hypertable/wiki/KFSvsHDFS>

>
>
> 2) If the chunk server is located on same host as the client, is there any
> optimization in read operations?
> For example, Kosmos FS describe the following functionality:
>
> "Localhost optimization: One copy of data
> is placed on the chunkserver on the same
> host as the client doing the write
>
> Helps reduce network traffic"

In Hadoop-speak, we're interested in DataNodes (storage nodes) and
TaskTrackers (compute nodes).  For MapReduce, Hadoop tries to schedule each
task on a machine that already holds the data the task will process.  For
writes, if the client loading the data runs on a DataNode, one replica is
placed on that node.  However, if you're loading data from a machine that is
not a DataNode, say your local machine, Hadoop will choose a DataNode at
random.
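
To make the write-side rule concrete, here is a toy Python model of it.  This
is only a sketch of the behaviour described above, not Hadoop code: the
function name choose_first_replica, its parameters, and the host names are all
made up for illustration.

```python
import random


def choose_first_replica(client_host, datanodes, rng=random):
    """Toy model of where one replica of a newly written block lands.

    Hypothetical helper, not part of Hadoop: it only mirrors the rule
    described in this thread.
    """
    if client_host in datanodes:
        # The writer runs on a DataNode, so a copy stays on that node.
        return client_host
    # The writer is outside the cluster: pick a DataNode at random.
    return rng.choice(sorted(datanodes))


datanodes = {"dn1", "dn2", "dn3"}  # assumed host names
print(choose_first_replica("dn2", datanodes))     # writer is a DataNode
print(choose_first_replica("laptop", datanodes))  # writer is outside the cluster
```

The first call always returns the writer's own host; the second returns
whichever DataNode the random choice happens to pick.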

>
>
> Regards.
>
