What are you trying to do? Hadoop DFS has different goals than a network file system such as Samba.
-Michael

On 4/16/07 10:32 AM, "jafarim" <[EMAIL PROTECTED]> wrote:

> On linux and jvm6 with normal IDE disks and a giga ethernet switch with
> corresponding NIC and with hadoop 0.9.11's HDFS. We wrote a C program by
> using the native libs provided in the package but then we tested again
> with distcp. The scenario was as follows:
> We ran the test on a cluster with 1 node, then we added the nodes one by
> one until reaching 5 nodes. Same test with samba saturated the link with
> only one node.
>
> --jaf
>
> On 4/16/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>
>> Please use a new subject when starting a new topic.
>>
>> jafarim wrote:
>>> Sorry if being off topic, but we experienced a very low bandwidth with
>>> hadoop while copying files to/from the cluster (some 1/100 comparing to
>>> plain samba share). The bandwidth did not improve at all by adding
>>> nodes to the cluster. At that time I thought that hadoop is not
>>> supposed to be used for this purpose and did not use it for my project.
>>> I am just curious how much scalable hadoop is and how bandwidth should
>>> grow as nodes are added to the cluster.
>>
>> It's not clear to me what you tried. Are you running HDFS? On how
>> large of a cluster? What version of Hadoop? What operating system?
>> How were you copying files to/from the cluster?
>>
>> The 'bin/hadoop distcp' command should scale to consume available
>> network bandwidth and disk i/o.
>>
>> Doug
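For reference, the two ways of pushing data at HDFS that come up in this thread are the 'bin/hadoop distcp' command Doug mentions and the libhdfs C API that jafarim used ("the native libs provided in the package"). Below is a minimal sketch of a libhdfs write test, assuming the hdfs.h header from the libhdfs build and a cluster whose default filesystem is already configured; the file path, chunk size, and total volume are arbitrary values chosen for illustration, and error handling is abbreviated.

    /* Minimal libhdfs write test (sketch only; error handling abbreviated).
     * Assumes hdfs.h from libhdfs; "default"/0 asks libhdfs to use the
     * configured default filesystem -- an explicit namenode host and port
     * can be passed instead if that form is not supported by the release. */
    #include "hdfs.h"
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        hdfsFS fs = hdfsConnect("default", 0);
        if (!fs) { fprintf(stderr, "connect failed\n"); return 1; }

        /* zeros = default buffer size, default replication, default block size */
        hdfsFile out = hdfsOpenFile(fs, "/tmp/bwtest", O_WRONLY, 0, 0, 0);
        if (!out) { fprintf(stderr, "open failed\n"); return 1; }

        char buf[65536];
        memset(buf, 'x', sizeof(buf));
        for (int i = 0; i < 1024; i++) {        /* write 64 MB in 64 KB chunks */
            if (hdfsWrite(fs, out, buf, sizeof(buf)) < 0) {
                fprintf(stderr, "write failed\n");
                return 1;
            }
        }
        hdfsCloseFile(fs, out);
        hdfsDisconnect(fs);
        return 0;
    }

Timing a loop like this (or a 'bin/hadoop distcp' run) end to end gives the MB/s figure being compared against Samba; keep in mind that HDFS also replicates each block across datanodes, so the bandwidth seen by a single client is not the same thing as raw link bandwidth. The exact compile/link flags (typically -lhdfs plus the JVM library, with a CLASSPATH pointing at the Hadoop jars) vary by release.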
