Hello,

I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
installation and have a couple of questions about the results.

a) If I run the benchmark back-to-back in the same directory, I start seeing
strange errors such as NotReplicatedYetException or
AlreadyBeingCreatedException (failed to create file .... on client 5,
because this file is already being created by DFSClient_.... on ...). It
seems like there might be some kind of race condition between the
replication from a previous run and subsequent runs. Is there any way to
avoid this?
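For reference, this is roughly how I am cleaning up between runs (the jar
name/path here is just what my installation uses, so treat it as an
example):

    # remove the output of the previous TestDFSIO run before starting a new one
    hadoop jar $HADOOP_HOME/hadoop-*test*.jar TestDFSIO -clean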

b) I have been testing with concurrent writers and see a significant drop in
throughput: about 60 MB/s with 1 writer versus about 8 MB/s with 50
concurrent writers. Is this a known scalability limit for HDFS? Is there
any way to configure it to perform better?
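The two runs I am comparing look roughly like this (TestDFSIO runs one map
task per file, so -nrFiles is the number of concurrent writers; the file
size in MB is just the value I happened to pick):

    # single writer
    hadoop jar $HADOOP_HOME/hadoop-*test*.jar TestDFSIO -write -nrFiles 1 -fileSize 1000

    # 50 concurrent writers, one file per writer
    hadoop jar $HADOOP_HOME/hadoop-*test*.jar TestDFSIO -write -nrFiles 50 -fileSize 1000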

thanks
LR
