A few improvements to DataNodeCluster
-------------------------------------
Key: HADOOP-5556
URL: https://issues.apache.org/jira/browse/HADOOP-5556
Project: Hadoop Core
Issue Type: Bug
Components: test
Reporter: Hairong Kuang
Fix For: 0.21.0
DataNodeCluster is a great tool for simulating a large-scale DFS cluster on a
small set of machines. A few suggestions to improve this tool:
# DataNodeCluster uses MiniDFSCluster#startDataNode to start multiple DataNode
instances on one machine. MiniDFSCluster sets each DataNode's address to
127.0.0.1. We should allow the address to be set to 0.0.0.0 so that DataNodes
on different machines can communicate.
# Currently the size of the blocks injected into DataNodes and created in
CreateEditsLog is hardcoded to 10. It would be more convenient if this were
configurable. We also need to make sure that both use the same block size.
# If the replication factor of blocks is larger than 1, a DataNode in
DataNodeCluster currently has blocks injected multiple times and therefore
sends block reports to the NameNode multiple times. The initial block reports
contain only a portion of its blocks and may therefore cause unnecessary block
replications. It would be cleaner to send a single block report containing all
of its blocks.
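
To illustrate the third point, here is a minimal, self-contained sketch of why
partial initial block reports can look like under-replication. It is a toy
model, not Hadoop code: the class BlockReportSketch, the method
underReplicatedAtFirstReport, and the round-robin placement rule are all
assumptions made for illustration, and no MiniDFSCluster or NameNode API is
used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; none of these names are Hadoop APIs.
class BlockReportSketch {

    /**
     * Returns how many blocks the simulated NameNode sees with fewer than
     * `replication` replicas when the first block reports arrive.
     * Assumes replication <= numDataNodes so replicas land on distinct nodes.
     */
    static int underReplicatedAtFirstReport(int numDataNodes, int numBlocks,
                                            int replication, boolean reportOnce) {
        List<Set<Long>> dataNodes = new ArrayList<>();
        for (int i = 0; i < numDataNodes; i++) {
            dataNodes.add(new HashSet<>());
        }

        // One injection pass per replica index. Reporting after every pass
        // means the first report is built after pass 0 only.
        int passesBeforeReport = reportOnce ? replication : 1;
        for (int pass = 0; pass < passesBeforeReport; pass++) {
            for (long block = 0; block < numBlocks; block++) {
                // Arbitrary placement: replica `pass` of a block goes to the
                // node `pass` positions after the block's first node.
                int target = (int) ((block + pass) % numDataNodes);
                dataNodes.get(target).add(block);
            }
        }

        // Simulated NameNode view: count reported replicas per block.
        Map<Long, Integer> replicaCount = new HashMap<>();
        for (Set<Long> report : dataNodes) {
            for (long block : report) {
                replicaCount.merge(block, 1, Integer::sum);
            }
        }

        int underReplicated = 0;
        for (long block = 0; block < numBlocks; block++) {
            if (replicaCount.getOrDefault(block, 0) < replication) {
                underReplicated++;
            }
        }
        return underReplicated;
    }

    public static void main(String[] args) {
        System.out.println("report after each injection: "
                + underReplicatedAtFirstReport(4, 8, 3, false));
        System.out.println("single full report:          "
                + underReplicatedAtFirstReport(4, 8, 3, true));
    }
}
```

With 4 simulated DataNodes, 8 blocks, and a replication factor of 3, the first
per-pass reports show all 8 blocks as under-replicated, while a single report
sent after all injections shows none.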