All,

We currently run a Hadoop 2.2.0 cluster with the following characteristics:
- 4 nodes
- Each node is a datanode
- Each node has 3 physical disks for data: 2 x 500GB and 1 x 2TB disk.
- HDFS replication factor of 3

It appears that our 500GB disks are filling up first; the alternative would be for each node to place 4 times as many blocks on its 2TB disk as on each 500GB disk. I'm concerned that once the 500GB disks fill, performance will degrade, since fewer spindles per node will be reading and writing at the same time. Is this correct? Is there anything we can do to change this behavior?
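
To make the numbers concrete, here is a rough back-of-the-envelope sketch of the layout above. It assumes each datanode spreads new blocks evenly across its three disks, which is only my reading of the behavior we are seeing and not something I have verified. The function names are made up for illustration.

```python
# Rough capacity arithmetic for the cluster described above.
# Assumption (unverified): each datanode spreads new blocks evenly
# across its data disks, regardless of disk size.
DISKS_GB = [500, 500, 2000]  # data disks per node
NODES = 4
REPLICATION = 3


def even_spread_fill_point(disks):
    """GB written to one node, spread evenly, when the smallest disk is full."""
    return min(disks) * len(disks)


def proportional_shares(disks):
    """Fraction of a node's blocks each disk would hold if filled by size."""
    total = sum(disks)
    return [d / total for d in disks]


def block_ratio(disks):
    """How many times more blocks the largest disk holds than the smallest."""
    return max(disks) / min(disks)


raw_cluster_gb = sum(DISKS_GB) * NODES        # total raw disk space
usable_gb = raw_cluster_gb / REPLICATION      # space left after 3x replication

if __name__ == "__main__":
    print("raw cluster capacity (GB):", raw_cluster_gb)
    print("usable after replication (GB):", usable_gb)
    print("per-node GB written when 500GB disks fill:",
          even_spread_fill_point(DISKS_GB))
    print("proportional shares per disk:", proportional_shares(DISKS_GB))
    print("2TB-to-500GB block ratio:", block_ratio(DISKS_GB))
```

Under that assumption, the 500GB disks on a node would be full after about 1.5TB has been written to it, which is half of the node's 3TB of raw space. From then on, only the 2TB disk would accept new blocks.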

Thanks,
Brian

