Jason Venner wrote:
We have 3 types of machines we can get: 2 disk, 6 disk and 16 disk machines. They all have 4 dual core CPUs.

The 2 disk machines have about 1 TB, the 6 disk machines about 3 TB and the 16 disk machines about 8 TB. The 16 disk machines have about 25% slower CPUs than the 2/6 disk machines.

We handle a lot of bulky data, and don't think we can fit it all on the 3 TB machines if those are our sole compute/dfs nodes.

Your performance will be better if you buy enough of the 6 disk nodes to hold all your data than if you intermix 16 disk nodes. Are the 16 disk nodes considerably cheaper per byte stored than the 6 disk boxes?
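
To make the "enough nodes to hold all your data" and "cheaper per byte" comparison concrete, here is a rough sizing sketch. The per-node capacities are the approximate figures quoted above. The 100 TB data size, the 75% usable fraction (space left for map output and temp files) and any prices are made-up placeholders, and the replication factor of 3 is the DFS default, so substitute your own numbers.

```python
import math

# Approximate raw capacity per node type in TB, taken from the figures in this thread.
CAPACITY_TB = {"2-disk": 1.0, "6-disk": 3.0, "16-disk": 8.0}


def nodes_needed(data_tb, node_tb, replication=3, usable_fraction=0.75):
    """Nodes required to store data_tb of data on nodes of node_tb raw capacity.

    replication: copies of each block kept by the DFS (3 is the default).
    usable_fraction: assumed share of each node's disk available to the DFS;
                     the rest is left for map output and temp files.
    """
    raw_needed_tb = data_tb * replication
    return math.ceil(raw_needed_tb / (node_tb * usable_fraction))


def cost_per_tb(node_price, node_tb):
    """Price per raw TB for one node; node_price is whatever your vendor quotes."""
    return node_price / node_tb


if __name__ == "__main__":
    data_tb = 100  # hypothetical data size
    for name, node_tb in CAPACITY_TB.items():
        print(name, nodes_needed(data_tb, node_tb), "nodes")
```

With these placeholder numbers the 6 disk option needs 134 nodes and the 16 disk option needs 50, so the 16 disk cluster would also have far fewer (and slower) CPUs working on the same data. Comparing `cost_per_tb` for each quote answers the cost-per-byte question.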

From my reading, I conjecture that an ideal configuration would be 1 local disk per cpu for local data/reducing, and some number of separate disks for dfs.
Is this an accurate assessment?

DFS storage is typically local on compute nodes.

Doug
