Jason Venner wrote:
We have 3 types of machines we can get, 2 disk, 6 disk and 16 disk
machines. They all have 4 dual core cpus.
The 2 disk machines have about 1 TB, the 6 disk about 3 TB and the 16
disk about 8 TB. The 16 disk machines have about 25% slower CPUs than
the 2/6 disk machines.
We handle a lot of bulky data, and don't think we can fit it all on the
3 TB machines if those are our sole compute/dfs nodes.
Your performance will be better if you buy enough of the 6 disk nodes to
hold all your data than if you intermix 16 disk nodes. Are the 16 disk
nodes considerably cheaper per byte stored than the 6 disk boxes?
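
To make the cost-per-byte question concrete, here is a rough sizing sketch in Python. The per-node capacities are the approximate figures quoted above; the prices, the 100 TB data set, the 75% fill target and the helper names are all hypothetical placeholders, so substitute real vendor quotes and your own numbers. A replication factor of 3 is assumed, which is the usual HDFS default.

```python
import math

# Approximate raw capacity per node type in TB (figures from this thread).
CAPACITY_TB = {"2-disk": 1.0, "6-disk": 3.0, "16-disk": 8.0}

# HYPOTHETICAL prices per node; replace with real quotes.
PRICE = {"2-disk": 2000.0, "6-disk": 3000.0, "16-disk": 6000.0}


def nodes_needed(data_tb, node_tb, replication=3, fill=0.75):
    """Nodes required to hold data_tb of data at the given replication,
    filling each node only to `fill` so there is room for intermediate data."""
    usable_tb = node_tb * fill
    return math.ceil(data_tb * replication / usable_tb)


def cost_per_tb(price, node_tb):
    """Price per TB of raw disk for one node."""
    return price / node_tb


def summarize(data_tb):
    """Return (node type, node count, total price, price per raw TB) rows."""
    rows = []
    for name, cap in CAPACITY_TB.items():
        n = nodes_needed(data_tb, cap)
        rows.append((name, n, n * PRICE[name], cost_per_tb(PRICE[name], cap)))
    return rows


if __name__ == "__main__":
    # 100 TB is an assumed data set size, not a figure from the thread.
    for name, n, total, per_tb in summarize(100.0):
        print(f"{name}: {n} nodes, ${total:,.0f} total, ${per_tb:,.0f}/TB raw")
```

The sketch only compares storage cost. It ignores the 25% slower CPUs in the 16 disk machines and the fact that fewer, larger nodes mean fewer cores and less aggregate disk bandwidth per byte stored, which is the performance concern raised above.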
From my reading, I conjecture that an ideal configuration would be 1
local disk per cpu for local data/reducing, and some number of separate
disks for dfs.
Is this an accurate assessment?
DFS storage is typically local on compute nodes.
Doug