A short, slightly off-topic question:

>       Also note that in this configuration one cannot take
> advantage of the "keep the machine up at all costs" features in newer
> Hadoop releases, which require that root, swap, and the log area be mirrored
> to be truly effective.  I'm not quite convinced that those features are
> worth it yet for anything smaller than maybe a 12 disk config.

Dell and Cloudera promote the C2100. I'd like to see the calculations behind
that config. Am I wrong in thinking that keeping a cluster up with such dense
nodes only works if you have many of them (on the order of 100+),
interconnected with 10Gb Ethernet? Otherwise, recovery times after a failed
disk or rack switch are going to get crazy, right? If you want bang for the
buck, don't the ratios "disk I/O / processing power", "node storage capacity /
Ethernet speed" and "total number of nodes / Ethernet speed" point to many
small nodes with not too many disks and 1Gb Ethernet?
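
To put rough numbers on the recovery-time worry, here is a back-of-envelope
sketch. It is a crude model, not how HDFS actually schedules re-replication:
it assumes the data of a dead node is re-copied evenly by all surviving nodes,
that each of them can spend a fixed share of its network link on that work,
and it ignores disk throughput, rack-switch bottlenecks and any throttling.
The function name, the 50% share and the node sizes are all made up for
illustration.

```python
def rereplication_hours(node_tb, n_nodes, link_gbit, share=0.5):
    """Rough hours needed to re-copy the data of one failed node.

    Assumptions (hypothetical, for illustration only):
      - node_tb:   terabytes stored on the failed node
      - n_nodes:   cluster size including the failed node
      - link_gbit: per-node Ethernet speed in Gbit/s
      - share:     fraction of each link usable for re-replication
    The network is treated as the only bottleneck.
    """
    if n_nodes < 2:
        raise ValueError("need at least one surviving node")
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")

    data_bytes = node_tb * 1e12
    # Bytes per second each surviving node can contribute.
    per_node_rate = link_gbit * 1e9 / 8 * share
    # Survivors work in parallel, so their rates add up.
    aggregate_rate = (n_nodes - 1) * per_node_rate
    return data_bytes / aggregate_rate / 3600


if __name__ == "__main__":
    scenarios = [
        ("dense node, small cluster, 1Gb", 24, 10, 1),
        ("dense node, big cluster, 10Gb", 24, 100, 10),
        ("small node, small cluster, 1Gb", 4, 10, 1),
    ]
    for label, tb, nodes, gbit in scenarios:
        hours = rereplication_hours(tb, nodes, gbit)
        print(f"{label}: {hours:.2f} h")
```

Under these assumptions the recovery time scales with "node storage capacity /
(surviving nodes x Ethernet speed)", which is the ratio I am asking about: a
dense node in a small 1Gb cluster takes hours, while either many nodes with
fast links or small nodes bring it down sharply.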

Cheers,
Evert
