Hi, I'm testing a deployment of Nutch at work and am trying to decide which filesystem to use. I got the NDFS demo working and am excited to use it, but it looks pretty new. Should I consider using it in production? I plan to store quite a lot of data, in the 10-100 TB range.
I'm also wondering about read/write performance. In some initial testing, I'm not seeing any speedup when reading from two data nodes compared to reading the same data from a single host with a program like scp. Has any performance tuning been done on NDFS yet?
Cheers,
Pablo Mayrgundter
