Hello, my company is considering using a DFS (distributed file system) for a project we're currently working on. Since we don't have much experience in this field, I've compiled a list of questions that I hope can guide us to better decisions. I would greatly appreciate anyone's help with these issues.
- How do we handle failure of the single metaserver/namenode? Is there a way to build a zero-downtime solution?
- What are the major differences between KFS and HDFS? Spec-wise they seem similar.
- Our service needs to handle a large number of small files (typically 1-20 MB in size). Is HDFS/KFS appropriate for this?
- Our service requires low-latency access to these files. We're less worried about throughput, since (a) the files are small, so they might be cached by the OS, and in any case each will reside on a single data server (probably without multiple chunks/blocks per file), and (b) we don't have high throughput requirements. We do, however, require low latency (let's say less than 50 ms) before we start receiving data from a file. Will HDFS/KFS deliver those numbers?
- Are there any options for providing data reliability without complete replication (which wastes storage space)? For example, performing "RAID XOR"-type operations between chunks/blocks?
- Are there any other DFSs you'd recommend looking into that might better fit our requirements?

Thanks, Yoav.
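For context on the "RAID XOR" question above: the idea is to store one parity block per stripe of n data blocks instead of full replicas, cutting storage overhead from (replication factor - 1)x down to 1/n while still surviving the loss of any single block in the stripe. A minimal sketch in Python (block sizes and stripe layout are purely illustrative, not taken from any particular DFS):

```python
# Toy illustration of XOR-parity reliability: one parity block per
# stripe of data blocks lets us reconstruct any single lost block.
# Stripe size and block length here are hypothetical, for clarity only.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

# A stripe of three tiny data blocks (real chunks would be MB-sized).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(stripe)

# Simulate losing block 1: XOR the parity with the surviving blocks
# to reconstruct it.
recovered = xor_blocks([stripe[0], stripe[2], parity])
assert recovered == stripe[1]
```

With a stripe of 3 data blocks plus 1 parity block, the overhead is 33% instead of the 200% of 3x replication, at the cost of tolerating only one failure per stripe and extra work on reconstruction.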
