Hello All,

I am working on a P2P storage project for research purposes, and I want
to use the HDFS DataNode as part of it.
One possibility is to use only the DataNode as a storage engine and do
everything else at an upper level. In this case I would handle all metadata
management and replication at the upper level and use the DataNode only
for storing data on each node.

The second possibility is to also use the NameNode for metadata management
and modify it to fit my project.

I have been trying to find where to start. How modular is HDFS?
Can I use the DataNode alone and modify it to fit my project? What are
the inputs and outputs of the DataNode? Where should I start?

If I decide to use the NameNode as well, where should I start?

Any comments or help would be appreciated.

Thanks

Yasin Celik
