Hi Yasin! Without knowing more about your project, here are answers to your questions.
It's straightforward to start only the Datanode; the HDFS code is quite modular. The Datanode implementation is here:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java

and https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop is a launcher script you can use. Note, though, that the Datanode will try to talk to a Namenode via the Namenode RPC mechanism, defined here:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java

If you want to modify the Namenode, here is the RPC interface it exports:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java

Good luck with your project!

HTH,
Ravi

On Wed, Aug 10, 2016 at 9:11 AM, Yasin Celik <yasinceli...@gmail.com> wrote:
> Hello All,
>
> I'm working on a P2P storage project for research purposes, and I want
> to use the HDFS DataNode as part of it.
> One possibility is to use only the DataNode as a storage engine and do
> everything else at an upper level. In that case I would handle all the
> metadata management and replication at the upper level, and use the
> DataNode only for storing data on each node.
>
> The second possibility is to also use the NameNode for metadata
> management and modify it to fit my project.
>
> I have been trying to find where to start. How much modularity is
> there in HDFS?
> Can I use the DataNode alone and modify it to fit my project? What are
> the inputs and outputs of the DataNode? Where should I start?
>
> If I decide to also use the NameNode, where should I start?
>
> Any comments/help are appreciated.
>
> Thanks,
>
> Yasin Celik
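P.S. If it helps, here is one way to bring up just the Datanode daemon against a remote (possibly modified) Namenode. This is a minimal sketch, assuming a built checkout with HADOOP_HOME and HADOOP_CONF_DIR set; the hostname, port, and data directory below are illustrative, not real values from your setup:

```shell
# Sketch only: assumes HADOOP_HOME points at a built Hadoop tree and
# HADOOP_CONF_DIR at a writable config directory. Hostname/port/paths
# below are placeholders -- substitute your own.

# Tell this node where its Namenode lives:
cat > "$HADOOP_CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>
EOF

# Tell the Datanode where to keep its block storage:
cat > "$HADOOP_CONF_DIR/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/tmp/dn-data</value>
  </property>
</configuration>
EOF

# Start only the Datanode daemon on this machine (no Namenode here):
"$HADOOP_HOME/bin/hdfs" --daemon start datanode
```

The Datanode will then register with the Namenode and send heartbeats/block reports over the DatanodeProtocol interface linked above, so if you replace the Namenode with your own metadata layer, that protocol is the contract you would need to speak.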