Hi Bhooshan,

Thanks for your kind response. I ran the code on one of the data nodes of my cluster, which has only one Hadoop daemon running. I believe my Java client code connects to the cluster correctly, since I am able to retrieve fileStatus, list files under a particular HDFS path, and do similar things. However, you are right that the daemon process uses the hdfs-site.xml under a different folder for Cloudera: /var/run/cloudera-scm-agent/process/90-hdfs-DATANODE/hdfs-site.xml.
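
For reference, here is a minimal sketch of checking whether that daemon-side hdfs-site.xml actually defines the property. It assumes the file follows the usual configuration/property/name/value layout, and it uses the JDK's XML parser rather than Hadoop's Configuration class so that it runs without the Hadoop jars on the classpath. The class name SiteXmlLookup and the lookup helper are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: looks up one property in a Hadoop-style site XML document.
public final class SiteXmlLookup {
    private SiteXmlLookup() {}

    /** Returns the first value defined for the property, or null if it is absent. */
    public static String lookup(String xmlText, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlText)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed entries
                }
                if (name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null; // property not defined in this file
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Default path is the one from this message; pass another path as the first argument.
        String file = args.length > 0
                ? args[0]
                : "/var/run/cloudera-scm-agent/process/90-hdfs-DATANODE/hdfs-site.xml";
        Path path = Paths.get(file);
        if (!Files.isReadable(path)) {
            System.out.println("Cannot read " + path);
            return;
        }
        String xml = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        System.out.println("dfs.datanode.data.dir = " + lookup(xml, "dfs.datanode.data.dir"));
        System.out.println("dfs.data.dir = " + lookup(xml, "dfs.data.dir"));
    }
}
```

This only reports what a single file says. Unlike Configuration, it does not merge defaults, apply later overrides, or expand variables, so a null result just means the property is not set in that particular file.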

About "retrieving the info from a live cluster": I would like to get information beyond what is in the configuration files (that is, beyond the .xml files). Since I am able to use conf = new Configuration() to connect to HDFS and perform other operations, shouldn't I also be able to retrieve the configuration variables?

Thanks,
Demai

On Mon, Sep 8, 2014 at 2:40 PM, Bhooshan Mogal <[email protected]> wrote:

> Hi Demai,
>
> When you read a property from the conf object, it will only have a value
> if the conf object contains that property.
>
> In your case, you created the conf object as new Configuration(), which
> adds core-default.xml and core-site.xml.
>
> Then you added site.xmls (hdfs-site.xml and core-site.xml) from specific
> locations. If none of these files have defined dfs.data.dir, then you will
> get NULL. This is expected behavior.
>
> What do you mean by retrieving the info from a live cluster? Even for
> processes like datanode, namenode etc., the source of truth for these
> properties is hdfs-site.xml. It is loaded from a specific location when you
> start these services.
>
> Question: Where are you running the above code? Is it on a node which has
> other hadoop daemons as well?
>
> My guess is that the path you are referring to
> (/etc/hadoop/conf.cloudera.hdfs/core-site.xml) is not the right path where
> these config properties are defined. Since this is a CDH cluster, you would
> probably be best served by asking on the CDH mailing list as to where the
> right path to these files is.
>
> HTH,
> Bhooshan
>
> On Mon, Sep 8, 2014 at 11:47 AM, Demai Ni <[email protected]> wrote:
>
>> Hi experts,
>>
>> I am trying to get the local filesystem directory of the data node. My
>> cluster is using CDH5.x (hadoop 2.3) and the default configuration, so the
>> datanode is under file:///dfs/dn. I didn't specify the value in
>> hdfs-site.xml.
>>
>> My code is something like:
>>
>> Configuration conf = new Configuration();
>>
>> // tested both with and without the following two lines
>> conf.addResource(new Path("/etc/hadoop/conf.cloudera.hdfs/hdfs-site.xml"));
>> conf.addResource(new Path("/etc/hadoop/conf.cloudera.hdfs/core-site.xml"));
>>
>> // I also tried get("dfs.datanode.data.dir"), which also returns null
>> String dnDir = conf.get("dfs.data.dir"); // returns null
>>
>> It looks like get() only looks at the configuration files instead of
>> retrieving the info from the live cluster?
>>
>> Many thanks for your help in advance.
>>
>> Demai
>
> --
> Bhooshan
