Yes, two datanodes can coexist. You just have to change those config parameters. I believe you might also have to change hadoop.tmp.dir, which is where it stores a file to remember the PID of each process.
Matei

On May 6, 2012, at 9:16 PM, Nairan Zhang wrote:

> Thank you, Matei.
>
> Before I try, may I ask if having this configuration can allow two datanode
> processes co-existing in one slave machine? For example,
>
> one process name: datanode; pid: 20000 (belong to 1st hdfs)
> another process name: datanode (yes, again); pid: 30000 (belong to 2nd hdfs)
>
> Nairan
>
> 2012/5/6 Matei Zaharia <[email protected]>
>
>> Hi Nairan,
>>
>> HDFS doesn't normally run on top of Mesos, and we generally expect people
>> to have only one instance of HDFS, which multiple instances of MapReduce
>> (or other frameworks) would share. If you want two instances of HDFS, you
>> need to set them up manually, and configure them to use different ports.
>> Here are the Hadoop settings you need to change:
>>
>> fs.default.name (contains port of NameNode)
>> dfs.http.address (web UI of NameNode)
>> dfs.datanode.address
>> dfs.datanode.ipc.address
>> dfs.datanode.http.address
>> dfs.secondary.http.address
>> dfs.name.dir
>> dfs.data.dir
>>
>> We actually do this in our EC2 scripts (
>> https://github.com/mesos/mesos/wiki/EC2-Scripts), which will launch a
>> Mesos cluster with both a "persistent" and an "ephemeral" HDFS for you. You
>> might take a look at how that gets configured.
>>
>> Matei
>>
>> On May 6, 2012, at 7:04 PM, Nairan Zhang wrote:
>>
>>> Hi,
>>>
>>> It seems it disallows to have the second datanode in one machine. Is it a
>>> common problem? Anybody can please help me out a little bit? Thanks,
>>>
>>> Nairan
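For reference, a minimal sketch of what the second instance's overrides might look like, following the list of settings in the thread. All port numbers and directory paths here are made-up examples, not values from the thread; pick anything that doesn't collide with your first instance. `fs.default.name` goes in core-site.xml, the rest in hdfs-site.xml:

```xml
<!-- Hypothetical overrides for a SECOND HDFS instance on the same machines.
     Every port and path below is an illustrative assumption. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9010</value>   <!-- second NameNode RPC port (core-site.xml) -->
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50080</value>               <!-- second NameNode web UI -->
  </property>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50011</value>               <!-- second DataNode data transfer port -->
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50021</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50076</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>0.0.0.0:50091</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hdfs2/name</value>            <!-- must differ from the first instance's dirs -->
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hdfs2/data</value>
  </property>
</configuration>
```

The key point is simply that every port and every on-disk directory must be distinct from the first instance's, so the two daemon sets never contend for the same socket or storage.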
