Yes, two datanodes can coexist. You just have to change those config parameters. I believe you might also have to change hadoop.tmp.dir, which is where it stores a file to remember the PID of each process.
Matei

On May 6, 2012, at 9:16 PM, Nairan Zhang wrote:

> Thank you, Matei.
>
> Before I try, may I ask if having this configuration can allow two datanode
> processes co-existing in one slave machine? For example,
>
> one process name: datanode; pid: 20000 (belong to 1st hdfs)
> another process name: datanode (yes, again); pid: 30000 (belong to 2nd hdfs)
>
> Nairan
>
> 2012/5/6 Matei Zaharia <[email protected]>
>
>> Hi Nairan,
>>
>> HDFS doesn't normally run on top of Mesos, and we generally expect people
>> to have only one instance of HDFS, which multiple instances of MapReduce
>> (or other frameworks) would share. If you want two instances of HDFS, you
>> need to set them up manually, and configure them to use different ports.
>> Here are the Hadoop settings you need to change:
>>
>> fs.default.name (contains port of NameNode)
>> dfs.http.address (web UI of NameNode)
>> dfs.datanode.address
>> dfs.datanode.ipc.address
>> dfs.datanode.http.address
>> dfs.secondary.http.address
>> dfs.name.dir
>> dfs.data.dir
>>
>> We actually do this in our EC2 scripts (
>> https://github.com/mesos/mesos/wiki/EC2-Scripts), which will launch a
>> Mesos cluster with both a "persistent" and an "ephemeral" HDFS for you. You
>> might take a look at how that gets configured.
>>
>> Matei
>>
>> On May 6, 2012, at 7:04 PM, Nairan Zhang wrote:
>>
>>> Hi,
>>>
>>> It seems it disallows to have the second datanode in one machine. Is it a
>>> common problem? Anybody can please help me out a little bit? Thanks,
>>>
>>> Nairan
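For reference, a minimal sketch of what the second instance's overrides might look like, following the list of settings in the thread. All port numbers and directory paths here are made-up examples, not values from the thread; pick anything that doesn't collide with your first instance. `fs.default.name` goes in core-site.xml, the rest in hdfs-site.xml:

```xml
<!-- Hypothetical overrides for a SECOND HDFS instance on the same machines.
     Every port and path below is an illustrative assumption. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9010</value>   <!-- second NameNode RPC port (core-site.xml) -->
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50080</value>               <!-- second NameNode web UI -->
  </property>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50011</value>               <!-- second DataNode data transfer port -->
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50021</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50076</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>0.0.0.0:50091</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hdfs2/name</value>            <!-- must differ from the first instance's dirs -->
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hdfs2/data</value>
  </property>
</configuration>
```

The key point is simply that every port and every on-disk directory must be distinct from the first instance's, so the two daemon sets never contend for the same socket or storage.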
