Thank you very much! It is much clearer to me now…

From: haosdent [mailto:[email protected]]
Sent: Tuesday, June 23, 2015 16:10
To: [email protected]
Subject: Re: mesos be aware of hdfs location using spark
> - Is there a way to ensure that there is always a Spark and an HDFS
> instance on the same Mesos worker?

I don't think Mesos can guarantee this right now.

> If they are on the same Mesos worker, the two services would probably not
> know they could talk to each other locally?

Spark uses the HDFS client library to connect to HDFS. The HDFS client will detect the fastest way to read the data, provided you configure your HDFS cluster correctly.

> Is it the way most people are using HDFS in Mesos, or are there any
> best practices to attach the same storage when the instance moves?

I think this is not really related to Mesos. For an HDFS cluster with a replication factor of 3, after you decommission a datanode some blocks will temporarily drop to 2 replicas. The cluster still works as normal, because the client will retry other datanodes to get the data. In the background, though, HDFS will do some internal network copying to bring the replica count back up to 3. There may be some glitches during this time, but I don't think you need to worry too much.

On Tue, Jun 23, 2015 at 9:21 PM, Sebastien Brennion <[email protected]> wrote:

Thank you for your answers… I'm new to both… Sorry, sent too quickly… I'm not sure I understand your answers, so let me try to reformulate my question. If HDFS is in Mesos, and Spark too:

1. Is there a way to ensure that there is always a Spark and an HDFS instance on the same Mesos worker? And if they are on the same Mesos worker, would the two services even know they can talk to each other locally?
2. What I'm also not sure about is how to handle the storage if HDFS is running in Mesos. What happens when hdfs_1 is moved from mesos_worker_1 to mesos_worker_2: does all the data have to be copied? How are people handling this?
   -> Unless you already have a data replica on the new HDFS datanode, HDFS will copy the blocks from other existing datanodes.
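The re-replication behaviour described above can be sketched in a few lines. This is a hypothetical illustration, not HDFS internals code: the block IDs, node names, and helper functions are all made up, but the bookkeeping mirrors what the namenode does when a datanode with replicas of every block is decommissioned.

```python
# Hypothetical sketch of HDFS re-replication after decommissioning a datanode.
# Block IDs and datanode names are invented for illustration only.

REPLICATION_FACTOR = 3

# block id -> set of datanodes currently holding a replica
blocks = {
    "blk_001": {"dn1", "dn2", "dn3"},
    "blk_002": {"dn2", "dn3", "dn4"},
    "blk_003": {"dn1", "dn3", "dn4"},
}
live_datanodes = {"dn1", "dn2", "dn3", "dn4"}

def decommission(node):
    """Take a datanode out of service; its replicas no longer count."""
    live_datanodes.discard(node)
    for holders in blocks.values():
        holders.discard(node)

def under_replicated():
    """Blocks that currently have fewer replicas than the target."""
    return [b for b, h in blocks.items() if len(h) < REPLICATION_FACTOR]

def re_replicate():
    """Copy each under-replicated block onto other live datanodes.
    In real HDFS this is a datanode-to-datanode network copy."""
    for holders in blocks.values():
        candidates = sorted(live_datanodes - holders)
        while len(holders) < REPLICATION_FACTOR and candidates:
            holders.add(candidates.pop(0))

decommission("dn3")
print(under_replicated())  # every block had a replica on dn3, so all are listed
re_replicate()
print(under_replicated())  # empty: back to the target replica count
```

The point of the sketch is the one haosdent makes: reads keep working while blocks are under-replicated (clients just pick another holder), and the copying is internal housekeeping to restore the replication factor.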
Do you mean that if the DataNode instance moves, it will copy all the data? Is that the way most people are using HDFS in Mesos, or are there any best practices to attach the same storage when the instance moves?

From: haosdent [mailto:[email protected]]
Sent: Tuesday, June 23, 2015 15:03
To: [email protected]
Subject: Re: mesos be aware of hdfs location using spark

By the way, I think your problems are related to HDFS, not to Mesos. You could send them to the HDFS user mailing list.

On Tue, Jun 23, 2015 at 9:01 PM, haosdent <[email protected]> wrote:

And for this question of yours:

> on instances that also contain the hdfs service, to prevent all data going
> over the network?

If you enable HDFS Short-Circuit Local Reads, HDFS will automatically read from the local machine instead of over the network whenever the data exists locally.

On Tue, Jun 23, 2015 at 8:58 PM, haosdent <[email protected]> wrote:

For your second question: unless you already have a data replica on the new HDFS datanode, HDFS will copy the blocks from other existing datanodes.

On Tue, Jun 23, 2015 at 7:51 PM, Sebastien Brennion <[email protected]> wrote:

Hi,

- I would like to know if there is a way to make Mesos dispatch Spark jobs preferentially on instances that also contain the HDFS service, to prevent all the data from going over the network?
- What I'm also not sure about is how to handle the storage if HDFS is running in Mesos. What happens when hdfs_1 is moved from mesos_worker_1 to mesos_worker_2: does all the data have to be copied? How are people handling this?

Regards
Sébastien

--
Best Regards,
Haosdent Huang
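For reference, the Short-Circuit Local Reads feature mentioned above is enabled in hdfs-site.xml. A minimal sketch, using the property names from the HDFS short-circuit reads documentation; the domain socket path is only an example and must point to a directory the datanode user can write to:

```xml
<!-- hdfs-site.xml fragment: enable short-circuit local reads -->
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- Unix domain socket shared between the DataNode and local clients -->
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
```

With this in place, a client (such as a Spark executor) running on the same machine as a datanode bypasses the TCP path and reads block files via the local socket, which is what makes colocating Spark and HDFS worthwhile.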

