2015-06-23 21:21 GMT+08:00 Sebastien Brennion <[email protected]> :
> Thank you for your answers… I'm new to both…
>
> Sorry, sent too quickly…
>
> I'm not sure I understand your answers… I should probably try to
> reformulate my question…
>
> If HDFS is in Mesos, and Spark too:
>
> 1. Is there a way to ensure that there is always a Spark and an HDFS
> instance on the same Mesos worker?

AFAIK, under the current 0.22:
1. Configure HDFS on Mesos slaves A, B, C.
2. Set "attributes" (see mesos-slave --help) to mark A, B, C as HDFS
   nodes (or set a "role" for a finer-grained constraint).
3. Constrain the Spark driver/framework to run jobs on A, B, C only.
   (I didn't test this step, but maybe.)

> Even if they are on the same Mesos worker, the two services would
> probably not know they could talk to each other locally?

Mesos wouldn't be responsible for that. It's a Spark & HDFS issue.

> 2. What I'm also not sure about is how to handle the storage if HDFS is
> running in Mesos. What happens when hdfs_1 is moved from mesos_worker_1
> to mesos_worker_2, do all data have to be copied? How are you handling
> this?
>
> -> I think unless you already have a data replica on the new HDFS
> datanode, HDFS would copy the blocks from other existing datanodes.
>
> Do you mean that if the DataNode instance moves, it would copy all the
> data? Is this the way most people are using HDFS on Mesos, or are there
> any best practices to attach the same storage when the instance moves?
>
> From: haosdent [mailto:[email protected]]
> Sent: Tuesday, June 23, 2015 15:03
> To: [email protected]
> Subject: Re: mesos be aware of hdfs location using spark
>
> By the way, I think your problems are related to HDFS, not to Mesos.
> You could send them to the HDFS user mailing list.
>
> On Tue, Jun 23, 2015 at 9:01 PM, haosdent <[email protected]> wrote:
>
> And for this question of yours:
>
> > on instances that also contain the HDFS service, to prevent all data
> > going over the network?
>
> If you enable HDFS Short-Circuit Local Reads, HDFS will automatically
> read from the local machine instead of over the network when the data
> exists locally.
>
> On Tue, Jun 23, 2015 at 8:58 PM, haosdent <[email protected]> wrote:
>
> For your second question: I think unless you already have a data
> replica on the new HDFS datanode, HDFS would copy the blocks from other
> existing datanodes.
>
> On Tue, Jun 23, 2015 at 7:51 PM, Sebastien Brennion
> <[email protected]> wrote:
>
> Hi,
>
> - I would like to know if there is a way to make Mesos dispatch Spark
> jobs, in priority, on instances that also contain the HDFS service, to
> prevent all data going over the network?
>
> - What I'm also not sure about is how to handle the storage if HDFS is
> running in Mesos. What happens when hdfs_1 is moved from mesos_worker_1
> to mesos_worker_2, do all data have to be copied? How are you handling
> this?
>
> Regards
> Sébastien
>
> --
> Best Regards,
> Haosdent Huang
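The Short-Circuit Local Reads feature mentioned above is enabled via `hdfs-site.xml` on both the DataNode and the client side; the socket path below is an example, any path the DataNode user can create works:

```xml
<!-- hdfs-site.xml, on both the DataNode and the HDFS client side -->
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- UNIX domain socket the DataNode and client use to exchange
         file descriptors; the path here is an example -->
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
```

With this in place, a Spark executor co-located with the DataNode that holds a block reads it through a local file descriptor instead of over TCP, which is exactly the "talk to each other locally" behavior asked about in the thread.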
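The attribute-based pinning sketched in the thread could look roughly like this. The attribute name "hdfs:true", the ZooKeeper address, and the job name are placeholders, and note that `spark.mesos.constraints` only appeared in Spark 1.5, so on the Spark/Mesos 0.22-era versions discussed here this exact knob may not exist:

```shell
# A sketch, not a tested setup. On each Mesos slave that also runs an
# HDFS DataNode (A, B, C), tag the slave with a custom attribute
# (the name "hdfs:true" is our own choice):
mesos-slave --master=zk://zk1:2181/mesos \
            --attributes="hdfs:true"

# Then ask Spark to accept only resource offers from slaves carrying
# that attribute (spark.mesos.constraints requires Spark 1.5+):
spark-submit --master mesos://zk://zk1:2181/mesos \
             --conf spark.mesos.constraints="hdfs:true" \
             my_job.py
```

This only restricts where Spark tasks land; it does not by itself make Spark prefer the DataNode holding a given block, which is the locality question the thread hands off to HDFS.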

