Should I open a ticket to support data locality in an IP-per-container (CNI) context?
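
To make the question concrete, here is a minimal sketch of the mismatch as I understand it from the thread below. It is a toy model, not Spark's scheduler code: the function `locality_level`, the hostname `agent-1.example.com`, and the address `9.0.1.17` are all made up for illustration. The only idea taken from the thread is that a task is treated as local when the hostname reported by the executor matches a host where the data lives.

```python
# Toy model only: this is NOT Spark's scheduler code. It mimics the idea that
# a task counts as node-local when the executor's reported hostname equals
# one of the hosts that hold the data.

def locality_level(executor_host, preferred_hosts):
    """Return "NODE_LOCAL" on an exact hostname match, otherwise "ANY"."""
    return "NODE_LOCAL" if executor_host in preferred_hosts else "ANY"


# Hypothetical names: the Cassandra node advertises the Mesos agent hostname,
# while an executor inside a CNI container reports its own container address.
cassandra_replicas = {"agent-1.example.com"}

# Executor on the host network reports the agent hostname.
host_network_level = locality_level("agent-1.example.com", cassandra_replicas)

# Executor in a CNI container on the same agent reports a different name.
cni_level = locality_level("9.0.1.17", cassandra_replicas)

print(host_network_level)
print(cni_level)
```

If the real matching works roughly like this, the executor and the Cassandra node can share a Mesos agent and still never be considered local to each other, which is what the ticket would be about.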

2017-01-12 23:41 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:

> If the executor reports a different hostname inside the CNI container,
> then no, I don't think so.
>
> On Thu, Jan 12, 2017 at 2:28 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> So even if I make the Spark executors run on the same node as Cassandra
>> nodes, I am not sure each worker will connect to c* nodes on the same
>> Mesos agent?
>>
>> 2017-01-12 21:13 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:
>>
>>> The code in there w/ docs that reference CNI doesn't actually run when
>>> CNI is in effect, and doesn't have anything to do with locality. It's
>>> just making Spark work in a no-DNS environment.
>>>
>>> On Thu, Jan 12, 2017 at 12:04 PM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>>> I have found this but I am not sure how it can help...
>>>> https://github.com/mesosphere/spark-build/blob/a9efef8850976f787956660262f3b77cd636f3f5/conf/spark-env.sh
>>>>
>>>> 2017-01-12 20:16 GMT+01:00 Michael Gummelt <mgumm...@mesosphere.io>:
>>>>
>>>>> That's a good point. I hadn't considered the locality implications of
>>>>> CNI yet. I think tasks are placed based on the hostname reported by the
>>>>> executor, which in a CNI container will be different than the
>>>>> HDFS/Cassandra hostname. I'm not aware of anyone running Spark+CNI in
>>>>> prod yet, either.
>>>>>
>>>>> However, locality in Mesos isn't great right now anyway. Executors
>>>>> are placed w/o regard to locality. Locality is only taken into account
>>>>> when tasks are assigned to executors. So if you get a locality-poor
>>>>> executor placement, you'll also have locality-poor task placement. It
>>>>> could be better.
>>>>>
>>>>> On Thu, Jan 12, 2017 at 7:55 AM, vincent gromakowski <
>>>>> vincent.gromakow...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>> Does anyone have experience running Spark on Mesos with CNI (IP per
>>>>>> container)?
>>>>>> How would Spark use IP or hostname for data locality with a backend
>>>>>> framework like HDFS or Cassandra?
>>>>>>
>>>>>> V
>>>>>
>>>>> --
>>>>> Michael Gummelt
>>>>> Software Engineer
>>>>> Mesosphere