[
https://issues.apache.org/jira/browse/SPARK-19259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-19259:
---------------------------------
Labels: bulk-closed performance security (was: performance security)
> spark locality in CNI context
> -----------------------------
>
> Key: SPARK-19259
> URL: https://issues.apache.org/jira/browse/SPARK-19259
> Project: Spark
> Issue Type: Improvement
> Components: Scheduler
> Environment: Mesos and all resource managers using the CNI model
> (Kubernetes, GKE, ECS...)
> Reporter: Vincent gromakowski
> Priority: Major
> Labels: bulk-closed, performance, security
>
> When using the CNI deployment model, each executor gets its own IP/hostname, so
> Spark cannot schedule tasks locally based on the hostnames advertised by the
> backend. Currently, all backends that provide data locality with Spark use the
> same method: they advertise the topology as a list of hostnames.
> On the one hand, data locality is mandatory for large-scale production jobs, as
> it can bring a huge performance improvement. On the other hand, CNI is clearly
> the future network model for all container deployments, providing easy service
> discovery, isolation, and security policies. It would therefore be very
> interesting to combine these two features in Spark.
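
The mismatch described in the issue can be illustrated with a small sketch. This is not Spark's scheduler code; it is a simplified model that assumes locality is decided purely by comparing hostname strings, as the description states. The `Executor` class, the `locality_level` function, and all hostnames are hypothetical and exist only for illustration. `NODE_LOCAL` and `ANY` are borrowed from Spark's locality level names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Executor:
    """Hypothetical stand-in for an executor registered with the driver."""
    executor_id: str
    hostname: str  # hostname the executor reports when it registers


def locality_level(preferred_hosts, executor):
    """Return "NODE_LOCAL" if the executor's hostname is among the hosts
    advertised for the data, otherwise "ANY". Pure string comparison."""
    return "NODE_LOCAL" if executor.hostname in preferred_hosts else "ANY"


# Hosts advertised by a hypothetical storage backend for one partition.
preferred_hosts = {"node-1.example.internal", "node-2.example.internal"}

# Host networking: the executor reports the physical node's hostname.
host_net_executor = Executor("exec-1", "node-1.example.internal")

# CNI networking: the executor may run on node-1, but it reports its own
# per-container hostname, which the backend never advertises.
cni_executor = Executor("exec-2", "spark-exec-2.cni.example.internal")

host_net_level = locality_level(preferred_hosts, host_net_executor)
cni_level = locality_level(preferred_hosts, cni_executor)
print(host_net_level, cni_level)
```

Under these assumptions, the executor on host networking matches an advertised host, while the CNI executor never does, even if it sits on the same physical node as the data.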