[ https://issues.apache.org/jira/browse/SPARK-19259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-19259:
---------------------------------
    Labels: bulk-closed performance security  (was: performance security)

> spark locality in CNI context
> -----------------------------
>
>                 Key: SPARK-19259
>                 URL: https://issues.apache.org/jira/browse/SPARK-19259
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>         Environment: Mesos and all resource managers using the CNI model
> (Kubernetes, GKE, ECS...)
>            Reporter: Vincent gromakowski
>            Priority: Major
>              Labels: bulk-closed, performance, security
>
> When using the CNI deployment model, each executor gets its own IP/hostname,
> so Spark cannot schedule tasks locally based on the hostnames advertised by
> the backend. Currently, all backends that provide data locality with Spark
> use the same method: they advertise the topology as a list of hostnames.
> On one hand, data locality is mandatory for large-scale production jobs,
> because it can bring a huge performance improvement. On the other hand, CNI
> is clearly the future network model of all container deployments, providing
> easy service discovery, isolation and security policies. It would therefore
> be very interesting to combine these two features in Spark.
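
The mismatch described above can be illustrated with a minimal sketch. This is not Spark's scheduler code. The function `locality_level`, the `host_alias` map and all hostnames are hypothetical, and only two locality outcomes (`NODE_LOCAL` and `ANY`) are modelled. It assumes locality is decided by comparing the executor's hostname with the hostnames advertised by the storage backend.

```python
from typing import Dict, List, Optional


def locality_level(preferred_hosts: List[str],
                   executor_host: str,
                   host_alias: Optional[Dict[str, str]] = None) -> str:
    """Return "NODE_LOCAL" if the executor's host is a preferred host, else "ANY".

    host_alias is a hypothetical map from container hostname to the hostname
    of the physical node running that container. It is not an existing Spark
    setting and is used here only to illustrate the idea.
    """
    resolved = (host_alias or {}).get(executor_host, executor_host)
    return "NODE_LOCAL" if resolved in preferred_hosts else "ANY"


# Hostnames advertised by the storage backend for one partition (illustrative).
block_hosts = ["node-1.dc.example", "node-2.dc.example"]

# Without CNI, the executor shares the node's hostname, so the match succeeds.
without_cni = locality_level(block_hosts, "node-1.dc.example")

# With CNI, the executor has its own hostname, so the plain match fails.
with_cni = locality_level(block_hosts, "spark-exec-7.cni.example")

# With a hypothetical container-to-node mapping, the match can be recovered.
aliases = {"spark-exec-7.cni.example": "node-1.dc.example"}
with_alias = locality_level(block_hosts, "spark-exec-7.cni.example", aliases)

print(without_cni, with_cni, with_alias)
```

How such a container-to-node mapping would be obtained (from the resource manager, the CNI plugin, or elsewhere) is left open here, as the issue does not specify it.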



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
