[ https://issues.apache.org/jira/browse/SPARK-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574983#comment-14574983 ]
Russell Alexander Spitzer commented on SPARK-6987:
--------------------------------------------------
Or what about being able to specify an identifier for each Spark worker that
isn't dependent on its IP?
> Node Locality is determined with String Matching instead of Inet Comparison
> ---------------------------------------------------------------------------
>
> Key: SPARK-6987
> URL: https://issues.apache.org/jira/browse/SPARK-6987
> Project: Spark
> Issue Type: Bug
> Components: Scheduler, Spark Core
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Russell Alexander Spitzer
>
> When determining whether or not a task can be run NodeLocal the
> TaskSetManager ends up using a direct string comparison between the
> preferredIp and the executor's bound interface.
> https://github.com/apache/spark/blob/c84d91692aa25c01882bcc3f9fd5de3cfa786195/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L878-L880
> https://github.com/apache/spark/blob/c84d91692aa25c01882bcc3f9fd5de3cfa786195/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L488-L490
> This means that the preferredIp must be an exact string match of the IP that
> the worker is bound to. As a result, APIs which gather data from
> other distributed sources must develop their own mapping between the
> interfaces bound (or exposed) by the external sources and the interface bound
> by the Spark executor, since these may be different.
> For example, Cassandra exposes a broadcast RPC address which doesn't have to
> match the address which the service is bound to. This means that when adding
> preferredLocation data we must add both the RPC and the listen address to
> ensure that we can get a string match (and of course we are out of luck if
> Spark has been bound to another interface).
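
To illustrate the difference described in the issue, here is a minimal sketch in Python (not Spark's actual code). All function names below are hypothetical. It contrasts plain string equality with a comparison of parsed IP literals, and shows the "add both addresses" workaround mentioned for Cassandra. Note that parsing only helps when two strings spell the same address differently; it does not help when a machine is reachable on two genuinely different interfaces, which is why the workaround is still needed.

```python
import ipaddress
from typing import List


def string_match(preferred: str, executor_host: str) -> bool:
    """Plain string equality, the behaviour the issue describes."""
    return preferred == executor_host


def address_match(preferred: str, executor_host: str) -> bool:
    """Hypothetical alternative: compare parsed IP addresses, not their spelling."""
    try:
        return ipaddress.ip_address(preferred) == ipaddress.ip_address(executor_host)
    except ValueError:
        # At least one value is not an IP literal (e.g. a hostname):
        # fall back to string comparison in this sketch.
        return preferred == executor_host


def candidate_locations(rpc_address: str, listen_address: str) -> List[str]:
    """Workaround from the issue: advertise both addresses as preferred
    locations so that at least one can string-match the executor's host.
    Duplicates are dropped while preserving order."""
    return list(dict.fromkeys([rpc_address, listen_address]))


if __name__ == "__main__":
    # Two spellings of the IPv6 loopback address.
    short_form, long_form = "::1", "0:0:0:0:0:0:0:1"
    print(string_match(short_form, long_form))
    print(address_match(short_form, long_form))
    # Assumed example addresses for a node with separate RPC and listen interfaces.
    print(candidate_locations("10.0.0.5", "192.168.1.5"))
```

The design choice in the sketch is to normalise both sides before comparing, so that the caller no longer has to guess which textual form the executor registered with.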