GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/3633

    [SPARK-4759] Avoid using empty string as default preferred location

    See JIRA for reproduction.
    
    Our use of an empty string as the default preferred location in
    `CoalescedRDDPartition` causes the `TaskSetManager` (TSM) to schedule the
    corresponding task on host `""` (empty string). The intended semantics
    here, however, is that the partition does not have a preferred location,
    and the TSM should schedule the corresponding task accordingly.
    
    I tested this on master and 1.1.
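
To illustrate the distinction described above, here is a minimal sketch of how an empty-string default differs from an explicit "no preference" value. The names `PartitionBefore`, `PartitionAfter`, `locationsBefore`, and `locationsAfter` are hypothetical stand-ins invented for this example; they are not Spark's actual classes, and the use of `Option[String]` is an assumption about one reasonable way to model a missing location, not a description of the patch itself.

```scala
// Hypothetical, simplified stand-ins for illustration only; not Spark's classes.

// Models the problematic shape: "no preference" is encoded as "".
final case class PartitionBefore(index: Int, preferredLocation: String = "")

// Models one possible alternative: "no preference" is encoded as None.
final case class PartitionAfter(index: Int, preferredLocation: Option[String] = None)

object PreferredLocationSketch {
  // With a String default, the empty string is reported as if it were a host.
  def locationsBefore(p: PartitionBefore): Seq[String] =
    List(p.preferredLocation)

  // With an Option, a missing location yields an empty sequence, so a
  // scheduler consuming this list sees no preferred host at all.
  def locationsAfter(p: PartitionAfter): Seq[String] =
    p.preferredLocation.toList

  def main(args: Array[String]): Unit = {
    println(locationsBefore(PartitionBefore(0)).size)
    println(locationsAfter(PartitionAfter(0)).size)
    println(locationsAfter(PartitionAfter(1, Some("host-a"))).size)
  }
}
```

In this sketch, `locationsBefore` on a default partition returns a one-element list containing `""`, which a scheduler would treat as a real host name, whereas `locationsAfter` returns an empty list.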

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark coalesce-preferred-loc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3633
    
----
commit 2f7dfb603c000a204831748f1fbaa53ef52531c8
Author: Andrew Or <[email protected]>
Date:   2014-12-08T07:53:15Z

    Avoid using empty string as default preferred location
    
    This is causing the TaskSetManager to try to schedule certain
    tasks on the host "" (empty string). The intended semantics here,
    however, is that the partition does not have a preferred location,
    and the TSM should schedule the corresponding task accordingly.

----


