GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/8710

    [SPARK-10548] [SQL] Fix concurrent SQL executions

    The query execution ID is currently passed from a thread to its children,
which is not the intended behavior. This leads to an `IllegalArgumentException`
when running queries in parallel, e.g.:
    ```
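    // Assumes a spark-shell session where `sc` and `sqlContext` are defined.
    import sqlContext.implicits._  // provides toDF on RDDs
    (1 to 100).par.foreach { _ =>
      sc.parallelize(1 to 5).map { i => (i, i) }.toDF("a", "b").count()
    }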
    ```
    The cause is that `SparkContext`'s local properties are inherited by child
threads by default. This patch adds a way to exclude keys that should not be
inherited, and makes SQL go through that code path.
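
To make the inheritance behavior concrete, here is a minimal standalone sketch. It is not Spark's code: `InheritedPropertiesDemo`, `localProps`, and `valueSeenByChild` are hypothetical names, and the sketch only assumes that the local properties live in an `InheritableThreadLocal[Properties]` whose child value is a copy of the parent's.

```scala
import java.util.Properties

object InheritedPropertiesDemo {
  // Hypothetical stand-in for per-thread local properties. A child thread
  // receives a copy of the parent's properties when the Thread is constructed.
  val localProps = new InheritableThreadLocal[Properties] {
    override protected def childValue(parent: Properties): Properties = {
      val child = new Properties()
      val names = parent.stringPropertyNames().iterator()
      while (names.hasNext) {
        val key = names.next()
        child.setProperty(key, parent.getProperty(key))
      }
      child
    }
    override protected def initialValue(): Properties = new Properties()
  }

  // Spawns a child thread and returns the value it sees for `key`, if any.
  def valueSeenByChild(key: String): Option[String] = {
    var seen: Option[String] = None
    val t = new Thread(new Runnable {
      override def run(): Unit = {
        seen = Option(localProps.get().getProperty(key))
      }
    })
    t.start()
    t.join() // join() makes the child's write to `seen` visible here
    seen
  }
}
```

If the parent thread sets a key on `localProps` before calling `valueSeenByChild`, the child reports the same value. That is the kind of leak described above when the key is an execution ID that should be private to the parent.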

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark concurrent-sql-executions

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8710.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8710
    
----
commit 8ceae42fbb5c08a1e5d801b1bc4beceab9f03142
Author: Andrew Or <[email protected]>
Date:   2015-09-10T23:37:55Z

    Exclude certain local properties from being inherited
    
    such as, cough cough, the SQL execution ID. This was a problem
    because Scala's parallel collections spawn threads as children
    of the existing threads, causing the execution ID to be inherited
    when it shouldn't be.

commit 3ec715c4e5af5fa8d5e58c7aa93d01cd09970ae7
Author: Andrew Or <[email protected]>
Date:   2015-09-11T00:45:30Z

    Fix remove from Properties + add tests
    
    Because java.util.Properties' remove method takes in an Any
    instead of a String, there were some issues with matching the
    key's hashCode, so removing was not successful in unit tests.
    
    Instead, this commit fixes it by manually filtering out the keys
    and adding them to the child thread's properties.
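
The filtering approach in this commit message can be sketched roughly as follows. This illustrates the idea only and is not the code in the patch: `PropertyFiltering` and `inheritExcluding` are hypothetical names, and the key strings in any usage are illustrative.

```scala
import java.util.Properties

object PropertyFiltering {
  // Builds the child's properties by copying every string-keyed entry from
  // `parent` except those whose key is in `excluded`. Nothing is removed
  // from an existing Properties object, so Properties.remove is never used.
  def inheritExcluding(parent: Properties, excluded: Set[String]): Properties = {
    val child = new Properties()
    val names = parent.stringPropertyNames().iterator()
    while (names.hasNext) {
      val key = names.next()
      if (!excluded.contains(key)) {
        child.setProperty(key, parent.getProperty(key))
      }
    }
    child
  }
}
```

The parent's properties are left untouched. Only the copy handed to the child thread omits the excluded keys.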

----


