GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/21299

    [SPARK-24250][SQL] support accessing SQLConf inside tasks

    ## What changes were proposed in this pull request?
    
    Previously, in #20136, we decided to forbid tasks from accessing `SQLConf`, 
because it doesn't work there and always gives you the default conf value. In #21190, 
we fixed the check and all the places that violated it.
    
    Currently, the pattern for accessing configs on the executor side is: read 
the configs on the driver side, then reference the variables holding the config 
values in the RDD closure, so that they are serialized to the executor 
side. Something like:
    ```
    val someConf = conf.getXXX
    child.execute().mapPartitions {
      if (someConf == ...) ...
      ...
    }
    ```
    
    However, this pattern is hard to apply if the config needs to be propagated 
via a long call stack. One example is `DataType.sameType`; see how many 
changes were made in #21190.
    
    When it comes to code generation, it's even worse. I tried it locally, and 
we would need to change a ton of files to propagate configs to the code generators.
    
    This PR proposes to allow tasks to access `SQLConf`. The idea is that we can 
save all the SQL configs to job properties when an SQL execution is triggered. 
On the executor side, we rebuild the `SQLConf` from the job properties.
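    
    A minimal sketch of that round trip, without any Spark dependency. 
    `SketchConf`, `ConfPropagationSketch`, and the `spark.sql.` prefix filter are 
    hypothetical names and assumptions made for illustration; they are not the 
    classes or the logic in this patch.
    ```scala
    import java.util.Properties

    // Hypothetical stand-in for SQLConf: a read-only view over string settings.
    final case class SketchConf(settings: Map[String, String]) {
      def get(key: String, default: String): String = settings.getOrElse(key, default)
    }

    object ConfPropagationSketch {
      // Assumption: SQL configs are recognized by this key prefix.
      private val SqlPrefix = "spark.sql."

      // Driver side: copy the SQL config entries into the job properties.
      def toJobProperties(conf: SketchConf): Properties = {
        val props = new Properties()
        conf.settings.foreach { case (k, v) =>
          if (k.startsWith(SqlPrefix)) props.setProperty(k, v)
        }
        props
      }

      // Executor side: rebuild a conf from the properties that came with the job,
      // ignoring any unrelated job properties.
      def fromJobProperties(props: Properties): SketchConf = {
        val keys = props.stringPropertyNames().toArray(new Array[String](0))
        SketchConf(keys.filter(_.startsWith(SqlPrefix)).map(k => k -> props.getProperty(k)).toMap)
      }
    }
    ```
    
    With this shape, code deep in a call stack can read a config from the 
    rebuilt object instead of receiving it as an extra parameter. In Spark, 
    job-local properties can be set with `SparkContext.setLocalProperty` and read 
    in a task with `TaskContext.getLocalProperty`; the sketch only mimics that 
    transport with a plain `Properties` object.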
    
    This PR reverts #21190; please review the second commit.
    
    ## How was this patch tested?
    
    A new test suite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark config

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21299
    
----
commit e39b7d02746812e6c2eb3bc44ba3de37c12768d6
Author: Wenchen Fan <wenchen@...>
Date:   2018-05-11T08:31:25Z

    Revert "[SPARK-22938][SQL][FOLLOWUP] Assert that SQLConf.get is accessed 
only on the driver"
    
    This reverts commit a4206d58e05ab9ed6f01fee57e18dee65cbc4efc.

commit e2e4a52e38eea0abce033baccbcb37211aa8665e
Author: Wenchen Fan <wenchen@...>
Date:   2018-05-11T08:32:38Z

    support accessing SQLConf inside tasks

----

