Thanks for pointing this out, Michael.  Based on the conversation on the PR
<https://github.com/apache/spark/pull/16944#issuecomment-285529275> this
seems like a risky change to include in a release branch with a default
other than NEVER_INFER.
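
For anyone who wants to try the workaround Michael describes below, a
minimal sketch of pinning the setting in conf/spark-defaults.conf. This
assumes the key is picked up from that file like other spark.sql.* settings,
which has not been verified against this RC:

```properties
# Assumption: disables the schema inference pass that was scanning all
# table files during query analysis, per Michael's report below.
spark.sql.hive.caseSensitiveInferenceMode  NEVER_INFER
```

The same key/value pair should also be accepted per job via
--conf spark.sql.hive.caseSensitiveInferenceMode=NEVER_INFER on spark-submit.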

+Wenchen?  What do you think?

On Thu, Apr 20, 2017 at 4:14 PM, Michael Allman <mich...@videoamp.com>
wrote:

> We've identified the cause of the change in behavior. It is related to the
> SQL conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and
> its related functionality were absent from our previous build. The default
> setting in the current build was causing Spark to attempt to scan all table
> files during query analysis. Changing this setting to NEVER_INFER disabled
> this operation and resolved the issue we had.
>
> Michael
>
>
> On Apr 20, 2017, at 3:42 PM, Michael Allman <mich...@videoamp.com> wrote:
>
> I want to caution that in testing a build from this morning's branch-2.1
> we found that Hive partition pruning was not working. We found that Spark
> SQL was fetching all Hive table partitions for a very simple query whereas
> in a build from several weeks ago it was fetching only the required
> partitions. I cannot currently think of a reason for the regression outside
> of some difference between branch-2.1 from our previous build and
> branch-2.1 from this morning.
>
> That's all I know right now. We are actively investigating to find the
> root cause of this problem, and specifically whether this is a problem in
> the Spark codebase or not. I will report back when I have an answer to that
> question.
>
> Michael
>
>
> On Apr 18, 2017, at 11:59 AM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
> Please vote on releasing the following candidate as Apache Spark version
> 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and
> passes if a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.1.1
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.1.1-rc3
> <https://github.com/apache/spark/tree/v2.1.1-rc3>
> (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>
> List of JIRA tickets resolved can be found with this filter
> <https://issues.apache.org/jira/browse/SPARK-20134?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.1>.
>
> The release files, including signatures, digests, etc. can be found at:
> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1230/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/
>
>
> *FAQ*
>
> *How can I help test this release?*
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running it on this release candidate, then
> reporting any regressions.
>
> *What should happen to JIRA tickets still targeting 2.1.1?*
>
> Committers should look at those and triage. Extremely important bug fixes,
> documentation, and API tweaks that impact compatibility should be worked on
> immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>
> *But my bug isn't fixed!??!*
>
> In order to make timely releases, we will typically not hold the release
> unless the bug in question is a regression from 2.1.0.
>
> *What happened to RC1?*
>
> There were issues with the release packaging, and as a result it was skipped.
>
>
>
>
