GitHub user ericl opened a pull request:
https://github.com/apache/spark/pull/15725
[SPARK-18167] Print out spark confs, and hive confs when SQLQuerySuite fails
## What changes were proposed in this pull request?
The proximate cause of the test failures appears to be that `cast(str as
decimal)` in Derby raises an exception instead of returning NULL. This is a
problem because Hive sometimes inserts `__HIVE_DEFAULT_PARTITION__` entries into
the partition table, as documented here:
https://github.com/apache/hive/blob/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1034
When these special default partitions are present, partition pruning
pushdown using the SQL-direct mode fails due to this cast exception.
As the comments in `MetaStoreDirectSql.java` above note, this is normally fine
because Hive falls back to JDO pruning. However, when the pruning predicate
contains an unsupported operator such as `>`, the fallback fails as well.
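The failure chain described above can be illustrated with a toy model. This is only a sketch in plain Python, not Hive or Derby code: `strict_cast`, `lenient_cast`, `prune_direct_sql`, `prune_jdo`, and `prune` are hypothetical names, and the assumption that the fallback path supports only `=` is a simplification made for illustration.

```python
from decimal import Decimal, InvalidOperation

DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"


def strict_cast(value):
    """Raise on non-numeric strings (stands in for the Derby behavior)."""
    return Decimal(value)  # raises InvalidOperation for non-numeric input


def lenient_cast(value):
    """Return None on non-numeric strings (stands in for a NULL result)."""
    try:
        return Decimal(value)
    except InvalidOperation:
        return None


def prune_direct_sql(values, op, threshold, cast):
    """Toy SQL-direct pruning: casts every partition value before comparing."""
    kept = []
    for value in values:
        number = cast(value)
        if number is None:
            continue  # NULL never satisfies the predicate
        if (op == "=" and number == threshold) or (op == ">" and number > threshold):
            kept.append(value)
    return kept


def prune_jdo(values, op, threshold):
    """Toy fallback: assumed here to support equality only."""
    if op != "=":
        raise NotImplementedError("operator %r is not supported by the fallback" % op)
    return [value for value in values if value == str(threshold)]


def prune(values, op, threshold, cast):
    """Try the direct path first; fall back when the cast raises."""
    try:
        return prune_direct_sql(values, op, threshold, cast)
    except InvalidOperation:
        return prune_jdo(values, op, threshold)
```

In this model, a strict cast combined with a default partition and a `>` predicate is the only combination that fails outright; swapping in `lenient_cast` or using `=` lets pruning complete.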
The only remaining question is why this behavior is nondeterministic. We
know that when the test flakes, retries do not help, so the cause must
be environmental. The current best hypothesis is that some config differs
between Jenkins runs, which is why this PR prints out the Spark SQL
and Hive confs for the test. The hope is that by comparing the config state of
a failing run against a passing run, we can isolate the root cause of the flakiness.
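Once the confs from a passing and a failing run are available, the comparison could be done along these lines. This is a minimal sketch, not part of the patch: it assumes each dump has been reduced to one `key=value` pair per line, which may not match the format the test actually prints, and `parse_conf_dump` and `diff_confs` are hypothetical helper names.

```python
def parse_conf_dump(text):
    """Parse 'key=value' lines into a dict (assumed dump format)."""
    confs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not a pair
        key, value = line.split("=", 1)
        confs[key.strip()] = value.strip()
    return confs


def diff_confs(passing, failing):
    """Return {key: (passing_value, failing_value)} for entries that differ.

    A key present on only one side is reported with None on the other.
    """
    keys = sorted(set(passing) | set(failing))
    return {
        key: (passing.get(key), failing.get(key))
        for key in keys
        if passing.get(key) != failing.get(key)
    }


if __name__ == "__main__":
    # Placeholder keys and values, made up for illustration only.
    good = parse_conf_dump("example.conf.a=1\nexample.conf.b=true\n")
    bad = parse_conf_dump("example.conf.a=1\nexample.conf.b=false\nexample.conf.c=x\n")
    for key, (ok_value, bad_value) in diff_confs(good, bad).items():
        print(key, "passing:", ok_value, "failing:", bad_value)
```

Only the keys that differ are reported, which keeps the output short even when each run dumps a large number of confs.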
## How was this patch tested?
I verified the confs are printed out in a readable form locally.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ericl/spark print-confs-out
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15725.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15725
----
commit 88adf1d41bce4b2eb1e8c255de3008800f6913a7
Author: Eric Liang <[email protected]>
Date: 2016-11-02T01:38:57Z
Tue Nov 1 18:38:57 PDT 2016
----