GitHub user dilipbiswal opened a pull request:
https://github.com/apache/spark/pull/18804
[SPARK-21599] Collecting column statistics for datasource tables may fail
with java.util.NoSuchElementException
## What changes were proposed in this pull request?
For data source tables that are stored in a non-Hive-compatible way, the
schema information is recorded as table properties in the Hive metastore.
The alterTableStats method needs to read the schema from those table
properties before recording the column-level statistics. Currently, we do
not get the correct schema information, and the call fails with
java.util.NoSuchElementException.
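The failure mode described above can be sketched outside Spark. The following is a minimal Python model, not the actual patch: the property keys, the placeholder metastore schema, and every function name are hypothetical and chosen only for illustration, and Python's KeyError stands in for java.util.NoSuchElementException.

```python
import json

# Hypothetical property keys, invented for this sketch only.
NUM_PARTS_KEY = "schema.numParts"
PART_PREFIX = "schema.part."


def schema_from_properties(props):
    """Reassemble a column-name -> type mapping stored as JSON chunks."""
    num_parts = int(props[NUM_PARTS_KEY])
    raw = "".join(props[PART_PREFIX + str(i)] for i in range(num_parts))
    return {field["name"]: field["type"] for field in json.loads(raw)}


def schema_for_stats(metastore_schema, props, is_datasource_table):
    """Choose the schema that column statistics are resolved against."""
    if is_datasource_table and NUM_PARTS_KEY in props:
        return schema_from_properties(props)
    return dict(metastore_schema)


def resolve_types(schema, columns):
    """Look up each column's type; raises KeyError for an unknown column."""
    return {name: schema[name] for name in columns}


# Assumption: for a non-Hive-compatible table the metastore-level schema is
# only a placeholder, and the real schema lives in the table properties.
placeholder_schema = {"col": "array<string>"}
real_schema_json = json.dumps(
    [{"name": "id", "type": "int"}, {"name": "name", "type": "string"}]
)
mid = len(real_schema_json) // 2
table_props = {
    NUM_PARTS_KEY: "2",
    PART_PREFIX + "0": real_schema_json[:mid],
    PART_PREFIX + "1": real_schema_json[mid:],
}
stats_columns = ["id", "name"]

# Behaviour before the fix: the lookup runs against the placeholder schema.
try:
    resolve_types(placeholder_schema, stats_columns)
    buggy_outcome = "ok"
except KeyError as err:
    buggy_outcome = "missing column: " + err.args[0]

# Behaviour after the fix: the schema is restored from the properties first.
fixed_types = resolve_types(
    schema_for_stats(placeholder_schema, table_props, True), stats_columns
)

print(buggy_outcome)
print(fixed_types)
```

In this model the only difference between the failing and the working path is which schema the column lookup runs against, which is the change the description above attributes to alterTableStats.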
## How was this patch tested?
A new test case is added in StatisticsSuite.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dilipbiswal/spark datasource_stats
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/18804.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #18804
----
commit 0afefd5dde2ddbe03ded3f0e85c21b5bc65040b3
Author: Dilip Biswal <[email protected]>
Date: 2017-07-27T22:41:53Z
[SPARK-21599] Collecting column statistics for datasource tables may fail
with java.util.NoSuchElementException
----