GitHub user jayadevanmurali opened a pull request:
https://github.com/apache/spark/pull/16635
[SPARK-19059] [SQL] Unable to retrieve data from parquet table whose name
starts with an underscore
## What changes were proposed in this pull request?
The initial shouldFilterOut() invocation filters on the root path name (the
table name in the initial call) and drops the path if that name starts with an
underscore. I moved the check one level down, so the code first lists the
files and directories under the given root path and then applies the filter
to them.
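
The change described above can be illustrated with a minimal, self-contained sketch. It is not the code in PartitioningAwareFileIndex.scala: the in-memory `Node` model, the `LeafFileListing` object, and the simplified `shouldFilterOut` predicate are all assumptions made for illustration, and the real predicate in Spark likely has additional cases.

```scala
// Hypothetical in-memory file tree; a stand-in for what a real file system would return.
sealed trait Node { def name: String }
final case class FileNode(name: String) extends Node
final case class DirNode(name: String, children: List[Node]) extends Node

object LeafFileListing {
  // Simplified assumption: hide names that start with "_" or ".".
  def shouldFilterOut(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  // Behaviour before the change, as described: the root name itself is
  // checked, so a table directory named "_table" yields no files at all.
  def listLeafFilesOld(root: DirNode): List[String] =
    if (shouldFilterOut(root.name)) Nil else listChildren(root, root.name)

  // Behaviour after the change: the root is always listed, and the filter
  // is applied only to the entries found underneath it.
  def listLeafFilesNew(root: DirNode): List[String] =
    listChildren(root, root.name)

  // Lists leaf files under `dir`, skipping filtered children at every level.
  private def listChildren(dir: DirNode, prefix: String): List[String] =
    dir.children.filterNot(c => shouldFilterOut(c.name)).flatMap {
      case FileNode(n) => List(s"$prefix/$n")
      case d: DirNode  => listChildren(d, s"$prefix/${d.name}")
    }
}
```

With a root directory named `_table` that holds one data file and one `_SUCCESS` marker, the old variant returns nothing, while the new variant returns the data file and still hides the marker.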
## How was this patch tested?
Added a new test case for this scenario.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jayadevanmurali/spark branch-0.1-SPARK-19059
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16635.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16635
----
commit 290b86ba76e8f6032824c95079748670c2257db0
Author: jayadevanmurali <[email protected]>
Date: 2016-01-31T02:28:52Z
Merge pull request #1 from apache/master
Update from original
commit 9a7f87f7bd4b75b5149e826a7a5868f5549132cf
Author: jayadevan <[email protected]>
Date: 2016-05-10T04:04:32Z
Merge remote-tracking branch 'upstream/master'
commit d8ed5862f887ced075bf73f58e1f9c13ff340527
Author: jayadevan <[email protected]>
Date: 2016-10-24T17:32:19Z
Merge remote-tracking branch 'upstream/master'
commit cffbec2d6c044a7fb568dcb19b28b07158b870ea
Author: jayadevan <[email protected]>
Date: 2016-11-26T07:30:36Z
Merge remote-tracking branch 'upstream/master'
commit e6727500925be983bf7c345f1e3e6ad6756f19cb
Author: jayadevan <[email protected]>
Date: 2017-01-11T15:43:52Z
Merge remote-tracking branch 'upstream/master'
commit f5999b9577aaaab0d2ca27c699dfc7a1662933dd
Author: jayadevanmurali <[email protected]>
Date: 2017-01-18T18:25:15Z
Added test case to handle SPARK-19059
commit 74e4a1a2740262fcf84e8c0704ddf2df57e614a0
Author: jayadevanmurali <[email protected]>
Date: 2017-01-18T18:38:36Z
Updated listLeafFiles() handile SPARK-19059
commit dec96d848948d6c5f263b1e5535345d24c1058d3
Author: jayadevanmurali <[email protected]>
Date: 2017-01-18T18:52:05Z
Update PartitioningAwareFileIndex.scala
commit 4334b2ba2a882a94d59459a3d73b5fdb6fda6b80
Author: jayadevanmurali <[email protected]>
Date: 2017-01-18T18:58:22Z
Update PartitioningAwareFileIndex.scala
----