GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/10604
[SPARK-12649][SQL][WIP] support reading bucketed table
TODO:
* better integration with the data source API
* correctly populate outputPartitioning/outputOrdering
* bucket pruning
* doc and tests
This PR also includes https://github.com/apache/spark/pull/10498 and will be
rebased after that PR is merged.
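
For readers unfamiliar with the "bucket pruning" item in the TODO list, here is a minimal sketch of the general idea, not the code in this PR. It assumes rows are assigned to a bucket by hashing the bucket column modulo the number of buckets. The names `bucket_id`, `write_bucketed`, and `read_with_pruning` are hypothetical, and CRC32 is only a stand-in for whatever hash the real implementation uses.

```python
import zlib


def bucket_id(key: str, num_buckets: int) -> int:
    # Stand-in hash (CRC32), chosen only because it is deterministic.
    # This is an assumption for illustration, not Spark's bucketing hash.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def write_bucketed(rows, num_buckets):
    # Place each (key, value) row into the bucket selected by its key.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_id(row[0], num_buckets)].append(row)
    return buckets


def read_with_pruning(buckets, key):
    # An equality filter on the bucket column can only match rows in
    # one bucket, so the other buckets are never scanned.
    target = bucket_id(key, len(buckets))
    return [row for row in buckets[target] if row[0] == key]


rows = [("a", 1), ("b", 2), ("a", 3)]
buckets = write_bucketed(rows, num_buckets=4)
matches = read_with_pruning(buckets, "a")
```

Because every row with the same key lands in the same bucket, the reader can skip all other buckets for an equality predicate on that key. The same property is what would let a reader report a hash-based output partitioning for a bucketed table.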
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark bucket-read
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10604.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10604
----
commit 8cb24942890153bad70e003110a28d6111f0c407
Author: Wenchen Fan <[email protected]>
Date: 2015-12-28T12:15:34Z
write bucketed table
commit a9dc99722bfea886c6381abbd2e1e9366fcf9064
Author: Wenchen Fan <[email protected]>
Date: 2015-12-29T14:31:50Z
code refine
commit 4c9969848fdc35b8a5afe83f80b071d9ad310636
Author: Wenchen Fan <[email protected]>
Date: 2015-12-30T14:58:12Z
add more tests
commit d2dc9b3ce51bd84ed5f59137d2eb76c8b6bd4f9c
Author: Wenchen Fan <[email protected]>
Date: 2015-12-30T15:09:19Z
add more comments
commit b6d0a0bedb34525da1c1ed0380891d30615f62cf
Author: Wenchen Fan <[email protected]>
Date: 2016-01-04T11:25:59Z
Merge remote-tracking branch 'origin/master' into bucket-write
commit ba2329261740b08bd1a19dc8be0ef281281b84c9
Author: Wenchen Fan <[email protected]>
Date: 2016-01-04T11:58:18Z
address comments
commit 21e0c48e83319f7319ba339deb2bffde0188583d
Author: Wenchen Fan <[email protected]>
Date: 2016-01-04T12:12:52Z
fix typo
commit e3c3728fd67aea1849c8d4d1dab3658b1efb7417
Author: Wenchen Fan <[email protected]>
Date: 2016-01-04T13:45:59Z
do not break existing data source API
commit 70ebd69190e1ebd27362e17240b20bf60b5fdf16
Author: Wenchen Fan <[email protected]>
Date: 2016-01-04T14:41:15Z
debug
commit d9ad70cb5d5175ede3ebcc964ed88067bc3e3f18
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T00:46:25Z
Merge remote-tracking branch 'origin/master' into bucket-write
commit 6e3c1c0370dec30002992a3a83b2066f4d5278df
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T00:56:06Z
debug
commit d7f3000254238f47c46dfd8b3eb1299805d0dc7a
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T05:53:23Z
Merge remote-tracking branch 'origin/master' into bucket-write
commit 3df61dcf76f7991a3fc47254a54e135ad2c044dd
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T05:55:55Z
fix tests
commit d5f390d1d54bceafc7ed8bade76cd0831d095cac
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T08:18:15Z
refine
commit 3ff968b29d3852c92952454254ae6e1f7ba6599d
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T12:00:58Z
address comments
commit 74bd52461f381f67da737b8c9db595b09c77ad8d
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T13:05:35Z
improve test
commit a0c6bc3387a69a9df6868d1486af9b679f2f008e
Author: Wenchen Fan <[email protected]>
Date: 2016-01-05T15:06:01Z
support bucket read
----