GitHub user ravipesala opened a pull request:
https://github.com/apache/carbondata/pull/1179
[WIP] Added the blocklet info to index file and make the datamap
distributable with job
The following tasks are completed in this PR:
1. Added the blocklet info to the carbonindex file, so the datamap no longer
needs to read each carbondata file footer to get the blocklet information. This
makes datamap loading faster.
2. Made the datamap distributable and added a Spark job for it, so datamap
pruning can run in a distributed fashion, with the pruned blocklet list sent
back to the driver.
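The two tasks above can be illustrated with a minimal sketch. It is not CarbonData code: `BlockletInfo`, `prune_index_file` and `distributed_prune` are hypothetical names, the min/max range check is only an assumed pruning rule, and a thread pool stands in for the Spark job.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockletInfo:
    # Hypothetical per-blocklet entry, as it might be stored in an index file
    # so that the data file footer does not have to be read.
    data_file: str
    blocklet_id: int
    min_value: int
    max_value: int


def prune_index_file(blocklets: List[BlockletInfo], lo: int, hi: int) -> List[BlockletInfo]:
    """Keep blocklets whose [min, max] range overlaps the filter range [lo, hi]."""
    return [b for b in blocklets if b.max_value >= lo and b.min_value <= hi]


def distributed_prune(index_files: List[List[BlockletInfo]], lo: int, hi: int,
                      workers: int = 2) -> List[BlockletInfo]:
    """Prune each index file in its own task and gather the survivors on the 'driver'."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input index files.
        parts = pool.map(lambda blocklets: prune_index_file(blocklets, lo, hi), index_files)
        return [b for part in parts for b in part]


# Two assumed index files, each listing the blocklets of one data file.
index_files = [
    [BlockletInfo("part-0.carbondata", 0, 1, 10),
     BlockletInfo("part-0.carbondata", 1, 11, 20)],
    [BlockletInfo("part-1.carbondata", 0, 21, 30),
     BlockletInfo("part-1.carbondata", 1, 5, 15)],
]

survivors = distributed_prune(index_files, lo=12, hi=18)
print([(b.data_file, b.blocklet_id) for b in survivors])
```

In this sketch only the pruned blocklet list travels back to the caller, which mirrors the idea of sending the pruned blocklet list to the driver.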
This PR does not compile yet, because it includes carbondata format changes.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ravipesala/incubator-carbondata datamap
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1179.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1179
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---