GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/2679
WIP: [CARBONDATA-2904] Support minmax datamap for external format table
Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance
test report.
- Any additional information to help reviewers in testing this
change.
- [ ] For large changes, please consider breaking it into sub-tasks under
an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xuchuanyin/carbondata ef_index_dm_minmax
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2679.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2679
----
commit 8c0e84804c266c56cada024384d9ab2eaa89e9f2
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-08-20T01:38:12Z
Support building file-level index for external format table
+ support generating the file-level index directly
+ support creating and generating the file index on existing data
+ We will flatten the input files recursively and remove duplicate
input files within one load
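The flatten-and-deduplicate step described above could look roughly like the sketch below. It is only an illustration, not code from this pull request: the class InputFileFlattener and its methods are hypothetical names, and it assumes the inputs live on the local file system and are reachable through java.nio, whereas the real load path may go through a different file abstraction.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of CarbonData.
class InputFileFlattener {

  // Expands every input (file or directory) into regular files,
  // keeping each file once even if several inputs cover it.
  static List<Path> flatten(List<Path> inputs) {
    Set<Path> unique = new LinkedHashSet<>();
    for (Path input : inputs) {
      // Files.walk descends recursively; for a plain file it yields just that file.
      try (Stream<Path> walk = Files.walk(input)) {
        walk.filter(Files::isRegularFile)
            .map(p -> p.toAbsolutePath().normalize())
            .sorted()
            .forEach(unique::add);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    return new ArrayList<>(unique);
  }

  // Builds a small temporary tree and loads it through overlapping inputs:
  // the root folder, a nested folder, and one file given explicitly.
  static int demo() {
    try {
      Path root = Files.createTempDirectory("ef_load");
      Path sub = Files.createDirectories(root.resolve("a").resolve("b"));
      Path f1 = Files.createFile(root.resolve("f1.csv"));
      Files.createFile(sub.resolve("f2.csv"));
      return flatten(Arrays.asList(root, sub, f1)).size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Only two distinct files exist, although the inputs overlap.
    System.out.println(demo());
  }
}
```

Paths are normalized before they go into the set so that the same file reached through different inputs compares equal.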
The folder structure of the index file looks like below:
${datamap_name}/${segment_name}/File_level_${fact_file1_path_with_base64_encoding}/${column_name}.bloomindex
../File_level_${fact_file2_path_with_base64_encoding}/${column_name}.bloomindex
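As a rough illustration of the layout above, the sketch below assembles one index file path. The class and method names are hypothetical, and the choice of the URL-safe Base64 variant without padding is an assumption made here so that the encoded fact file path contains no '/' character; the commit message does not say which Base64 variant is used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of CarbonData.
class FileLevelIndexPath {

  // Assumption: URL-safe Base64 without padding, so the result is a single path segment.
  static String encodeFactPath(String factFilePath) {
    return Base64.getUrlEncoder().withoutPadding()
        .encodeToString(factFilePath.getBytes(StandardCharsets.UTF_8));
  }

  // Recovers the original fact file path from the folder name suffix.
  static String decodeFactPath(String encoded) {
    return new String(Base64.getUrlDecoder().decode(encoded), StandardCharsets.UTF_8);
  }

  // Mirrors ${datamap_name}/${segment_name}/File_level_${encoded_path}/${column_name}.bloomindex
  static String indexFilePath(String datamapName, String segmentName,
                              String factFilePath, String columnName) {
    return datamapName + "/" + segmentName + "/File_level_"
        + encodeFactPath(factFilePath) + "/" + columnName + ".bloomindex";
  }

  public static void main(String[] args) {
    // All argument values here are made up for the example.
    System.out.println(indexFilePath("dm1", "0", "/data/ext/part-0.csv", "name"));
  }
}
```

Encoding the fact file path keeps one index folder per source file while allowing the original path to be recovered from the folder name.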
Note that in this commit, the index datamap is not used during query.
commit 30a861a92ff53df8befd14ee48b7a37499ab7c96
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-08-25T10:49:43Z
Support query external format using bloomfilter datamaps
support query external format using bloomfilter datamap
commit 0664a1abd19e97cbe09920635a00619c945f0a20
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-08-28T06:17:51Z
rename path for minmax datamap
commit f29ec1d80acea6fcb85a55ab37c02847e9282e5b
Author: xuchuanyin <xuchuanyin@...>
Date: 2018-08-29T12:26:38Z
Fix bugs in MinMaxDataMap
Make the minmax datamap usable and add more tests for it
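For readers unfamiliar with the idea, a min/max index in general keeps the smallest and largest value of a column per file, and a query skips files whose range cannot contain the filter value. The sketch below shows that general technique only; MinMaxPruneSketch, Range and pruneEquals are hypothetical names, and nothing here reflects how MinMaxDataMap is actually implemented in this pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Generic min/max pruning sketch; hypothetical, not the MinMaxDataMap code.
class MinMaxPruneSketch {

  // Smallest and largest value of one column within one file.
  static final class Range {
    final long min;
    final long max;

    Range(long min, long max) {
      this.min = min;
      this.max = max;
    }

    // True if the file might hold the value; false means it can be skipped.
    boolean mayContain(long value) {
      return value >= min && value <= max;
    }
  }

  // Returns the files that still need scanning for the filter "column = value".
  static List<String> pruneEquals(Map<String, Range> index, long value) {
    List<String> hits = new ArrayList<>();
    for (Map.Entry<String, Range> entry : index.entrySet()) {
      if (entry.getValue().mayContain(value)) {
        hits.add(entry.getKey());
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // File names and ranges are made up for the example.
    Map<String, Range> index = new LinkedHashMap<>();
    index.put("part-0.csv", new Range(1, 10));
    index.put("part-1.csv", new Range(20, 30));
    System.out.println(pruneEquals(index, 25));
  }
}
```

A file that survives pruning is not guaranteed to contain the value; the index only rules files out.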
----