GitHub user jackylk opened a pull request:
https://github.com/apache/carbondata/pull/2003
[CARBONDATA-2206] support lucene index datamap
This PR is an initial effort to integrate Lucene into CarbonData as an
index datamap.
A new module called carbondata-lucene is added to support the Lucene datamap:
1. Add LuceneFineGrainDataMap, which implements the FineGrainDataMap interface.
2. Add LuceneCoarseGrainDataMap, which implements the CoarseGrainDataMap interface.
3. Support writing the Lucene index via LuceneDataMapWriter.
4. Implement LuceneDataMapFactory.
5. Add a UDF called `text_match`.
Users can use the Lucene datamap as follows:
```
CREATE TABLE main(id INT, name STRING, city STRING, age INT)
STORED BY 'carbondata'

CREATE DATAMAP dm ON TABLE main
USING 'org.apache.carbondata.datamap.lucene.LuceneFineGrainDataMapFactory'

SELECT * FROM main WHERE TEXT_MATCH('name:n10')
```
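The `'name:n10'` argument above follows Lucene's `field:term` query form. As a rough illustration of what an index lookup of that shape does, here is a toy inverted index in plain Python. It is only a sketch for intuition: `build_index`, `text_match`, and the sample rows are hypothetical names and data, and nothing here reflects how this PR or Lucene actually stores or searches the index.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each lowercase token found in `column` to the set of row positions containing it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for token in str(row[column]).lower().split():
            index[token].add(row_id)
    return index


def text_match(indexes, query):
    """Resolve a single 'field:term' query to a sorted list of matching row positions.

    Hypothetical helper: only handles one field and one exact term.
    """
    field, _, term = query.partition(":")
    return sorted(indexes[field].get(term.lower(), set()))


# Sample data shaped like the `main` table in the example (values are made up).
rows = [
    {"id": 1, "name": "n10", "city": "c1", "age": 30},
    {"id": 2, "name": "n11", "city": "c2", "age": 31},
    {"id": 3, "name": "n10", "city": "c3", "age": 32},
]

# Index only the `name` column, as a text index on one column would.
indexes = {"name": build_index(rows, "name")}

matches = text_match(indexes, "name:n10")
print([rows[i]["id"] for i in matches])  # prints [1, 3]
```

The point of the sketch is that the query is answered from the term-to-rows map without scanning every row, which is the kind of pruning an index datamap is meant to provide.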
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance
test report.
- Any additional information to help reviewers in testing this
change.
- [ ] For large changes, please consider breaking it into sub-tasks under
an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jackylk/incubator-carbondata lucene-datamap-initial2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2003.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2003
----
commit e1d5b6c88b06d0c9d418008002d10a52368a0d84
Author: Jacky Li <jacky.likun@...>
Date: 2018-02-26T08:30:38Z
support lucene index datamap
----