[
https://issues.apache.org/jira/browse/CARBONDATA-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
jiangmanhua updated CARBONDATA-2747:
------------------------------------
Description:
A similar problem in the bloom datamap is tracked in CARBONDATA-2746.
Analysis:
`DataMapChooser#extractColumnExpression` does not handle `MatchExpression`, so
the chooser has no column name it can use to filter datamaps.
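To make the missing information concrete, here is a toy sketch in Python (not CarbonData code) of pulling column names out of a match string. It assumes the string uses Lucene's `field:term` form, for example `name:n10`; the helper name `columns_in_match_query` is hypothetical, and the regex is deliberately naive (it does not handle quoted phrases or escaped colons).

```python
import re
from typing import List

# An identifier immediately followed by ':' and not preceded by a word
# character or '.', i.e. a Lucene-style field prefix such as "name:".
_FIELD_PREFIX = re.compile(r"(?<![\w.])([A-Za-z_]\w*):")


def columns_in_match_query(query: str) -> List[str]:
    """Return the distinct field names referenced in the query, in order."""
    columns: List[str] = []
    for name in _FIELD_PREFIX.findall(query):
        if name not in columns:
            columns.append(name)
    return columns
```

With column names like these available, the chooser could restrict itself to datamaps built on the queried column.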
In `DataMapChooser#contains`, all datamaps are marked as useful if the lucene
datamap is hit (`ExpressionType.TEXT_MATCH`). The first datamap is then chosen
after the sort step (sorted by number of index columns).
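The selection problem can be reproduced with a small Python model (a sketch, not CarbonData's implementation). The `DataMap` class and both function names are hypothetical, and an ascending, stable sort by column count is assumed, since the sort direction is not stated here.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class DataMap:
    name: str
    index_columns: Tuple[str, ...]


def choose_ignoring_columns(datamaps: List[DataMap]) -> DataMap:
    """Model of the reported behaviour: every datamap counts as a match,
    so the winner depends only on the sort (assumed ascending, stable)."""
    return sorted(datamaps, key=lambda d: len(d.index_columns))[0]


def choose_by_columns(datamaps: List[DataMap],
                      filter_columns: Set[str]) -> Optional[DataMap]:
    """Possible alternative: keep only datamaps whose index columns cover
    the filter columns, then prefer the one with the fewest columns."""
    candidates = [d for d in datamaps
                  if filter_columns <= set(d.index_columns)]
    if not candidates:
        return None
    return min(candidates, key=lambda d: len(d.index_columns))
```

With one datamap on `city` listed before one on `name`, `choose_ignoring_columns` returns the `city` datamap regardless of the filter column, while `choose_by_columns` with `{"name"}` returns the `name` datamap.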
In `LuceneDataMapFactoryBase#toDistributable`, carbon calls `getAllIndexDirs`
and builds a `DataMapDistributable` for each index in the same segment. This
means that one segment will be pruned by different index datamaps (lucene uses
`indexPath` in `LuceneDataMapDistributable` to init its datamap object and
build the `indexSearcherMap`).
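The effect of building one distributable per index directory can be sketched as follows. This is a toy Python model under a loud assumption: the directory layout `<segment>/<datamap name>/<index dir>` used here is hypothetical and may not match CarbonData's real layout, and both function names are invented for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def distributables_for_all_dirs(index_dirs: List[str]) -> List[str]:
    """Model of the reported behaviour: one distributable per index
    directory in the segment, whichever datamap it belongs to."""
    return list(index_dirs)


def distributables_for_datamap(index_dirs: List[str],
                               datamap_name: str) -> List[str]:
    """Possible alternative: keep only directories that belong to the
    chosen datamap (assumes its name is a component of the path)."""
    return [d for d in index_dirs
            if datamap_name in PurePosixPath(d).parts]
```

Under that assumed layout, the first function yields distributables for both the `name` and `city` indexes of a segment, while the second yields only those of the chosen datamap.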
In our test case, we build datamaps on columns `name` and `city`, one for each.
The query uses column `name` as its filter. Unfortunately, `DataMapChooser`
chooses the datamap on `city`.
So,
was:similar problem in bloom datamap is in issue CARBONDATA-2746
> Lucene build wrong DataMapDistributable for all datamaps with same
> DataMapSchema
> --------------------------------------------------------------------------------
>
> Key: CARBONDATA-2747
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2747
> Project: CarbonData
> Issue Type: Bug
> Reporter: jiangmanhua
> Priority: Major