[
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036685#comment-13036685
]
Russell Melick commented on HIVE-2036:
--------------------------------------
To expand a bit on Marquis' comments.
In CompactIndexHandler.getIndexPredicateAnalyzer(), we instantiate a predicate
analyzer. My theory is that you're going to want a whole new PredicateAnalyzer
class to deal with bitmaps, and then you'll instantiate it in a very similar
way inside BitmapIndexHandler. You can also see here how we only search for
columns on which we have indexes. This is going to need to be modified, since
it currently only allows columns from a single index.
You may also want to rewrite some of the logic in
IndexWhereProcessor.process():110. It currently loops through every index
available and asks it to do a rewrite. Perhaps it should instead loop through
every index type and find the rewrites that are possible using only indexes of
that type.
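
As a rough illustration of the per-type loop suggested above, here is a minimal
sketch. IndexInfo and groupByType are hypothetical stand-ins made up for this
example, not Hive classes; the real code in IndexWhereProcessor works on Hive's
own index metadata objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexGrouping {
    /** Hypothetical stand-in for index metadata: an index name plus its handler type. */
    static final class IndexInfo {
        final String name;
        final String handlerType;

        IndexInfo(String name, String handlerType) {
            this.name = name;
            this.handlerType = handlerType;
        }
    }

    /** Buckets indexes by handler type so each type can attempt one rewrite over its own indexes. */
    static Map<String, List<IndexInfo>> groupByType(List<IndexInfo> indexes) {
        Map<String, List<IndexInfo>> byType = new LinkedHashMap<>();
        for (IndexInfo idx : indexes) {
            byType.computeIfAbsent(idx.handlerType, k -> new ArrayList<>()).add(idx);
        }
        return byType;
    }

    public static void main(String[] args) {
        List<IndexInfo> indexes = Arrays.asList(
            new IndexInfo("idx_a", "compact"),
            new IndexInfo("idx_b", "bitmap"),
            new IndexInfo("idx_c", "bitmap"));

        // One rewrite attempt per index type, rather than one per index.
        for (Map.Entry<String, List<IndexInfo>> e : groupByType(indexes).entrySet()) {
            System.out.println(e.getKey() + " handler sees " + e.getValue().size() + " index(es)");
        }
    }
}
```

Grouping first means a bitmap handler could see all bitmap indexes on the table
at once, which is what a multi-index rewrite would need.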
If you look at IndexPredicateAnalyzer:123, you can see where it's making sure
that all the parent operators are AND operations. It should be easy to modify
this to allow OR operations, but I'm not sure that simply allowing them and
using the current system will maintain logical correctness. It's probably
better to start off with just ANDs.
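
To make the AND-only restriction concrete, here is a minimal sketch of that kind
of check. Expr and isConjunctionOnly are hypothetical names invented for this
example; the real analyzer walks Hive's own expression node tree. The
correctness worry is this: with "a = 1 AND b > 2", pushing only "a = 1" to the
index still returns a superset of the matching rows, which the residual can then
filter. With "a = 1 OR b > 2", pushing only "a = 1" would drop rows that match
"b > 2" alone.

```java
import java.util.Arrays;
import java.util.List;

public class AndOnlyCheck {
    /** Hypothetical expression node: "AND"/"OR" with children, or a leaf comparison such as "a = 1". */
    static final class Expr {
        final String label;
        final List<Expr> children;

        Expr(String label, Expr... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Returns true only if every connective above the leaf comparisons is AND. */
    static boolean isConjunctionOnly(Expr e) {
        if (e.isLeaf()) {
            return true; // a bare comparison is trivially fine
        }
        if (!"AND".equals(e.label)) {
            return false; // OR (or anything else) is rejected in this sketch
        }
        for (Expr child : e.children) {
            if (!isConjunctionOnly(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Expr andOnly = new Expr("AND", new Expr("a = 1"),
            new Expr("AND", new Expr("b > 2"), new Expr("c < 3")));
        Expr withOr = new Expr("AND", new Expr("a = 1"),
            new Expr("OR", new Expr("b > 2"), new Expr("c < 3")));

        System.out.println("andOnly pushable: " + isConjunctionOnly(andOnly));
        System.out.println("withOr pushable:  " + isConjunctionOnly(withOr));
    }
}
```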
The pushedPredicate is the important thing returned by the predicate analyzer.
The pushed predicate is what it was able to recognize/process. That's the tree
you'll want to use to generate the bitmap query. The residual predicate is
what it couldn't process. There's a separate JIRA open (HIVE-2115) to use the
residual to cut down on remaining work.
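
The pushed/residual split can be pictured with a small sketch like the one
below. Conjunct, Decomposed and split are hypothetical names for illustration
only, and the sketch assumes the predicate has already been flattened into a
list of AND-ed comparisons, each on a single column. The real analyzer returns
expression trees rather than lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PredicateSplit {
    /** Hypothetical conjunct: one comparison on a single column, e.g. column "a", text "a = 1". */
    static final class Conjunct {
        final String column;
        final String text;

        Conjunct(String column, String text) {
            this.column = column;
            this.text = text;
        }
    }

    /** Hypothetical holder for the two halves of the decomposition. */
    static final class Decomposed {
        final List<Conjunct> pushed = new ArrayList<>();
        final List<Conjunct> residual = new ArrayList<>();
    }

    /** Conjuncts on indexed columns are pushed; everything else stays in the residual. */
    static Decomposed split(List<Conjunct> conjuncts, Set<String> indexedColumns) {
        Decomposed d = new Decomposed();
        for (Conjunct c : conjuncts) {
            if (indexedColumns.contains(c.column)) {
                d.pushed.add(c);
            } else {
                d.residual.add(c);
            }
        }
        return d;
    }

    /** Joins conjunct texts with AND for display. */
    static String render(List<Conjunct> conjuncts) {
        StringBuilder sb = new StringBuilder();
        for (Conjunct c : conjuncts) {
            if (sb.length() > 0) {
                sb.append(" AND ");
            }
            sb.append(c.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Conjunct> where = Arrays.asList(
            new Conjunct("a", "a = 1"),
            new Conjunct("b", "b > 2"),
            new Conjunct("c", "c < 3"));
        Set<String> indexed = new HashSet<>(Arrays.asList("a", "c"));

        Decomposed d = split(where, indexed);
        System.out.println("pushed:   " + render(d.pushed));
        System.out.println("residual: " + render(d.residual));
    }
}
```

Because the original predicate is a pure conjunction, the pushed half alone
selects a superset of the matching rows, so the residual can safely be applied
afterwards.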
Query generation lives in each index handler's generateIndexQuery(...) method.
For bitmaps you'll definitely need more logic than the simple call to
decomposedPredicate.pushedPredicate.getExprString() that is in the
CompactIndexHandler.
There are a few spots where hive.index.compact.file is used. These may need
to be generalized. However, Marquis may have already taken care of this with the
bitmap stuff. I don't remember what the new name for it was (I think it's
hive.index.blockfilter.file), but it's probably easiest to look in one of his
unit tests for it.
The last thing I can think of is that having multiple index types on a single
table, or queries that use multiple tables, may become an issue. I created
HIVE-2128 to deal with the multiple tables.
Good luck!
> Update bitmap indexes for automatic usage
> -----------------------------------------
>
> Key: HIVE-2036
> URL: https://issues.apache.org/jira/browse/HIVE-2036
> Project: Hive
> Issue Type: Improvement
> Components: Indexing
> Affects Versions: 0.8.0
> Reporter: Russell Melick
> Assignee: Jeffrey Lym
>
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap
> index support. The bitmap code will need to be extended after it is
> committed to enable automatic use of indexing. Most work will be focused in
> the BitmapIndexHandler, which needs to generate the re-entrant QL index
> query. There may also be significant work in the IndexPredicateAnalyzer to
> support predicates with OR's, instead of just AND's as it is currently.