[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054058#comment-13054058
 ] 

Hudson commented on HIVE-2036:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Fix For: 0.8.0

 Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch


 HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap 
 index support.  The bitmap code will need to be extended after it is 
 committed to enable automatic use of indexing.  Most work will be focused in 
 the BitmapIndexHandler, which needs to generate the re-entrant QL index 
 query.  There may also be significant work in the IndexPredicateAnalyzer to 
 support predicates with ORs, rather than only ANDs as it does currently.
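
To make the AND/OR distinction concrete, here is a minimal sketch in Java using java.util.BitSet, where each set bit stands for a row that matches one predicate. The class name BitmapPredicateSketch and its helper methods are hypothetical; they are not part of the HIVE-2036 patch or of Hive's bitmap index code. The sketch only illustrates why an AND predicate maps to a bitmap intersection and an OR predicate to a bitmap union.

```java
import java.util.BitSet;

// Hypothetical illustration only; this is not Hive's bitmap index code.
final class BitmapPredicateSketch {

    // Build a bitmap whose set bits are the given row offsets.
    static BitSet rows(int... offsets) {
        BitSet bits = new BitSet();
        for (int offset : offsets) {
            bits.set(offset);
        }
        return bits;
    }

    // Rows matching "p1 AND p2": intersection of the two row bitmaps.
    static BitSet and(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone(); // copy so the inputs stay untouched
        result.and(right);
        return result;
    }

    // Rows matching "p1 OR p2": union of the two row bitmaps.
    static BitSet or(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone();
        result.or(right);
        return result;
    }
}
```

For example, if one predicate matches rows 0, 2 and 5 and another matches rows 2, 3 and 5, the AND bitmap contains rows 2 and 5, while the OR bitmap contains rows 0, 2, 3 and 5.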

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-20 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13052196#comment-13052196
 ] 

John Sichi commented on HIVE-2036:
--

+1.  Will commit when tests pass.





[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-20 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13052197#comment-13052197
 ] 

John Sichi commented on HIVE-2036:
--

I mean, once the latest patch gets uploaded.





[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051368#comment-13051368
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-17 22:34:18.950303)


Review request for hive and John Sichi.


Changes
---

Added comments; the filter expression is now pushed into the TableScan (TS) 
operator only when automatic indexing is turned on.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
  ql/src/test/results/clientpositive/index_auto.q.out 713bb40
  ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556
  ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
  ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5
  ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional test case for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed







[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050610#comment-13050610
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review856
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
https://reviews.apache.org/r/857/#comment1865

Need to update this comment now, explaining why we don't even look for the 
filter operator any more.


- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-14 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049324#comment-13049324
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-



bq.  On 2011-06-13 22:57:46, John Sichi wrote:
bq.   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
bq.   https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114
bq.  
bq.   I don't think this should be necessary.  We just want to propagate
bq.   the partition column predicate (whatever it is) from the base table
bq.   query to the index table query; partition pruning on the index table
bq.   query will do the rest of the work.
bq.   
bq.   In other words, if the original query had
bq.   
bq.   part_key=whatever
bq.   
bq.   we want to preserve that on the index table query.  That's what the
bq.   code is already supposed to be doing before your change; was it not
bq.   working?
bq.  
bq.  
bq.  Syed Albiz wrote:
bq.  This code is to prevent automatic usage from kicking in if the index
bq.  has not been built on the partition specified in the partition
bq.  predicate (i.e. if the index has only been built on partition ds=foo,
bq.  and the query is "select key from src where ds=bar", we do not want to
bq.  execute an index query). It seems like adding a test for bitmaps
bq.  specifically to mirror index_auto_unused.q (which is where this
bq.  functionality is tested for Compact indices) would be a good idea.

The logic for making sure that the necessary index partitions exist is already 
present in IndexWhereProcessor.checkPartitionsCoveredByIndex.  If that's not 
working, we should fix it; it should not be necessary to change the predicate 
analyzer at all.
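
The coverage rule discussed above can be sketched as follows. This is a hedged illustration only: PartitionCoverageSketch and coveredByIndex are hypothetical names, partitions are represented as plain strings such as "ds=foo", and the code does not reproduce the actual IndexWhereProcessor.checkPartitionsCoveredByIndex implementation.

```java
import java.util.Set;

// Hypothetical illustration only; not the Hive implementation.
final class PartitionCoverageSketch {

    // The index may be used only if every partition the query touches
    // also has a built index partition.
    static boolean coveredByIndex(Set<String> queriedPartitions,
                                  Set<String> builtIndexPartitions) {
        return builtIndexPartitions.containsAll(queriedPartitions);
    }
}
```

With an index built only on ds=foo, a query restricted to ds=bar is not covered, so automatic index usage would be skipped without any change to the predicate analyzer.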


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
---



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-14 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049406#comment-13049406
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review836
---



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1806

Slight rephrasing suggested:

If multiple indexes are provided, it is up to the handler to decide whether to 
use none, one, some, or all of them.  The supplied predicate may reference any 
of the columns from any of the indexes.  If the handler decides to use more 
than one index, then it is responsible for generating tasks to combine their 
search results (e.g. via a JOIN).
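
A rough sketch of the contract described in this rephrasing, under stated assumptions: IndexSelectionSketch, IndexInfo, chooseIndexes and plan are hypothetical names invented for illustration, each index is assumed to cover a single column, and none of this is the real HiveIndexHandler interface. It only shows a handler picking a subset of the supplied indexes and taking responsibility for combining results when it picks more than one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the HiveIndexHandler API.
final class IndexSelectionSketch {

    // Stand-in for index metadata: an index name and the one column it covers.
    static final class IndexInfo {
        final String name;
        final String column;

        IndexInfo(String name, String column) {
            this.name = name;
            this.column = column;
        }
    }

    // The handler may use none, one, some, or all of the supplied indexes.
    // Here it keeps those whose column is referenced by the predicate.
    static List<IndexInfo> chooseIndexes(List<IndexInfo> available,
                                         Set<String> predicateColumns) {
        List<IndexInfo> chosen = new ArrayList<>();
        for (IndexInfo index : available) {
            if (predicateColumns.contains(index.column)) {
                chosen.add(index);
            }
        }
        return chosen;
    }

    // If more than one index is used, the handler must combine their results.
    static String plan(List<IndexInfo> chosen) {
        if (chosen.isEmpty()) {
            return "scan base table";
        }
        if (chosen.size() == 1) {
            return "query " + chosen.get(0).name;
        }
        List<String> names = new ArrayList<>();
        for (IndexInfo index : chosen) {
            names.add(index.name);
        }
        return "JOIN results of " + String.join(", ", names);
    }
}
```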



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
https://reviews.apache.org/r/857/#comment1805

This should be gone.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1807

Delete commented-out code, or convert it into a TODO with a corresponding 
JIRA issue link.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1808

Could you explain more about what's going on here?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1817

Only do indexes.get(0) once.


- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-14 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049436#comment-13049436
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-14 21:26:21.276789)


Review request for hive and John Sichi.


Changes
---

Addressed comments; added more comments explaining why we use indexes.get(0) in 
IndexWhereProcessor, as that seemed a bit unclear.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional test case for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed







[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13048854#comment-13048854
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
---



ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
https://reviews.apache.org/r/857/#comment1790

I don't think this should be necessary.  We just want to propagate the 
partition column predicate (whatever it is) from the base table query to the 
index table query; partition pruning on the index table query will do the rest 
of the work.

In other words, if the original query had

part_key=whatever

we want to preserve that on the index table query.  That's what the code is 
already supposed to be doing before your change; was it not working?



- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13048857#comment-13048857
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review826
---



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1792

Don't bother with empty return statements.


- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13048859#comment-13048859
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-



bq.  On 2011-06-13 22:57:46, John Sichi wrote:
bq.   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
bq.   https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114
bq.  
bq.   I don't think this should be necessary.  We just want to propagate 
bq.   the partition column predicate (whatever it is) from the base table query to 
bq.   the index table query; partition pruning on the index table query will do the 
bq.   rest of the work.
bq.   
bq.   In other words, if the original query had
bq.   
bq.   part_key=whatever
bq.   
bq.   we want to preserve that on the index table query.  That's what the 
bq.   code is already supposed to be doing before your change; was it not working?
bq.  

This code is to prevent automatic usage from kicking in if the index has not 
been built on the partition specified in the partition predicate. For example, 
if the index has only been built on partition ds=foo and the query is 
"select key from src where ds=bar", we do not want to execute an index query. 
It seems like adding a test for bitmaps specifically to mirror 
index_auto_unused.q (which is where this functionality is tested for compact 
indexes) would be a good idea.
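
The guard described above can be sketched roughly as follows. This is only an 
illustration and not the code in the patch: the class IndexPartitionGuard, the 
method canUseIndex, and the representation of partitions as plain strings such 
as "ds=foo" are all assumptions made for the example.

```java
import java.util.Set;

// Hypothetical helper, not part of Hive: decides whether an index query
// may be generated for the partitions that a base-table query touches.
class IndexPartitionGuard {

  // Returns true only if every partition the query reads has had the
  // index built on it. An empty queriedPartitions set trivially passes.
  static boolean canUseIndex(Set<String> queriedPartitions,
                             Set<String> indexedPartitions) {
    return indexedPartitions.containsAll(queriedPartitions);
  }
}
```

With the index built only on ds=foo, a query restricted to ds=bar fails the 
check, so the optimizer would fall back to the ordinary table scan.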


- Syed


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review825
---



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-11 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047980#comment-13047980
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-11 19:05:42.241706)


Review request for hive and John Sichi.


Changes
---

Fix index_auto_unused.q testcase by adding a check for partitions in the index 
and ensuring that only partitions actually in the index are used to compute 
index predicates.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 
3caa4cc 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
 268560d 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional testcase for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed







[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-10 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047029#comment-13047029
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-10 06:35:32.125295)


Review request for hive and John Sichi.


Changes
---

Based on a discussion with yongqian, I re-implemented the predicate 
decomposition as two steps: first computing the overall residual predicate from 
the union of all columns in the available indexes, and then computing the 
predicates to apply to each index individually. I have also extended the 
functionality to pass partition columns to allowColumnName, and added/extended 
the testcases to check that partition predicates are propagated correctly. This 
required adding a check in IndexWhereProcessor.java that the correct 
FilterOperator was passed to the process(...) method (apparently a duplicate 
FilterOperator that does not have the entire predicate gets created).
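
A minimal sketch of the two-step decomposition, under simplifying assumptions: 
the WHERE clause is modelled as a flat list of AND-ed terms, each tied to a 
single column. The names PredicateDecomposition, Conjunct, residual and 
perIndex are hypothetical and do not come from the patch, which works on Hive 
expression trees rather than strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the classes used in the patch.
class PredicateDecomposition {

  // One AND-ed term of a WHERE clause, e.g. column "key", text "key = 2".
  static final class Conjunct {
    final String column;
    final String text;
    Conjunct(String column, String text) {
      this.column = column;
      this.text = text;
    }
  }

  // Step 1: terms that no available index column or partition column covers.
  static List<String> residual(List<Conjunct> where,
                               Map<String, Set<String>> indexColumns,
                               Set<String> partitionColumns) {
    Set<String> usable = new HashSet<>(partitionColumns);
    for (Set<String> cols : indexColumns.values()) {
      usable.addAll(cols);
    }
    List<String> out = new ArrayList<>();
    for (Conjunct c : where) {
      if (!usable.contains(c.column)) {
        out.add(c.text);
      }
    }
    return out;
  }

  // Step 2: for each index, the terms on its own columns plus partition columns.
  static Map<String, List<String>> perIndex(List<Conjunct> where,
                                            Map<String, Set<String>> indexColumns,
                                            Set<String> partitionColumns) {
    Map<String, List<String>> out = new LinkedHashMap<>();
    for (Map.Entry<String, Set<String>> e : indexColumns.entrySet()) {
      List<String> preds = new ArrayList<>();
      for (Conjunct c : where) {
        if (e.getValue().contains(c.column) || partitionColumns.contains(c.column)) {
          preds.add(c.text);
        }
      }
      out.put(e.getKey(), preds);
    }
    return out;
  }
}
```

Keeping the per-index lists separate means each index sub-query only refers to 
columns that its own index table has, while the partition term is repeated for 
every index.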


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
af9d7b1 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
 268560d 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
 0873e1a 
  ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional testcase for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed





[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-08 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046277#comment-13046277
 ] 

John Sichi commented on HIVE-2036:
--

Added a few new comments on review board.



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-08 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046278#comment-13046278
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review785
---



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1681

I think that should be a period instead of a comma in "indexes, if".



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1682

How exactly are they combined?  This Javadoc should be written as a 
contract between the optimizer and the index plugin author, so that the author 
knows exactly how to interpret the inputs and also what is going to be done 
with the output.




ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1683

Why do you need to use toArray here?  indexCols.keySet is already a 
collection.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1684

Why are you converting the search conditions back into predicate form here? 
 Wouldn't it be easier to analyze them as search conditions?


- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045578#comment-13045578
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
---



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1666

Update Javadoc and param name, including an explanation of what the handler 
is supposed to do when multiple indexes are passed in.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1675

I'm confused by the logic here.  You are throwing together all of the 
columns for all of the indexes, but we need to keep them segregated, don't we?  
Each subquery should only contain references to the columns relevant to the 
corresponding index.

(But the partitioning predicates need to be applied to each index.)




ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1668

Why is this public instead of private?



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1667

Use HiveUtils.unparseIdentifier



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
https://reviews.apache.org/r/857/#comment1669

Why do we need this class at all?  The superclass already uses 
hive.index.blockfilter.file by default.




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1672

Seems like we should only be looking at the indexes on the table accessed 
by this table scan.  (This comment is retroactive to the original version of 
the file.)




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1673

Seems like the costing comment below applies to this too.



ql/src/test/queries/clientpositive/index_bitmap3.q
https://reviews.apache.org/r/857/#comment1670

Why do we need this setting at all?  (I'm not sure why it was there in the 
original version of the file.)


- John



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045718#comment-13045718
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-08 00:22:37.292935)


Review request for hive and John Sichi.


Changes
---

Addressed comments. Still does not propagate partition predicates to every 
single index sub-query, but it does ensure that predicates are only applied to 
indexes for which there are matching columns. After looking at the behavior of 
CompactIndexHandler on partitioned tables (and in testcase 
index_auto_partitioned.q) I can't quite see how the CompactIndexHandler 
identifies and propagates partitioning predicates correctly.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
af9d7b1 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional testcase for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed





[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045770#comment-13045770
 ] 

John Sichi commented on HIVE-2036:
--

I'll take a look at the new patch tomorrow.  index_auto_partitioned.q does not 
actually include a predicate on the partitioning column, so it should be 
enhanced to do that.

The way it works for the compact index handler is that if we have a predicate 
like

{noformat}
WHERE part_col = 1 AND index_col = 2 AND some_other_col = 3
{noformat}

then it should generate

{noformat}
WHERE part_col = 1 AND index_col = 2
{noformat}

in the internal query against the index table.  That's the reason that 
getIndexPredicateAnalyzer walks through all the partitions and adds the 
predicate columns via allowColumnName.  (The way it does it isn't so great 
since it repeats it for each partition, when in fact one partition should be 
good enough.)
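
The filtering described above can be sketched as follows. This is an 
illustrative simplification rather than the handler's real code: the class 
IndexPredicateSketch and the method indexWhere are hypothetical, and the 
sketch assumes at most one AND-ed term per column, held as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the CompactIndexHandler implementation.
class IndexPredicateSketch {

  // Keeps the AND-ed terms whose column is allowed (index columns plus
  // partition columns) and joins them into the index-table WHERE clause.
  // Assumes one term per column; returns "" when nothing can be pushed.
  static String indexWhere(Map<String, String> termsByColumn,
                           Set<String> allowedColumns) {
    List<String> kept = new ArrayList<>();
    for (Map.Entry<String, String> e : termsByColumn.entrySet()) {
      if (allowedColumns.contains(e.getKey())) {
        kept.add(e.getValue());
      }
    }
    return kept.isEmpty() ? "" : "WHERE " + String.join(" AND ", kept);
  }
}
```

Passing the three terms from the example with part_col and index_col allowed 
leaves only those two terms; some_other_col stays behind for the base table 
query to evaluate.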




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-06 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045140#comment-13045140
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

Review request for hive and John Sichi.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only AND predicates are supported right now, but it should be 
possible to extend the BitmapQuery interface defined in this patch to support 
OR predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
  ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests. An additional test case, index_bitmap_auto.q, was added to 
the TestCliDriver suite to test automatic bitmap indexing. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed



 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-05-20 Thread Marquis Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036709#comment-13036709
 ] 

Marquis Wang commented on HIVE-2036:


Russell is right. hive.index.compact.file is deprecated and replaced with 
hive.index.blockfilter.file (I think). I kept the former around for 
backwards-compatibility reasons, but we should try to avoid using it.
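
A small Python sketch of the fallback this implies, assuming the configuration is available as a plain dict. The helper name `index_file_setting` is hypothetical, and the new property name is the one given tentatively above, so it should be checked against the code before relying on it.

```python
NEW_KEY = "hive.index.blockfilter.file"  # name as recalled above; unverified
OLD_KEY = "hive.index.compact.file"      # deprecated, kept for compatibility


def index_file_setting(conf):
    """Return the index file setting from a dict-like conf.

    Hypothetical helper: prefers the new key and falls back to the
    deprecated one; returns None when neither is set.
    """
    if NEW_KEY in conf:
        return conf[NEW_KEY]
    return conf.get(OLD_KEY)


print(index_file_setting({OLD_KEY: "/tmp/index_result"}))
```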

 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-05-19 Thread Marquis Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036449#comment-13036449
 ] 

Marquis Wang commented on HIVE-2036:


Making notes on how to do this:

One of the difficult/different parts about using bitmap indexes is that the 
only time they become useful is when multiple indexes are combined. Thus, you 
need a query that joins the various bitmap index tables and returns the blocks 
that contain the rows we want.

Thus the two parts to writing the automatic use index handler for bitmap 
indexes are:

1. Figuring out what indexes to use:

As mentioned above, you may need to extend the IndexPredicateAnalyzer to 
support ORs and possibly to return a tree of predicates (I don't think it 
already does this).

2. Building a query that accesses the index tables:

This is an example that I know works. For the original query

{noformat}
SELECT * FROM lineitem WHERE L_QUANTITY = 50.0 AND L_DISCOUNT = 0.08 AND L_TAX = 0.01;
{noformat}

the following query against the index tables returns the matching blocks:

{noformat}
SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) as `_offsets`
FROM (
  SELECT `_bucketname` AS bucketname, `_offset` AS offset
  FROM (
    SELECT ab.`_bucketname`, ab.`_offset`,
           EWAH_BITMAP_AND(ab.bitmap, c.`_bitmaps`) as bitmap
    FROM (
      SELECT a.`_bucketname`, b.`_offset`,
             EWAH_BITMAP_AND(a.`_bitmaps`, b.`_bitmaps`) as bitmap
      FROM (SELECT * FROM default__lineitem_quantity__ WHERE L_QUANTITY = 50.0) a
      JOIN (SELECT * FROM default__lineitem_discount__ WHERE L_DISCOUNT = 0.08) b
        ON a.`_bucketname` = b.`_bucketname` AND a.`_offset` = b.`_offset`
    ) ab
    JOIN (SELECT * FROM default__lineitem_tax__ WHERE L_TAX = 0.01) c
      ON ab.`_bucketname` = c.`_bucketname` AND ab.`_offset` = c.`_offset`
  ) abc
  WHERE NOT EWAH_BITMAP_EMPTY(abc.bitmap)
) t
GROUP BY bucketname;
{noformat}

This format extends to any number of AND predicates by nesting one more join 
per additional index. I'm pretty sure you can figure out how to expand it to 
include OR predicates and different groupings of predicates as well. If you 
make any changes/extensions to the format, be sure to test them to make sure 
they have the performance characteristics you want.
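
As a rough illustration of how such a query could be generated for N AND predicates, here is a Python sketch that folds the index tables into nested joins. The function `bitmap_and_query` and the aliases `t0`, `j1`, ... are hypothetical; the column names (`_bucketname`, `_offset`, `_bitmaps`) and UDFs are taken from the query above. The output has not been run against Hive, and the conditions are pasted in unescaped.

```python
def bitmap_and_query(predicates):
    """Build a nested index query for AND-ed predicates.

    predicates: list of (index_table, condition) pairs, one per bitmap index.
    Hypothetical generator mirroring the shape of the hand-written query.
    """
    if not predicates:
        raise ValueError("at least one (index_table, condition) pair is required")

    # Start from the first index table; its bitmap lives in `_bitmaps`.
    table, cond = predicates[0]
    alias = "t0"
    source = f"(SELECT * FROM {table} WHERE {cond}) {alias}"
    bitmap = f"{alias}.`_bitmaps`"

    # Each further predicate adds one join and one EWAH_BITMAP_AND.
    for i, (table, cond) in enumerate(predicates[1:], start=1):
        right, joined = f"t{i}", f"j{i}"
        source = (
            f"(SELECT {alias}.`_bucketname`, {alias}.`_offset`, "
            f"EWAH_BITMAP_AND({bitmap}, {right}.`_bitmaps`) AS bitmap "
            f"FROM {source} "
            f"JOIN (SELECT * FROM {table} WHERE {cond}) {right} "
            f"ON {alias}.`_bucketname` = {right}.`_bucketname` "
            f"AND {alias}.`_offset` = {right}.`_offset`) {joined}"
        )
        alias, bitmap = joined, f"{joined}.bitmap"

    # Keep only blocks whose combined bitmap is non-empty.
    return (
        "SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) AS `_offsets` "
        f"FROM (SELECT {alias}.`_bucketname` AS bucketname, "
        f"{alias}.`_offset` AS offset "
        f"FROM {source} "
        f"WHERE NOT EWAH_BITMAP_EMPTY({bitmap})) t "
        "GROUP BY bucketname"
    )


q = bitmap_and_query([
    ("default__lineitem_quantity__", "L_QUANTITY = 50.0"),
    ("default__lineitem_discount__", "L_DISCOUNT = 0.08"),
    ("default__lineitem_tax__", "L_TAX = 0.01"),
])
print(q)
```

With a single predicate the loop is skipped and the filter applies directly to that index table's `_bitmaps` column.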

 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Jeffrey Lym



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-05-19 Thread Russell Melick (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036685#comment-13036685
 ] 

Russell Melick commented on HIVE-2036:
--

To expand a bit on Marquis' comments.

In CompactIndexHandler.getIndexPredicateAnalyzer(), we instantiate a predicate 
analyzer.  My theory is that you're going to want a whole new PredicateAnalyzer 
class to deal with bitmaps, and then you'll instantiate it in a very similar 
way inside BitmapIndexHandler.  You can also see here how we only search for 
columns on which we have indexes.  This is going to need to be modified, since 
it currently only allows columns from a single index.

You may also want to rewrite some of the logic in 
IndexWhereProcessor.process():110.  It currently loops through every index 
available and asks it to do a rewrite.  Perhaps it should loop through every 
index type and try to find the rewrites possible only using indexes of that 
type.
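
A sketch of the suggested grouping in Python, assuming each index is known as a (name, handler type) pair. The function name, the type labels, and the index names are all hypothetical and only illustrate the loop structure; this is not how IndexWhereProcessor represents indexes.

```python
from collections import defaultdict


def group_indexes_by_type(indexes):
    """Group (index_name, handler_type) pairs by handler type.

    Hypothetical helper: names keep their original order within each type.
    """
    by_type = defaultdict(list)
    for name, handler_type in indexes:
        by_type[handler_type].append(name)
    return dict(by_type)


indexes = [
    ("lineitem_quantity", "bitmap"),
    ("lineitem_shipdate", "compact"),
    ("lineitem_discount", "bitmap"),
]
for handler_type, names in group_indexes_by_type(indexes).items():
    # A rewrite would be attempted once per type, using all indexes in `names`.
    print(handler_type, names)
```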

If you look at IndexPredicateAnalyzer:123, you can see where it's making sure 
that all the parent operators are AND operations.  It should be easy to modify 
this to allow OR operations, but I'm not sure that simply allowing them and 
using the current system will maintain logical correctness.  It's probably 
better to start off with just AND's.

The pushedPredicate is the important thing returned by the predicate analyzer.  
The pushed predicate is what it was able to recognize/process.  That's the tree 
you'll want to use to generate the bitmap query.  The residual predicate is 
what it couldn't process. There's a separate JIRA open (HIVE-2115) to use the 
residual to cut down on remaining work.
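
To make the pushed/residual split concrete, here is a small Python sketch over a toy predicate tree. The tuple encoding and the `decompose` function are hypothetical and much simpler than IndexPredicateAnalyzer. Following the advice above, it only descends through AND nodes and sends any OR subtree to the residual as a whole.

```python
def _and(a, b):
    """Combine two optional predicates with AND; None means 'nothing'."""
    if a is None:
        return b
    if b is None:
        return a
    return ("AND", a, b)


def decompose(expr, indexed_cols):
    """Return (pushed, residual) for a toy predicate tree.

    expr is ("AND", l, r), ("OR", l, r), or ("=", column, value).
    Only equality on an indexed column, reached through AND nodes, is pushed.
    """
    op = expr[0]
    if op == "AND":
        left_pushed, left_residual = decompose(expr[1], indexed_cols)
        right_pushed, right_residual = decompose(expr[2], indexed_cols)
        return _and(left_pushed, right_pushed), _and(left_residual, right_residual)
    if op == "=" and expr[1] in indexed_cols:
        return expr, None
    # OR subtrees and predicates on non-indexed columns stay in the residual.
    return None, expr


expr = ("AND",
        ("=", "L_QUANTITY", 50.0),
        ("OR", ("=", "L_DISCOUNT", 0.08), ("=", "L_TAX", 0.01)))
pushed, residual = decompose(expr, {"L_QUANTITY", "L_DISCOUNT", "L_TAX"})
print(pushed)
print(residual)
```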

The query generation lives in the IndexHandlers.generateIndexQuery(...).  
You'll definitely need more logic than the simple call to 
decomposedPredicate.pushedPredicate.getExprString() that is in the 
CompactIndexHandler.

There are a few spots where hive.index.compact.file is used.  These may need 
to be generalized.  However, Marquis may have already taken care of this with 
the bitmap stuff.  I don't remember what the new name for it was (I think it's 
hive.index.blockfilter.file), but it's probably easiest to look in one of his 
unit tests for it.

The last thing I can think of is that having multiple index types on a single 
table, or queries that use multiple tables, may become an issue.  I created 
HIVE-2128 to deal with the multiple tables.

Good luck!

 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Jeffrey Lym
