[jira] [Created] (HIVE-2204) unable to get column names for a table that has '_' as part of its table name

2011-06-07 Thread Mythili Gopalakrishnan (JIRA)
unable to get column names for a table that has '_' as part of its table name
-

 Key: HIVE-2204
 URL: https://issues.apache.org/jira/browse/HIVE-2204
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.8.0
Reporter: Mythili Gopalakrishnan


I have a table age_group and I am trying to get the list of columns for this 
table name. As '_' and '%' have special meaning in a table search pattern 
according to the JDBC searchPattern string specification, I escape the '_' in 
my table name when I call getColumns. But Hive does not return any columns. My 
call to getColumns is as follows:
catalog             null
schemaPattern       %
tableNamePattern    age\_group
columnNamePattern   %

If I don't escape the '_' in my table name, I am able to get the list of 
columns.
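
The call described above can be sketched in Java roughly as follows. This is a minimal illustration, not the reporter's code: the class name ColumnLookup and the helpers escapePattern and columnNames are hypothetical, and the sketch assumes the driver reports its escape string through DatabaseMetaData.getSearchStringEscape(). It does not handle names that contain the escape string itself.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
final class ColumnLookup {

    /** Prefixes the JDBC pattern wildcards '_' and '%' with the escape string. */
    static String escapePattern(String name, String escape) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '_' || c == '%') {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Lists column names for a table whose name is treated as a literal. */
    static List<String> columnNames(Connection conn, String table) throws SQLException {
        DatabaseMetaData md = conn.getMetaData();
        // Ask the driver for its escape string instead of hard-coding '\'.
        String pattern = escapePattern(table, md.getSearchStringEscape());
        List<String> names = new ArrayList<>();
        try (ResultSet rs = md.getColumns(null, "%", pattern, "%")) {
            while (rs.next()) {
                names.add(rs.getString("COLUMN_NAME"));
            }
        }
        return names;
    }
}
```

With a driver that honors the escape, columnNames(conn, "age_group") would be expected to match only the table named age_group and not, for example, ageXgroup.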

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-2205) Miscellaneous code improvements in all the packages

2011-06-07 Thread Chinna Rao Lalam (JIRA)
Miscellaneous code improvements in all the packages
---

 Key: HIVE-2205
 URL: https://issues.apache.org/jira/browse/HIVE-2205
 Project: Hive
  Issue Type: Bug
  Components: CLI, Query Processor, Serializers/Deserializers, Server 
Infrastructure
 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 
10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam


Miscellaneous code improvements from all the packages and some code cleanup.



[jira] [Updated] (HIVE-2178) Log related Check style Comments fixes

2011-06-07 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2178:
---

Attachment: HIVE-2178.patch

 Log related Check style Comments fixes
 --

 Key: HIVE-2178
 URL: https://issues.apache.org/jira/browse/HIVE-2178
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2178.patch


 Fix Log related Check style Comments



[jira] [Updated] (HIVE-2198) While using Hive in server mode, HiveConnection.close() is not cleaning up server side resources

2011-06-07 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2198:
---

Attachment: HIVE-2198.patch

 While using Hive in server mode, HiveConnection.close() is not cleaning up 
 server side resources
 

 Key: HIVE-2198
 URL: https://issues.apache.org/jira/browse/HIVE-2198
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2198.patch


 The org.apache.hadoop.hive.service.ThriftHive.Client.clean() method is called 
 at the end of every session in CLI mode for cleanup, but in HiveServer mode it 
 is not called.
 So this call can be integrated into HiveConnection.close().
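
 The idea in the description can be sketched as below. This is only an illustration under stated assumptions, not the attached patch: SessionClient is a hypothetical stand-in for the server-side client (only the clean() call comes from the issue text), and CleaningConnection is a hypothetical wrapper, not the real HiveConnection.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the Thrift client; only clean() is named in the issue.
interface SessionClient {
    void clean();
}

// Hypothetical connection wrapper used to illustrate cleanup on close().
final class CleaningConnection implements AutoCloseable {
    private final SessionClient client;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    CleaningConnection(SessionClient client) {
        this.client = client;
    }

    boolean isClosed() {
        return closed.get();
    }

    @Override
    public void close() {
        // compareAndSet keeps close() idempotent: clean() runs at most once.
        if (closed.compareAndSet(false, true)) {
            client.clean();
        }
    }
}
```

 Guarding the call with a flag is a design choice of this sketch, so that a second close() does not trigger a second server-side cleanup.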



[jira] [Updated] (HIVE-2205) Miscellaneous code improvements in all the packages

2011-06-07 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2205:
---

Attachment: HIVE-2205.patch

 Miscellaneous code improvements in all the packages
 ---

 Key: HIVE-2205
 URL: https://issues.apache.org/jira/browse/HIVE-2205
 Project: Hive
  Issue Type: Bug
  Components: CLI, Query Processor, Serializers/Deserializers, Server 
 Infrastructure
 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2205.patch


 Miscellaneous code improvements from all the packages and some code cleanup.



Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

2011-06-07 Thread John Sichi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
---



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1666

Update Javadoc and param name, including an explanation of what handler is 
supposed to do when multiple indexes are passed in.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1675

I'm confused by the logic here.  You are throwing together all of the 
columns for all of the indexes, but we need to keep them segregated, don't we?  
Each subquery should only contain references to the columns relevant to the 
corresponding index.

(But the partitioning predicates need to be applied to each index.)




ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1668

Why is this public instead of private?



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1667

Use HiveUtils.unparseIdentifier



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
https://reviews.apache.org/r/857/#comment1669

Why do we need this class at all?  The superclass already uses 
hive.index.blockfilter.file by default.




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1672

Seems like we should only be looking at the indexes on the table accessed 
by this table scan.  (This comment is retroactive to the original version of 
the file.)




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1673

Seems like the costing comment below applies to this too.



ql/src/test/queries/clientpositive/index_bitmap3.q
https://reviews.apache.org/r/857/#comment1670

Why do we need this setting at all?  (I'm not sure why it was there in the 
original version of the file.)


- John


On 2011-06-06 21:37:38, Syed Albiz wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/857/
 ---
 
 (Updated 2011-06-06 21:37:38)
 
 
 Review request for hive and John Sichi.
 
 
 Summary
 ---
 
 Add support for generating index queries to support automatic usage of bitmap 
 indexes. This required changing the interface to the IndexHandlers to support 
 accepting queries on multiple indexes. The compact indexes were modified to 
 use this new interface as well, although no functional changes were made to 
 how they work. Only supports AND predicates right now, but it should be 
 possible to extend the BitmapQuery interface defined in this patch to easily 
 support OR predicates as well. Currently benchmarking these changes on a test 
 cluster.
 
 
 This addresses bug HIVE-2036.
 https://issues.apache.org/jira/browse/HIVE-2036
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION
   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
   ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94
   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
 
 Diff: https://reviews.apache.org/r/857/diff
 
 
 Testing
 ---
 
 Passes unit tests, additional testcase to test automatic bitmap indexing 
 index_bitmap_auto.q was also added to the TestCliDriver suite. Currently 
 benchmarking changes on a test cluster.
 
 
 Thanks,
 
 Syed
 




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045578#comment-13045578
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
---



ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
https://reviews.apache.org/r/857/#comment1666

Update Javadoc and param name, including an explanation of what handler is 
supposed to do when multiple indexes are passed in.



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
https://reviews.apache.org/r/857/#comment1675

I'm confused by the logic here.  You are throwing together all of the 
columns for all of the indexes, but we need to keep them segregated, don't we?  
Each subquery should only contain references to the columns relevant to the 
corresponding index.

(But the partitioning predicates need to be applied to each index.)




ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1668

Why is this public instead of private?



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
https://reviews.apache.org/r/857/#comment1667

Use HiveUtils.unparseIdentifier



ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
https://reviews.apache.org/r/857/#comment1669

Why do we need this class at all?  The superclass already uses 
hive.index.blockfilter.file by default.




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1672

Seems like we should only be looking at the indexes on the table accessed 
by this table scan.  (This comment is retroactive to the original version of 
the file.)




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/857/#comment1673

Seems like the costing comment below applies to this too.



ql/src/test/queries/clientpositive/index_bitmap3.q
https://reviews.apache.org/r/857/#comment1670

Why do we need this setting at all?  (I'm not sure why it was there in the 
original version of the file.)


- John


On 2011-06-06 21:37:38, Syed Albiz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  ---
bq.  
bq.  (Updated 2011-06-06 21:37:38)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Add support for generating index queries to support automatic usage of 
bq.  bitmap indexes. This required changing the interface to the IndexHandlers to 
bq.  support accepting queries on multiple indexes. The compact indexes were 
bq.  modified to use this new interface as well, although no functional changes 
bq.  were made to how they work. Only supports AND predicates right now, but it 
bq.  should be possible to extend the BitmapQuery interface defined in this patch 
bq.  to easily support OR predicates as well. Currently benchmarking these 
bq.  changes on a test cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.  https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
bq.    ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  

Skew Join Optimization in hive

2011-06-07 Thread Shantian Purkad
Hi,

I have a query which joins 12 different tables (most of them left outer joins) 
and the query takes almost 3 hours. 90% of the time is taken by a single 
reducer. One reducer is getting the bulk of the data to process.

How can I get around this and have a fair distribution of data across all 
reducers? I tried to enable the skewjoin optimization, but I am getting the 
NPE below after the first step of the job is executed.

Any suggestions/ideas will be of great help.

Thanks,
Shantian

2011-06-07 19:22:28,923 Stage-11 map = 100%,  reduce = 85%
2011-06-07 19:22:30,932 Stage-11 map = 100%,  reduce = 100%
Ended Job = job_201106071542_0010
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:97)
    at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.ConditionalTask
hive 
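
The single-reducer bottleneck described in this message can be illustrated with a small Java sketch. It only shows why a heavily repeated join key lands on one reducer under hash partitioning; it does not reproduce or fix the NPE. The class SkewDemo and its methods are hypothetical, and the partition formula is the one used by Hadoop's default HashPartitioner.

```java
import java.util.List;

// Hypothetical demo class; not part of Hive or Hadoop.
final class SkewDemo {

    /** Maps a key to a reducer the way Hadoop's default HashPartitioner does. */
    static int partition(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Counts how many rows each reducer would receive for the given join keys. */
    static int[] rowsPerReducer(List<String> joinKeys, int numReducers) {
        int[] counts = new int[numReducers];
        for (String key : joinKeys) {
            counts[partition(key, numReducers)]++;
        }
        return counts;
    }
}
```

Because every row with the same key maps to the same reducer, a key that dominates the input makes one entry of the returned array dominate as well, regardless of how many reducers are configured.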

Re: Review Request: HIVE-2188: Add a function to retrieve multiple tables on trip to the hive metastore

2011-06-07 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/#review778
---



trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/831/#comment1676

Sorry I missed this before, but you need to add a getTableObjectsByName() 
method to IMetaStoreClient and HiveMetaStoreClient, and use that method here 
instead of calling client.client.get_table_objects_by_name(). Everything else 
looks good, so you can either update this patch, or else open another ticket 
to address this issue. Please let me know what you want to do. Thanks.
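
The wrapper requested in this comment might look roughly like the sketch below. It is an assumption-laden illustration, not the patch: RawThriftClient, MetaStoreClientSketch and MetaStoreClientSketchImpl are hypothetical stand-ins for the generated Thrift client, IMetaStoreClient and HiveMetaStoreClient, the parameter list is assumed, and table objects are simplified to strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated Thrift client.
interface RawThriftClient {
    List<String> get_table_objects_by_name(String dbName, List<String> tableNames);
}

// Hypothetical stand-in for the public client interface.
interface MetaStoreClientSketch {
    List<String> getTableObjectsByName(String dbName, List<String> tableNames);
}

// Hypothetical stand-in for the concrete client implementation.
final class MetaStoreClientSketchImpl implements MetaStoreClientSketch {
    private final RawThriftClient client;

    MetaStoreClientSketchImpl(RawThriftClient client) {
        this.client = client;
    }

    @Override
    public List<String> getTableObjectsByName(String dbName, List<String> tableNames) {
        // Delegate to the raw Thrift call so callers never reach into client.client.
        return new ArrayList<>(client.get_table_objects_by_name(dbName, tableNames));
    }
}
```

A test would then call getTableObjectsByName() on the public interface instead of the raw Thrift method.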


- Carl


On 2011-06-07 01:11:44, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/831/
 ---
 
 (Updated 2011-06-07 01:11:44)
 
 
 Review request for hive, Paul Yang and Ashutosh Chauhan.
 
 
 Summary
 ---
 
 Created a function multi_get_table that retrieves multiple tables on one 
 trip to the hive metastore, saving round trip time.
 
 
 This addresses bug HIVE-2188.
 https://issues.apache.org/jira/browse/HIVE-2188
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1130342
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1130342
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1130342
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1130342
   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1130342
 
 Diff: https://reviews.apache.org/r/831/diff
 
 
 Testing
 ---
 
 Added a test case to testMetasore() in TestHiveServer.  Also tested for speed 
 improvements in a client session.
 
 
 Thanks,
 
 Sohan
 




[jira] [Commented] (HIVE-2188) Add multi_get_table function in Hive Metastore

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045610#comment-13045610
 ] 

jirapos...@reviews.apache.org commented on HIVE-2188:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/#review778
---



trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/831/#comment1676

Sorry I missed this before, but you need to add a getTableObjectsByName() 
method to IMetaStoreClient and HiveMetaStoreClient, and use that method here 
instead of calling client.client.get_table_objects_by_name(). Everything else 
looks good, so you can either update this patch, or else open another ticket 
to address this issue. Please let me know what you want to do. Thanks.


- Carl


On 2011-06-07 01:11:44, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/831/
bq.  ---
bq.  
bq.  (Updated 2011-06-07 01:11:44)
bq.  
bq.  
bq.  Review request for hive, Paul Yang and Ashutosh Chauhan.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Created a function multi_get_table that retrieves multiple tables on one 
bq.  trip to the hive metastore, saving round trip time.
bq.  
bq.  
bq.  This addresses bug HIVE-2188.
bq.  https://issues.apache.org/jira/browse/HIVE-2188
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    trunk/metastore/if/hive_metastore.thrift 1130342
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1130342
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1130342
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1130342
bq.    trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1130342
bq.  
bq.  Diff: https://reviews.apache.org/r/831/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to testMetasore() in TestHiveServer.  Also tested for 
bq.  speed improvements in a client session.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.



 Add multi_get_table function in Hive Metastore
 --

 Key: HIVE-2188
 URL: https://issues.apache.org/jira/browse/HIVE-2188
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
Priority: Minor
 Attachments: HIVE-2188.1.patch, HIVE-2188.3.patch


 This function would get multiple tables from the hive metastore as opposed to 
 just one at a time, saving round trip time to the metastore.



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045718#comment-13045718
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-08 00:22:37.292935)


Review request for hive and John Sichi.


Changes
---

Addressed comments. Still does not propagate partition predicates to every 
single index sub-query, but it does ensure that predicates are only applied to 
indexes for which there are matching columns. After looking at the behavior of 
CompactIndexHandler on partitioned tables (and in testcase 
index_auto_partitioned.q) I can't quite see how the CompactIndexHandler 
identifies and propagates partitioning predicates correctly.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests, additional testcase to test automatic bitmap indexing 
index_bitmap_auto.q was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed



 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2036.1.patch


 HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap 
 index support.  The bitmap code will need to be extended after it is 
 committed to enable automatic use of indexing.  Most work will be focused in 
 the BitmapIndexHandler, which needs to generate the re-entrant QL index 
 query.  There may also be significant work in the IndexPredicateAnalyzer to 
 support predicates with OR's, instead of just AND's as it is currently.



[jira] [Updated] (HIVE-1595) job name for alter table T archive partition P is not correct

2011-06-07 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1595:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks Sohan!

 job name for alter table T archive partition P is not correct
 -

 Key: HIVE-1595
 URL: https://issues.apache.org/jira/browse/HIVE-1595
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Sohan Jain
 Attachments: Hive-1595.1.patch, Hive-1595.2.patch


 For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which 
 makes it difficult to identify



[jira] [Commented] (HIVE-2156) Improve error messages emitted during task execution

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045730#comment-13045730
 ] 

jirapos...@reviews.apache.org commented on HIVE-2156:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/777/
---

(Updated 2011-06-08 00:55:49.148248)


Review request for hive and John Sichi.


Changes
---

Addressed Ning's comments. After testing on the cluster, I ran into the coarse 
timeout a few times, so I think it makes sense to have a fine timeout on 
grabbing the task completions and then proceed with the rest of the JobDebugger 
stuff. I have added a separate configuration variable to toggle the timeout for 
grabbing task completions; we may just want to scale the overall timeout by 
some factor.
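
A bounded wait of the kind described above can be sketched in Java as follows. This is a generic illustration, not the code in the patch: TimedFetch and fetchWithTimeout are hypothetical names, and the fetch callable stands in for whatever call retrieves the task completions.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of Hive.
final class TimedFetch {

    /** Runs fetch on a worker thread and gives up after timeoutMillis. */
    static <T> Optional<T> fetchWithTimeout(Callable<T> fetch, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(fetch);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow fetch
            return Optional.empty();
        } catch (ExecutionException e) {
            return Optional.empty(); // the fetch itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return Optional.empty();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

An empty result lets the caller carry on with the remaining debugging steps instead of blocking on one slow call.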


Summary
---

- Add local error messages to point to job logs and provide TaskIDs
- Add a timeout to the fetching of task logs and errors


This addresses bug HIVE-2156.
https://issues.apache.org/jira/browse/HIVE-2156


Diffs (updated)
-

  build-common.xml a2236e1 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b94cdc6 
  conf/hive-default.xml 1317a9c 
  ql/build.xml 449b47a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 40d2644 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 691f038 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ec816e9 
  ql/src/test/queries/clientnegative/minimr_broken_pipe.q PRE-CREATION 
  ql/src/test/results/clientnegative/dyn_part3.q.out 5f4df65 
  ql/src/test/results/clientnegative/index_compact_entry_limit.q.out fcb2673 
  ql/src/test/results/clientnegative/index_compact_size_limit.q.out fcb2673 
  ql/src/test/results/clientnegative/minimr_broken_pipe.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/script_broken_pipe1.q.out d33d2cc 
  ql/src/test/results/clientnegative/script_broken_pipe2.q.out afbaa44 
  ql/src/test/results/clientnegative/script_broken_pipe3.q.out fe8f757 
  ql/src/test/results/clientnegative/script_error.q.out c72d780 
  ql/src/test/results/clientnegative/udf_reflect_neg.q.out f2082a3 
  ql/src/test/results/clientnegative/udf_test_error.q.out 5fd9a00 
  ql/src/test/results/clientnegative/udf_test_error_reduce.q.out ddc5e5b 
  ql/src/test/templates/TestNegativeCliDriver.vm ec13f79 

Diff: https://reviews.apache.org/r/777/diff


Testing
---

Tested TestNegativeCliDriver in both local and miniMR mode


Thanks,

Syed



 Improve error messages emitted during task execution
 

 Key: HIVE-2156
 URL: https://issues.apache.org/jira/browse/HIVE-2156
 Project: Hive
  Issue Type: Improvement
Reporter: Syed S. Albiz
Assignee: Syed S. Albiz
 Attachments: HIVE-2156.1.patch, HIVE-2156.2.patch


 Follow-up to HIVE-1731
 A number of issues were related to reporting errors from task execution and 
 surfacing these in a more useful form.
 Currently a cryptic message with Execution Error and a return code and 
 class name of the task is emitted.
 The most useful log messages here are emitted to the local logs, which can be 
 found through jobtracker. Having either a pointer to these logs as part of 
 the error message or the actual content would improve the usefulness 
 substantially. It may also warrant looking into how the underlying error 
 reporting through Hadoop is done and if more information can be propagated up 
 from there.
 Specific issues raised in  HIVE-1731:
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 * issue was in regexp_extract syntax
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 * tried: desc table_does_not_exist;



Build failed in Jenkins: Hive-trunk-h0.21 #766

2011-06-07 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/766/changes

Changes:

[sdong] HIVE-2199. Bug in Block-level Merge Task When Doing Temp Directory Move 
(Franklin Hu via Siying Dong)

--
[...truncated 30722 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-07_18-13-57_364_749641891527943147/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-06-07 18:14:00,525 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-07_18-13-57_364_749641891527943147/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106071814_631627579.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-07_18-14-02_291_319657495146013165/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-07_18-14-02_291_319657495146013165/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106071814_1424942808.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] 

Re: Review Request: HIVE-2156: Improve Execution Error Messages

2011-06-07 Thread Syed Albiz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/777/
---

(Updated 2011-06-08 00:55:49.148248)


Review request for hive and John Sichi.


Changes
---

Addressed Ning's comments. After testing on the cluster, I ran into the coarse 
timeout a few times, so I think it makes sense to have a fine timeout on 
grabbing the task completions and then proceed with the rest of the JobDebugger 
stuff. I have added a separate configuration variable to toggle the timeout for 
grabbing task completions, though we may just want to scale the overall timeout 
by some factor instead.
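
A minimal sketch of the fine-grained timeout idea, not the code in this patch. 
The class and method names (TimedFetchSketch, fetchWithTimeout) are 
hypothetical, and the fetch is modelled as a plain Callable rather than the 
real Hadoop task-completion call. It assumes that an empty result is an 
acceptable fallback when the fetch does not finish in time.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds one slow step with its own timeout so the
// rest of the error-reporting work can still run.
public class TimedFetchSketch {

  // Runs 'fetch' on a worker thread and waits at most timeoutMs for it.
  // Returns an empty list on timeout, interruption, or failure.
  public static List<String> fetchWithTimeout(Callable<List<String>> fetch,
                                              long timeoutMs) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Future<List<String>> pending = pool.submit(fetch);
    try {
      return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      pending.cancel(true);                // give up on the slow fetch
      return Collections.emptyList();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
      return Collections.emptyList();
    } catch (ExecutionException e) {
      return Collections.emptyList();      // the fetch itself failed
    } finally {
      pool.shutdownNow();                  // never leave the worker running
    }
  }
}
```

Giving the fetch its own budget means a stalled call costs only that budget 
and leaves the overall timeout for the remaining steps.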


Summary
---

- Add local error messages to point to job logs and provide TaskIDs
- Add a timeout to the fetching of task logs and errors


This addresses bug HIVE-2156.
https://issues.apache.org/jira/browse/HIVE-2156


Diffs (updated)
-

  build-common.xml a2236e1 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b94cdc6 
  conf/hive-default.xml 1317a9c 
  ql/build.xml 449b47a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 40d2644 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 691f038 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ec816e9 
  ql/src/test/queries/clientnegative/minimr_broken_pipe.q PRE-CREATION 
  ql/src/test/results/clientnegative/dyn_part3.q.out 5f4df65 
  ql/src/test/results/clientnegative/index_compact_entry_limit.q.out fcb2673 
  ql/src/test/results/clientnegative/index_compact_size_limit.q.out fcb2673 
  ql/src/test/results/clientnegative/minimr_broken_pipe.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/script_broken_pipe1.q.out d33d2cc 
  ql/src/test/results/clientnegative/script_broken_pipe2.q.out afbaa44 
  ql/src/test/results/clientnegative/script_broken_pipe3.q.out fe8f757 
  ql/src/test/results/clientnegative/script_error.q.out c72d780 
  ql/src/test/results/clientnegative/udf_reflect_neg.q.out f2082a3 
  ql/src/test/results/clientnegative/udf_test_error.q.out 5fd9a00 
  ql/src/test/results/clientnegative/udf_test_error_reduce.q.out ddc5e5b 
  ql/src/test/templates/TestNegativeCliDriver.vm ec13f79 

Diff: https://reviews.apache.org/r/777/diff


Testing
---

Tested TestNegativeCliDriver in both local and miniMR mode


Thanks,

Syed



Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

2011-06-07 Thread Syed Albiz


 On 2011-06-07 18:30:15, John Sichi wrote:
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java, 
  line 54
  https://reviews.apache.org/r/857/diff/1/?file=20596#file20596line54
 
  Use HiveUtils.unparseIdentifier

HiveUtils.unparseIdentifier is used on the argument passed in through to the 
constructor.


 On 2011-06-07 18:30:15, John Sichi wrote:
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java,
   line 25
  https://reviews.apache.org/r/857/diff/1/?file=20599#file20599line25
 
  Why do we need this class at all?  The superclass already uses 
  hive.index.blockfilter.file by default.
 

removed in next diff


- Syed


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review773
---


On 2011-06-08 00:22:37, Syed Albiz wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/857/
 ---
 
 (Updated 2011-06-08 00:22:37)
 
 
 Review request for hive and John Sichi.
 
 
 Summary
 ---
 
 Add support for generating index queries to support automatic usage of bitmap 
 indexes. This required changing the interface to the IndexHandlers to support 
 accepting queries on multiple indexes. The compact indexes were modified to 
 use this new interface as well, although no functional changes were made to 
 how they work. Only supports AND predicates right now, but it should be 
 possible to extend the BitmapQuery interface defined in this patch to easily 
 support OR predicates as well. Currently benchmarking these changes on a test 
 cluster.
 
 
 This addresses bug HIVE-2036.
 https://issues.apache.org/jira/browse/HIVE-2036
 
 
 Diffs
 -
 
   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
  0873e1a 
   
 ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
 56e7609 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
  268560d 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
 af9d7b1 
   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/857/diff
 
 
 Testing
 ---
 
 Passes unit tests, additional testcase to test automatic bitmap indexing 
 index_bitmap_auto.q was also added to the TestCliDriver suite. Currently 
 benchmarking changes on a test cluster.
 
 
 Thanks,
 
 Syed
 




Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

2011-06-07 Thread Syed Albiz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/
---

(Updated 2011-06-08 00:22:37.292935)


Review request for hive and John Sichi.


Changes
---

Addressed comments. Still does not propagate partition predicates to every 
single index sub-query, but it does ensure that predicates are only applied to 
indexes for which there are matching columns. After looking at the behavior of 
CompactIndexHandler on partitioned tables (and in testcase 
index_auto_partitioned.q) I can't quite see how the CompactIndexHandler 
identifies and propagates partitioning predicates correctly.


Summary
---

Add support for generating index queries to support automatic usage of bitmap 
indexes. This required changing the interface to the IndexHandlers to support 
accepting queries on multiple indexes. The compact indexes were modified to use 
this new interface as well, although no functional changes were made to how 
they work. Only supports AND predicates right now, but it should be possible to 
extend the BitmapQuery interface defined in this patch to easily support OR 
predicates as well. Currently benchmarking these changes on a test cluster.


This addresses bug HIVE-2036.
https://issues.apache.org/jira/browse/HIVE-2036


Diffs (updated)
-

  ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
  ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
 0873e1a 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
56e7609 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
 268560d 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
af9d7b1 
  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/857/diff


Testing
---

Passes unit tests, additional testcase to test automatic bitmap indexing 
index_bitmap_auto.q was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.


Thanks,

Syed



Re: Review Request: HIVE-2188: Add a function to retrieve multiple tables on one trip to the hive metastore

2011-06-07 Thread Sohan Jain

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/
---

(Updated 2011-06-08 02:53:35.735457)


Review request for hive, Paul Yang and Ashutosh Chauhan.


Changes
---

- added getTableObjectsByName to IMetaStoreClient and HiveMetaStoreClient
- modified the unit tests to reflect this change


Summary
---

Created a function multi_get_table that retrieves multiple tables on one trip 
to the hive metastore, saving round trip time.
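
A rough illustration of why a multi-get saves round trips, not the metastore 
code itself. Everything here is hypothetical: BatchedLookupSketch is an 
in-memory stand-in, table definitions are plain strings, and each method call 
is counted as one round trip.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a remote table store.
public class BatchedLookupSketch {
  private final Map<String, String> tables = new HashMap<String, String>();
  private int roundTrips = 0;

  public void put(String name, String definition) {
    tables.put(name, definition);
  }

  // One call per table: N tables cost N round trips.
  public String getOne(String name) {
    roundTrips++;
    return tables.get(name);
  }

  // One call for many tables: N tables cost a single round trip.
  // Unknown names are skipped in this sketch.
  public List<String> getMany(List<String> names) {
    roundTrips++;
    List<String> found = new ArrayList<String>();
    for (String name : names) {
      String definition = tables.get(name);
      if (definition != null) {
        found.add(definition);
      }
    }
    return found;
  }

  public int roundTrips() {
    return roundTrips;
  }
}
```

Fetching three tables with getOne costs three counted trips, while one 
getMany call for the same names costs one.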


This addresses bug HIVE-2188.
https://issues.apache.org/jira/browse/HIVE-2188


Diffs (updated)
-

  trunk/metastore/if/hive_metastore.thrift 1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1133230 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1133230 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1133230 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1133230 

Diff: https://reviews.apache.org/r/831/diff


Testing
---

Added a test case to testMetasore() in TestHiveServer.  Also tested for speed 
improvements in a client session.


Thanks,

Sohan



[jira] [Commented] (HIVE-2188) Add multi_get_table function in Hive Metastore

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045759#comment-13045759
 ] 

jirapos...@reviews.apache.org commented on HIVE-2188:
-



bq.  On 2011-06-07 20:09:34, Carl Steinbach wrote:
bq.   
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java,
 line 748
bq.   https://reviews.apache.org/r/831/diff/3/?file=20654#file20654line748
bq.  
bq.   Sorry I missed this before, but you need to add a 
getTableObjectsByName() method to IMetaStoreClient and HiveMetaStoreClient, and 
use that method here instead of calling 
client.client.get_table_objects_by_name(). Everything else looks good, so 
you can either update this patch, or else open another ticket to address this 
issue. Please let me know what you want to do. Thanks.

Hi Carl, I have updated the patch here.  Thanks for your help.


- Sohan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/#review778
---


On 2011-06-07 01:11:44, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/831/
bq.  ---
bq.  
bq.  (Updated 2011-06-07 01:11:44)
bq.  
bq.  
bq.  Review request for hive, Paul Yang and Ashutosh Chauhan.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Created a function multi_get_table that retrieves multiple tables on one 
trip to the hive metastore, saving round trip time.
bq.  
bq.  
bq.  This addresses bug HIVE-2188.
bq.  https://issues.apache.org/jira/browse/HIVE-2188
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    trunk/metastore/if/hive_metastore.thrift 1130342 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1130342 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1130342 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1130342 
bq.    trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1130342 
bq.  
bq.  Diff: https://reviews.apache.org/r/831/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to testMetasore() in TestHiveServer.  Also tested for 
speed improvements in a client session.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.



 Add multi_get_table function in Hive Metastore
 --

 Key: HIVE-2188
 URL: https://issues.apache.org/jira/browse/HIVE-2188
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
Priority: Minor
 Attachments: HIVE-2188.1.patch, HIVE-2188.3.patch


 This function would get multiple tables from the hive metastore as opposed to 
 just one at a time, saving round trip time to the metastore.



[jira] [Commented] (HIVE-2188) Add multi_get_table function in Hive Metastore

2011-06-07 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045758#comment-13045758
 ] 

jirapos...@reviews.apache.org commented on HIVE-2188:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/
---

(Updated 2011-06-08 02:53:35.735457)


Review request for hive, Paul Yang and Ashutosh Chauhan.


Changes
---

- added getTableObjectsByName to IMetaStoreClient and HiveMetaStoreClient
- modified the unit tests to reflect this change


Summary
---

Created a function multi_get_table that retrieves multiple tables on one trip 
to the hive metastore, saving round trip time.


This addresses bug HIVE-2188.
https://issues.apache.org/jira/browse/HIVE-2188


Diffs (updated)
-

  trunk/metastore/if/hive_metastore.thrift 1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1133230 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1133230 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1133230 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1133230 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1133230 

Diff: https://reviews.apache.org/r/831/diff


Testing
---

Added a test case to testMetasore() in TestHiveServer.  Also tested for speed 
improvements in a client session.


Thanks,

Sohan



 Add multi_get_table function in Hive Metastore
 --

 Key: HIVE-2188
 URL: https://issues.apache.org/jira/browse/HIVE-2188
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
Priority: Minor
 Attachments: HIVE-2188.1.patch, HIVE-2188.3.patch


 This function would get multiple tables from the hive metastore as opposed to 
 just one at a time, saving round trip time to the metastore.



purging old releases

2011-06-07 Thread John Sichi
Apache Infra has asked us to delete from our dist area releases which are no 
longer under active development:

http://www.apache.org/dist/hive/

They suggested deleting 0.6; I'll go ahead and do that unless anyone considers 
it likely that there will be an 0.6.1 in the future.

Note that 0.6 will be around forever in:

http://archive.apache.org/dist/hive/hive-0.6.0/

JVS



[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-07 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045770#comment-13045770
 ] 

John Sichi commented on HIVE-2036:
--

I'll take a look at the new patch tomorrow.  index_auto_partitioned.q does not 
actually include a predicate on the partitioning column, so it should be 
enhanced to do that.

The way it works for the compact index handler is that if we have a predicate 
like

{noformat}
WHERE part_col = 1 AND index_col = 2 AND some_other_col = 3
{noformat}

then it should generate

{noformat}
WHERE part_col = 1 AND index_col = 2
{noformat}

in the internal query against the index table.  That's the reason that 
getIndexPredicateAnalyzer walks through all the partitions and adds the 
predicate columns via allowColumnName.  (The way it does it isn't so great 
since it repeats it for each partition, when in fact one partition should be 
good enough.)
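
The filtering described above can be sketched as follows. This is an 
illustration only, not the CompactIndexHandler code: the class and method 
names are hypothetical, and it assumes the predicate is a flat AND of 
"column = value" terms with at most one term per column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: keep only the conjuncts whose column is allowed
// (partition columns plus indexed columns) when building the internal
// query against the index table.
public class IndexPredicateSketch {

  // conjuncts maps column name -> literal value, in predicate order.
  // Returns the reduced WHERE clause, or "" if nothing can be pushed down.
  public static String pushdown(Map<String, String> conjuncts,
                                Set<String> allowedColumns) {
    List<String> kept = new ArrayList<String>();
    for (Map.Entry<String, String> term : conjuncts.entrySet()) {
      if (allowedColumns.contains(term.getKey())) {
        kept.add(term.getKey() + " = " + term.getValue());
      }
    }
    return kept.isEmpty() ? "" : "WHERE " + String.join(" AND ", kept);
  }
}
```

With part_col and index_col allowed, the three-term predicate from the 
example reduces to the two-term one, and some_other_col is left for the 
outer query to evaluate.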


 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2036.1.patch


 HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap 
 index support.  The bitmap code will need to be extended after it is 
 committed to enable automatic use of indexing.  Most work will be focused in 
 the BitmapIndexHandler, which needs to generate the re-entrant QL index 
 query.  There may also be significant work in the IndexPredicateAnalyzer to 
 support predicates with OR's, instead of just AND's as it is currently.



Jenkins build is back to normal : Hive-trunk-h0.21 #767

2011-06-07 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/767/changes




[jira] [Commented] (HIVE-1595) job name for alter table T archive partition P is not correct

2011-06-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045785#comment-13045785
 ] 

Hudson commented on HIVE-1595:
--

Integrated in Hive-trunk-h0.21 #767 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/767/])
HIVE-1595. job name for alter table T archive partition P is not correct
(Sohan Jain via pauly)

pauly : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1133219
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


 job name for alter table T archive partition P is not correct
 -

 Key: HIVE-1595
 URL: https://issues.apache.org/jira/browse/HIVE-1595
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Sohan Jain
 Attachments: Hive-1595.1.patch, Hive-1595.2.patch


 For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which 
 makes it difficult to identify
