[jira] [Commented] (HIVE-306) Support INSERT [INTO] destination

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050511#comment-13050511
 ] 

jirapos...@reviews.apache.org commented on HIVE-306:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/926/
---

Review request for hive.


Summary
---

New feature to allow INSERT INTO a table or partition


This addresses bug HIVE-306.
https://issues.apache.org/jira/browse/HIVE-306


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1133562 
  trunk/ql/src/test/queries/clientnegative/insert_into1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into5.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into6.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into5.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into6.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/926/diff


Testing
---


Thanks,

Franklin



 Support INSERT [INTO] destination
 ---

 Key: HIVE-306
 URL: https://issues.apache.org/jira/browse/HIVE-306
 Project: Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Franklin Hu
 Attachments: hive-306.1.patch


 Currently Hive only supports INSERT OVERWRITE destination. We should 
 also support INSERT [INTO] destination.
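
To make the difference between the two statements concrete, here is a minimal sketch that models a destination as an in-memory list. The function names and the toy table are illustrative assumptions only and are not taken from the patch; the sketch shows the intended semantics (replace versus append), not how Hive implements them.

```python
# Toy model: a destination table is just a list of rows.
# All names here are hypothetical and exist only for illustration.

def insert_overwrite(table, rows):
    """Replace the existing contents of the destination with the new rows."""
    table[:] = rows
    return table


def insert_into(table, rows):
    """Append the new rows, keeping whatever the destination already holds."""
    table.extend(rows)
    return table


dest = [("a", 1)]
insert_into(dest, [("b", 2)])        # existing row is kept
appended = list(dest)
insert_overwrite(dest, [("c", 3)])   # existing rows are discarded
overwritten = list(dest)
```

In this model, appending leaves the earlier row in place while overwriting discards it, which is the behavior the issue asks INSERT INTO to add alongside INSERT OVERWRITE.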

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: HIVE-306 Support INSERT INTO

2011-06-16 Thread Franklin Hu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/926/
---

Review request for hive.


Summary
---

New feature to allow INSERT INTO a table or partition


This addresses bug HIVE-306.
https://issues.apache.org/jira/browse/HIVE-306


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1133562 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1133562 
  trunk/ql/src/test/queries/clientnegative/insert_into1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/insert_into4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into5.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/insert_into6.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/insert_into4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into5.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_into6.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/926/diff


Testing
---


Thanks,

Franklin



[jira] [Commented] (HIVE-2215) Add api for marking / querying set of partitions for events

2011-06-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050528#comment-13050528
 ] 

Ashutosh Chauhan commented on HIVE-2215:


Anyone care to commit this one?

 Add api for marking / querying set of partitions for events
 ---

 Key: HIVE-2215
 URL: https://issues.apache.org/jira/browse/HIVE-2215
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Affects Versions: 0.8.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.8.0

 Attachments: hive-2215_full-1.patch, hive_2215.patch








[jira] [Updated] (HIVE-2218) speedup addInputPaths

2011-06-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2218:
-

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks Yongqiang!

 speedup addInputPaths
 -

 Key: HIVE-2218
 URL: https://issues.apache.org/jira/browse/HIVE-2218
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.8.0

 Attachments: HIVE-2218.1.patch, HIVE-2218.2.patch, HIVE-2218.3.patch


 Speed up addInputPaths for the combined symlink input format, and add some 
 other micro-optimizations that also work for normal cases.
 This can help reduce the start time of one query from 5 hours to less than 
 20 minutes.





Re: Review Request: HIVE-2213: Optimize get_partition_names_ps()

2011-06-16 Thread Paul Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/#review853
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
https://reviews.apache.org/r/878/#comment1862

Line exceeds 100 char limit


- Paul


On 2011-06-13 21:11:38, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/878/
 ---
 
 (Updated 2011-06-13 21:11:38)
 
 
 Review request for hive and Paul Yang.
 
 
 Summary
 ---
 
 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database. This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.
 
 
 This addresses bug HIVE-2213.
 https://issues.apache.org/jira/browse/HIVE-2213
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1135227 
   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1135227 
 
 Diff: https://reviews.apache.org/r/878/diff
 
 
 Testing
 ---
 
 Passes previous test cases for get_partition_names_ps() in TestHiveMetaStore.
 
 
 Thanks,
 
 Sohan
 




[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050585#comment-13050585
 ] 

jirapos...@reviews.apache.org commented on HIVE-2213:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/#review853
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
https://reviews.apache.org/r/878/#comment1862

Line exceeds 100 char limit


- Paul


On 2011-06-13 21:11:38, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/878/
bq.  ---
bq.  
bq.  (Updated 2011-06-13 21:11:38)
bq.  
bq.  
bq.  Review request for hive and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If a table has a large number of partitions, get_partition_names_ps() may 
take a long time to execute, because we get all of the partition names from the 
database. This is not very memory efficient, and the operation can be pushed 
down to the JDO layer without getting all of the names first.
bq.  
bq.  
bq.  This addresses bug HIVE-2213.
bq.  https://issues.apache.org/jira/browse/HIVE-2213
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1135227 
bq.    trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1135227 
bq.  
bq.  Diff: https://reviews.apache.org/r/878/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Passes previous test cases for get_partition_names_ps() in 
TestHiveMetaStore.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.



 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.
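
As a rough illustration of the idea (not the patch itself), the sketch below contrasts fetching every partition name and filtering in the client with letting the store apply the filter. It uses Python's sqlite3 module as a stand-in for the metastore database; the table name, column name, partition naming scheme, and the prefix-match filter are all assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for the metastore's partition table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE partitions (name TEXT)")
conn.executemany(
    "INSERT INTO partitions VALUES (?)",
    [(f"ds=2011-06-{day:02d}/hr={hr:02d}",) for day in (15, 16) for hr in range(24)],
)


def names_filter_in_client(conn, prefix):
    """Fetch every partition name, then filter in application memory."""
    all_names = [row[0] for row in conn.execute("SELECT name FROM partitions")]
    return sorted(n for n in all_names if n.startswith(prefix))


def names_filter_pushed_down(conn, prefix):
    """Let the database apply the filter so only matching names come back."""
    cur = conn.execute(
        "SELECT name FROM partitions WHERE substr(name, 1, ?) = ? ORDER BY name",
        (len(prefix), prefix),
    )
    return [row[0] for row in cur]


client_side = names_filter_in_client(conn, "ds=2011-06-16/")
pushed_down = names_filter_pushed_down(conn, "ds=2011-06-16/")
```

Both functions return the same names, but the second never materializes the non-matching ones in the client, which is the memory saving the description is after.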





[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050586#comment-13050586
 ] 

Paul Yang commented on HIVE-2213:
-

Looks good, but can you do a minor update to fix lines longer than 100 chars?

 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.





Re: Review Request: HIVE-2036: Update bitmap indexes for automatic usage

2011-06-16 Thread John Sichi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review856
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
https://reviews.apache.org/r/857/#comment1865

Need to update this comment now, explaining why we don't even look for the 
filter operator any more.


- John


On 2011-06-15 23:46:24, Syed Albiz wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/857/
 ---
 
 (Updated 2011-06-15 23:46:24)
 
 
 Review request for hive and John Sichi.
 
 
 Summary
 ---
 
 Add support for generating index queries to support automatic usage of bitmap 
 indexes. This required changing the interface to the IndexHandlers to support 
 accepting queries on multiple indexes. The compact indexes were modified to 
 use this new interface as well, although no functional changes were made to 
 how they work. Only supports AND predicates right now, but it should be 
 possible to extend the BitmapQuery interface defined in this patch to easily 
 support OR predicates as well. Currently benchmarking these changes on a test 
 cluster.
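
For readers unfamiliar with why AND comes first and OR is a natural extension, here is a small sketch of combining per-predicate bitmaps. It is not the BitmapQuery interface from the patch; the helper names and the sample rows are hypothetical, and Python integers stand in for the bitmaps.

```python
def bitmap_of(rows, predicate):
    """Build an integer bitmap with bit i set when row i satisfies the predicate."""
    bits = 0
    for i, row in enumerate(rows):
        if predicate(row):
            bits |= 1 << i
    return bits


def matching_rows(bits):
    """Return the positions of the set bits in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical sample data, used only for this illustration.
rows = [
    {"key": 1, "value": "a"},
    {"key": 2, "value": "b"},
    {"key": 1, "value": "b"},
    {"key": 3, "value": "a"},
]
key_is_1 = bitmap_of(rows, lambda r: r["key"] == 1)        # rows 0 and 2
value_is_b = bitmap_of(rows, lambda r: r["value"] == "b")  # rows 1 and 2

and_rows = matching_rows(key_is_1 & value_is_b)  # rows satisfying both predicates
or_rows = matching_rows(key_is_1 | value_is_b)   # rows satisfying either predicate
```

Once each predicate has its own bitmap, AND and OR differ only in the bitwise operator used to combine them, which is why supporting OR looks like an incremental change.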
 
 
 This addresses bug HIVE-2036.
 https://issues.apache.org/jira/browse/HIVE-2036
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
   ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
   ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
   ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
   ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
   ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
   ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
   ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
   ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
   ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
   ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/857/diff
 
 
 Testing
 ---
 
 Passes unit tests. An additional test case for automatic bitmap indexing, 
 index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
 benchmarking changes on a test cluster.
 
 
 Thanks,
 
 Syed
 




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050610#comment-13050610
 ] 

jirapos...@reviews.apache.org commented on HIVE-2036:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/857/#review856
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
https://reviews.apache.org/r/857/#comment1865

Need to update this comment now, explaining why we don't even look for the 
filter operator any more.


- John


On 2011-06-15 23:46:24, Syed Albiz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/857/
bq.  ---
bq.  
bq.  (Updated 2011-06-15 23:46:24)
bq.  
bq.  
bq.  Review request for hive and John Sichi.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Add support for generating index queries to support automatic usage of 
bitmap indexes. This required changing the interface to the IndexHandlers to 
support accepting queries on multiple indexes. The compact indexes were 
modified to use this new interface as well, although no functional changes were 
made to how they work. Only supports AND predicates right now, but it should be 
possible to extend the BitmapQuery interface defined in this patch to easily 
support OR predicates as well. Currently benchmarking these changes on a test 
cluster.
bq.  
bq.  
bq.  This addresses bug HIVE-2036.
bq.  https://issues.apache.org/jira/browse/HIVE-2036
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION 
bq.    ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609 
bq.    ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d 
bq.    ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a 
bq.    ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73 
bq.    ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b 
bq.    ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION 
bq.    ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_auto.q.out 713bb40 
bq.    ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556 
bq.    ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc 
bq.    ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a 
bq.    ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5 
bq.    ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION 
bq.    ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/857/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Passes unit tests. An additional test case for automatic bitmap indexing, 
index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently 
benchmarking changes on a test cluster.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Syed
bq.  
bq.



 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch


 HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap 
 index support.  The bitmap code will need to be extended after it is 
 committed to enable automatic use of indexing.  Most work will be focused in 
 the BitmapIndexHandler, which needs to 

[jira] [Commented] (HIVE-306) Support INSERT [INTO] destination

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050614#comment-13050614
 ] 

jirapos...@reviews.apache.org commented on HIVE-306:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/926/#review852
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
https://reviews.apache.org/r/926/#comment1861

Would it be difficult to extend this to support INSERT INTO DIRECTORY?


- Ning


On 2011-06-16 15:53:03, Franklin Hu wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/926/
bq.  ---
bq.  
bq.  (Updated 2011-06-16 15:53:03)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  New feature to allow INSERT INTO a table or partition
bq.  
bq.  
bq.  This addresses bug HIVE-306.
bq.  https://issues.apache.org/jira/browse/HIVE-306
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1133562 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1133562 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1133562 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1133562 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1133562 
bq.    trunk/ql/src/test/queries/clientnegative/insert_into1.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientnegative/insert_into2.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientnegative/insert_into3.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientnegative/insert_into4.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into1.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into2.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into3.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into4.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into5.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/insert_into6.q PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/insert_into1.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/insert_into2.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/insert_into3.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/insert_into4.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into1.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into2.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into3.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into4.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into5.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/insert_into6.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/926/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Franklin
bq.  
bq.



 Support INSERT [INTO] destination
 ---

 Key: HIVE-306
 URL: https://issues.apache.org/jira/browse/HIVE-306
 Project: Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Franklin Hu
 Attachments: hive-306.1.patch


 Currently Hive only supports INSERT OVERWRITE destination. We should 
 also support INSERT [INTO] destination.





[jira] [Updated] (HIVE-872) Allow type widening on COALESCE/UNION ALL

2011-06-16 Thread Syed S. Albiz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed S. Albiz updated HIVE-872:
---

Status: Patch Available  (was: Open)

 Allow type widening on COALESCE/UNION ALL
 -

 Key: HIVE-872
 URL: https://issues.apache.org/jira/browse/HIVE-872
 Project: Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Syed S. Albiz
 Attachments: HIVE-872.1.patch, HIVE-872.6.patch


 Original request: We should allow 0L to be interpreted as a bigint constant.
 We have decided that the use cases for this do not merit modifications to 
 the QL. Instead, we enable type widening on the UDF COALESCE and on the 
 UNION ALL operator.
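
A minimal sketch of what type widening means here, assuming a simple linear ordering of numeric types. The ordering list and the function name are assumptions for illustration and are not taken from the patch; the point is only that mixed inputs resolve to the widest type present rather than being rejected.

```python
# Assumed widening order for illustration; not taken from the patch.
WIDENING_ORDER = ["tinyint", "smallint", "int", "bigint", "float", "double"]


def common_type(types):
    """Return the widest type among the inputs according to WIDENING_ORDER."""
    if not types:
        raise ValueError("at least one type is required")
    # list.index raises ValueError for a type outside the assumed ordering.
    return max(types, key=WIDENING_ORDER.index)


coalesce_type = common_type(["int", "bigint"])            # e.g. COALESCE(int_col, bigint_col)
union_type = common_type(["smallint", "int", "double"])   # e.g. three UNION ALL branches
```

Under this model an int literal such as 0 combined with a bigint column widens to bigint, which covers the original 0L use case without a new literal syntax.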





Re: [VOTE] Hive 0.7.1 Release Candidate 0

2011-06-16 Thread John Sichi
+1 from me.  I downloaded and verified build+run.  I did not verify tests or 
upgrades.

JVS

On Jun 15, 2011, at 12:52 AM, Carl Steinbach wrote:

 Apache Hive 0.7.1 Release Candidate 0 is available here:
 
 http://people.apache.org/~cws/hive-0.7.1-candidate-0/
 
 We need three +1 votes from Hive PMC members in order to release. Please
 vote.
 
 Thanks.
 
 Carl



Re: Review Request: HIVE-306 Support INSERT INTO

2011-06-16 Thread Ning Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/926/#review852
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
https://reviews.apache.org/r/926/#comment1861

Would it be difficult to extend this to support INSERT INTO DIRECTORY?


- Ning


On 2011-06-16 15:53:03, Franklin Hu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/926/
 ---
 
 (Updated 2011-06-16 15:53:03)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 New feature to allow INSERT INTO a table or partition
 
 
 This addresses bug HIVE-306.
 https://issues.apache.org/jira/browse/HIVE-306
 
 
 Diffs
 -
 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1133562 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java 1133562 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1133562 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1133562 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1133562 
   trunk/ql/src/test/queries/clientnegative/insert_into1.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/insert_into2.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/insert_into3.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/insert_into4.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into1.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into2.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into3.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into4.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into5.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/insert_into6.q PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/insert_into1.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/insert_into2.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/insert_into3.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/insert_into4.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into1.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into2.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into3.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into4.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into5.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/insert_into6.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/926/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Franklin
 




Re: [VOTE] Hive 0.7.1 Release Candidate 0

2011-06-16 Thread Carl Steinbach
+1 from me too.

We need one more +1 vote in order to release.

On Thu, Jun 16, 2011 at 11:31 AM, John Sichi jsi...@fb.com wrote:

 +1 from me.  I downloaded and verified build+run.  I did not verify tests
 or upgrades.

 JVS

 On Jun 15, 2011, at 12:52 AM, Carl Steinbach wrote:

  Apache Hive 0.7.1 Release Candidate 0 is available here:
 
  http://people.apache.org/~cws/hive-0.7.1-candidate-0/
 
  We need three +1 votes from Hive PMC members in order to release. Please
  vote.
 
  Thanks.
 
  Carl




RE: [VOTE] Hive 0.7.1 Release Candidate 0

2011-06-16 Thread Paul Yang
+1

-Original Message-
From: Carl Steinbach [mailto:c...@cloudera.com] 
Sent: Thursday, June 16, 2011 1:22 PM
To: dev@hive.apache.org
Subject: Re: [VOTE] Hive 0.7.1 Release Candidate 0

+1 from me too.

We need one more +1 vote in order to release.

On Thu, Jun 16, 2011 at 11:31 AM, John Sichi jsi...@fb.com wrote:

 +1 from me.  I downloaded and verified build+run.  I did not verify tests
 or upgrades.

 JVS

 On Jun 15, 2011, at 12:52 AM, Carl Steinbach wrote:

  Apache Hive 0.7.1 Release Candidate 0 is available here:
 
  http://people.apache.org/~cws/hive-0.7.1-candidate-0/
 
  We need three +1 votes from Hive PMC members in order to release. 
  Please vote.
 
  Thanks.
 
  Carl




Re: [VOTE] Hive 0.7.1 Release Candidate 0

2011-06-16 Thread Carl Steinbach
Thanks for voting everyone.

RC0 has been approved as the Hive 0.7.1 release after receiving three +1
votes from Hive PMC members.

I'll send out an official announcement once the tarballs hit the mirrors.

Thanks.

Carl

On Thu, Jun 16, 2011 at 2:48 PM, Paul Yang py...@fb.com wrote:

 +1

 -Original Message-
 From: Carl Steinbach [mailto:c...@cloudera.com]
 Sent: Thursday, June 16, 2011 1:22 PM
 To: dev@hive.apache.org
 Subject: Re: [VOTE] Hive 0.7.1 Release Candidate 0

 +1 from me too.

 We need one more +1 vote in order to release.

 On Thu, Jun 16, 2011 at 11:31 AM, John Sichi jsi...@fb.com wrote:

  +1 from me.  I downloaded and verified build+run.  I did not verify tests
  or upgrades.
 
  JVS
 
  On Jun 15, 2011, at 12:52 AM, Carl Steinbach wrote:
 
   Apache Hive 0.7.1 Release Candidate 0 is available here:
  
   http://people.apache.org/~cws/hive-0.7.1-candidate-0/
  
   We need three +1 votes from Hive PMC members in order to release.
   Please vote.
  
   Thanks.
  
   Carl
 
 



Build failed in Jenkins: Hive-branch-0.7.1-h0.21 #26

2011-06-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/26/changes

Changes:

[cws] HIVE-BUILD. Update README.txt for 0.7.1 release (cws)

[cws] HIVE-BUILD. Update release notes. Set version=0.7.1 (cws)

--
[...truncated 27705 lines...]
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-16_16-06-40_721_5866561282381092916/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] 2011-06-16 16:06:43,726 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-16_16-06-40_721_5866561282381092916/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106161606_453554462.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-branch-0.7.1-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-16_16-06-45_198_5058862957123114552/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-16_16-06-45_198_5058862957123114552/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] 

Re: Review Request: HIVE-2213: Optimize get_partition_names_ps()

2011-06-16 Thread Sohan Jain

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/
---

(Updated 2011-06-16 23:30:02.425588)


Review request for hive and Paul Yang.


Changes
---

-Fixed line that exceeded 100 chars


Summary
---

If a table has a large number of partitions, get_partition_names_ps() may take 
a long time to execute, because we get all of the partition names from the 
database. This is not very memory efficient, and the operation can be pushed 
down to the JDO layer without getting all of the names first.
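
The idea in the summary above can be sketched roughly as follows. This is only an illustration, not the patch itself: the class and both helper methods are hypothetical, the in-memory list stands in for the database scan, and the assumption is that a partial partition spec can be turned into a name pattern that the storage layer applies while scanning, so that non-matching names are never returned to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PartitionNamePushdownSketch {

    // Hypothetical helper: build a name pattern from a partial partition spec.
    // An empty value means "any value" for that partition key.
    static String buildNamePattern(List<String> partKeys, List<String> partialVals) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < partKeys.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            String val = i < partialVals.size() ? partialVals.get(i) : "";
            sb.append(Pattern.quote(partKeys.get(i))).append('=');
            sb.append(val.isEmpty() ? "[^/]*" : Pattern.quote(val));
        }
        return sb.toString();
    }

    // Stand-in for the storage layer: the pattern is applied during the scan,
    // so only matching names (at most max, or all of them if max < 0) come back.
    static List<String> listNamesMatching(List<String> storedNames, String pattern, int max) {
        Pattern compiled = Pattern.compile(pattern);
        List<String> out = new ArrayList<>();
        for (String name : storedNames) {
            if (max >= 0 && out.size() >= max) {
                break;
            }
            if (compiled.matcher(name).matches()) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList(
            "ds=2011-06-15/hr=11", "ds=2011-06-16/hr=11", "ds=2011-06-16/hr=12");
        String pattern = buildNamePattern(
            Arrays.asList("ds", "hr"), Arrays.asList("2011-06-16", ""));
        System.out.println(listNamesMatching(stored, pattern, -1));
    }
}
```

In a real metastore the pattern would be evaluated by the database query rather than by a Java loop; the loop here only marks where that filtering happens.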


This addresses bug HIVE-2213.
https://issues.apache.org/jira/browse/HIVE-2213


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
1135227 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1135227 

Diff: https://reviews.apache.org/r/878/diff


Testing
---

Passes previous test cases for get_partition_names_ps() in TestHiveMetaStore.


Thanks,

Sohan



[jira] [Updated] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread Sohan Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sohan Jain updated HIVE-2213:
-

Status: Patch Available  (was: Open)

 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread Sohan Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sohan Jain updated HIVE-2213:
-

Attachment: HIVE-2213.3.patch

-Fixed line that exceeded 100 chars

 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.





[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050803#comment-13050803
 ] 

jirapos...@reviews.apache.org commented on HIVE-2213:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/
---

(Updated 2011-06-16 23:30:02.425588)


Review request for hive and Paul Yang.


Changes
---

-Fixed line that exceeded 100 chars


Summary
---

If a table has a large number of partitions, get_partition_names_ps() may take 
a long time to execute, because we get all of the partition names from the 
database. This is not very memory efficient, and the operation can be pushed 
down to the JDO layer without getting all of the names first.


This addresses bug HIVE-2213.
https://issues.apache.org/jira/browse/HIVE-2213


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1135227 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
1135227 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1135227 

Diff: https://reviews.apache.org/r/878/diff


Testing
---

Passes previous test cases for get_partition_names_ps() in TestHiveMetaStore.


Thanks,

Sohan



 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.





Re: Review Request: HIVE-2261: Add API to metastore for table filtering based on table properties

2011-06-16 Thread Paul Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/910/#review855
---



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1868

Can we rename this to TableQueryFilterType so that it's clear that it's 
only used for tables?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1864

Where is this used?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1866

Hive doesn't really use the retention field. Can you remove operations on 
this field from the rest of the diff?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1869

The interface is a little odd because we have to use names like 'owner' or 
'retention' in addition to specifying the QueryFilterType. Maybe we should make 
the field that the QueryFilterType references be called 'field', so you'd have 
a filter like 'field = .*test_user.*' (for owner) or 'field  90' (for 
retention)



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/910/#comment1867

Style issue, { should be on same line as if



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
https://reviews.apache.org/r/910/#comment1870

JDO-174 looks like it was fixed a while back - is this still an issue?  
and  may be useful operators for the parameters field. (e.g. if retention 
were stored there instead of the member field)


- Paul


On 2011-06-16 03:13:24, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/910/
 ---
 
 (Updated 2011-06-16 03:13:24)
 
 
 Review request for hive and Paul Yang.
 
 
 Summary
 ---
 
 Create a function listTableNamesByFilter that returns a list of names for 
 tables in a database that match a certain filter. The syntax of the filter is 
 similar to the one created by HIVE-1609. You can filter the table list based 
 on owner, retention, or table parameter key/values. The filtering takes place 
 at the JDO level for efficiency/speed.  Added a QueryFilterType enum to 
 easily add new filters and separate logic for filtering.  
 
 Example filter statements include: 
 filterType = QueryFilterType.OWNER; filter = owner like .*test_user.*
 filterType = QueryFilterType.RETENTION; filter = retention  90 and 
 retention  30
 filterType = QueryFilterType.PARAMS; filter = numPartitions = \2\ and  
 retention_days = \30\
 
 The filter can currently parse string or integer values, where values 
 interpreted as strings must be in quotes.
 
 
 This addresses bug HIVE-2226.
 https://issues.apache.org/jira/browse/HIVE-2226
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
  1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
  1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
 1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
  1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g 
 1135227 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1135227 
 
 Diff: https://reviews.apache.org/r/910/diff
 
 
 Testing
 ---
 
 Added test cases to TestHiveMetaStore
 
 
 Thanks,
 
 Sohan
 




[jira] [Commented] (HIVE-2226) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050821#comment-13050821
 ] 

jirapos...@reviews.apache.org commented on HIVE-2226:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/910/#review855
---



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1868

Can we rename this to TableQueryFilterType so that it's clear that it's 
only used for tables?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1864

Where is this used?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1866

Hive doesn't really use the retention field. Can you remove operations on 
this field from the rest of the diff?



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/910/#comment1869

The interface is a little odd because we have to use names like 'owner' or 
'retention' in addition to specifying the QueryFilterType. Maybe we should make 
the field that the QueryFilterType references be called 'field', so you'd have 
a filter like 'field = .*test_user.*' (for owner) or 'field  90' (for 
retention)



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/910/#comment1867

Style issue, { should be on same line as if



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
https://reviews.apache.org/r/910/#comment1870

JDO-174 looks like it was fixed a while back - is this still an issue? bq.  
and  may be useful operators for the parameters field. (e.g. if retention 
were stored there instead of the member field)


- Paul


On 2011-06-16 03:13:24, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/910/
bq.  ---
bq.  
bq.  (Updated 2011-06-16 03:13:24)
bq.  
bq.  
bq.  Review request for hive and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Create a function listTableNamesByFilter that returns a list of names for 
tables in a database that match a certain filter. The syntax of the filter is 
similar to the one created by HIVE-1609. You can filter the table list based on 
owner, retention, or table parameter key/values. The filtering takes place at 
the JDO level for efficiency/speed.  Added a QueryFilterType enum to easily add 
new filters and separate logic for filtering.  
bq.  
bq.  Example filter statements include: 
bq.  filterType = QueryFilterType.OWNER; filter = owner like .*test_user.*
bq.  filterType = QueryFilterType.RETENTION; filter = retention  90 and 
retention  30
bq.  filterType = QueryFilterType.PARAMS; filter = numPartitions = \2\ and  
retention_days = \30\
bq.  
bq.  The filter can currently parse string or integer values, where values 
interpreted as strings must be in quotes.
bq.  
bq.  
bq.  This addresses bug HIVE-2226.
bq.  https://issues.apache.org/jira/browse/HIVE-2226
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.trunk/metastore/if/hive_metastore.thrift 1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1135227 
bq.trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
 1135227 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g 
1135227 
bq.
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1135227 
bq.  
bq.  Diff: https://reviews.apache.org/r/910/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added test cases to TestHiveMetaStore
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.



 Add API to retrieve table names by an arbitrary filter, e.g., by owner, 
 retention, parameters, etc.
 ---

 Key: HIVE-2226
 URL: https://issues.apache.org/jira/browse/HIVE-2226
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2226.1.patch


 Create a function called get_table_names_by_filter that 

[jira] [Updated] (HIVE-872) Allow type widening on COALESCE/UNION ALL

2011-06-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-872:


Status: Open  (was: Patch Available)

3 tests failed:

explode_null.q
udf_coalesce.q
union2.q


 Allow type widening on COALESCE/UNION ALL
 -

 Key: HIVE-872
 URL: https://issues.apache.org/jira/browse/HIVE-872
 Project: Hive
  Issue Type: New Feature
Reporter: Zheng Shao
Assignee: Syed S. Albiz
 Attachments: HIVE-872.1.patch, HIVE-872.6.patch


 Original request: We should allow 0L to be interpreted as a bigint constant.
 Instead of this, we have decided that the use cases for this do not merit 
 modifications to the QL. Instead, we enable type widening on the UDF COALESCE 
 and on the UNION ALL operator.
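
As a rough illustration of what type widening means here, the sketch below picks the wider of two numeric types so that, for example, an int branch and a bigint branch can be combined as bigint. The class, the method name, and the simplified widening order are assumptions made for this example; the actual rules applied by the patch may differ.

```java
import java.util.Arrays;
import java.util.List;

class TypeWideningSketch {

    // Assumed widening order, narrowest first (simplified for illustration).
    static final List<String> NUMERIC_ORDER =
        Arrays.asList("tinyint", "smallint", "int", "bigint", "float", "double");

    // Returns the wider of the two numeric types.
    static String commonType(String a, String b) {
        int ia = NUMERIC_ORDER.indexOf(a);
        int ib = NUMERIC_ORDER.indexOf(b);
        if (ia < 0 || ib < 0) {
            throw new IllegalArgumentException("Not a numeric type: " + (ia < 0 ? a : b));
        }
        return NUMERIC_ORDER.get(Math.max(ia, ib));
    }

    public static void main(String[] args) {
        // An int literal 0 combined with a bigint column would widen to bigint.
        System.out.println(commonType("int", "bigint"));
    }
}
```

With widening of this kind, COALESCE(bigint_col, 0) does not need a 0L literal, which is the point of the decision described above.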





[jira] [Created] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)
Remove ProgressCounter enum in Operator
---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0


After HIVE-1701, there is no use in keeping a heavy counterNameToEnum hashmap. 
We can use strings directly, since the enum was only a hack for Hadoop 0.17. 
The strings will be human-readable in jobdetails.jsp instead of C1, C2, ... 
C1000.
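
A minimal sketch of the difference, under stated assumptions: the class below is hypothetical and keeps counters in a plain map keyed by a readable name, which is the shape of bookkeeping that becomes possible once counters no longer have to be mapped onto a fixed pool of enum constants (C1 ... C1000). The counter name used in main is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StringCounterSketch {

    // Counters are keyed directly by their readable name; no name-to-enum map.
    private final Map<String, Long> counters = new LinkedHashMap<>();

    void increment(String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }

    long get(String name) {
        return counters.getOrDefault(name, 0L);
    }

    public static void main(String[] args) {
        StringCounterSketch c = new StringCounterSketch();
        // Hypothetical counter name; it is what a job page would display.
        c.increment("OP_1_NUM_INPUT_ROWS", 5);
        c.increment("OP_1_NUM_INPUT_ROWS", 3);
        System.out.println("OP_1_NUM_INPUT_ROWS = " + c.get("OP_1_NUM_INPUT_ROWS"));
    }
}
```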





[jira] [Updated] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2227:
---

Status: Patch Available  (was: Open)

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.





[jira] [Updated] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2227:
---

Status: Open  (was: Patch Available)

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.





[jira] [Updated] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2227:
---

Status: Patch Available  (was: Open)

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.





[jira] [Updated] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2227:
---

Attachment: HIVE-2227-1.patch

Here is a patch.

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2227-1.patch


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.





[jira] [Updated] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2227:
---

Status: Open  (was: Patch Available)

Not reviewed.

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2227-1.patch


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.





Re: Review Request: HIVE-2261: Add API to metastore for table filtering based on table properties

2011-06-16 Thread Sohan Jain


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/if/hive_metastore.thrift, line 46
  https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line46
 
  Can we rename this to TableQueryFilterType so that it's clear that it's 
  only used for tables?

Sure thing.


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/if/hive_metastore.thrift, line 53
  https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line53
 
  Where is this used?

Ah, that's not supposed to be there; I will remove it.


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/if/hive_metastore.thrift, lines 267-270
  https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line267
 
  Hive doesn't really use the retention field. Can you remove operations 
  on this field from the rest of the diff?

Yep.


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/if/hive_metastore.thrift, line 277
  https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line277
 
  The interface is a little odd because we have to use names like 'owner' 
  or 'retention' in addition to specifying the QueryFilterType. Maybe we 
  should make the field that the QueryFilterType references be called 
  'field', so you'd have a filter like 'field = .*test_user.*' (for owner) 
  or 'field  90' (for retention)

Actually, with the current implementation, the field in the filter string can 
be arbitrarily named; the field name is parsed out by the antlr grammar and 
renamed according to TableQueryFilterType.  So if TableQueryFilterType is 
OWNER, then the following filters are equivalent: 
owner like \.*test.*\, 
key like \.*test.*\, 
field like \.*test.*\,
etc.
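
The renaming described above can be sketched like this. It is not the ANTLR-based implementation from the patch: a regular expression is swapped in for the grammar, the FilterType enum is a hypothetical stand-in for TableQueryFilterType, and the canonical field names are assumptions. The sketch only handles filters of the form "name operator value" with whitespace after the name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FilterFieldRenameSketch {

    // Hypothetical stand-in for TableQueryFilterType; field names are assumed.
    enum FilterType {
        OWNER("owner"),
        RETENTION("retention");

        final String canonicalField;

        FilterType(String canonicalField) {
            this.canonicalField = canonicalField;
        }
    }

    // Leading identifier, whitespace, then the rest (operator and value).
    private static final Pattern LEADING_FIELD =
        Pattern.compile("^\\s*([A-Za-z_][A-Za-z0-9_]*)\\s+(.*)$");

    // Replaces whatever field name the caller wrote with the canonical one.
    static String normalize(FilterType type, String filter) {
        Matcher m = LEADING_FIELD.matcher(filter);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized filter: " + filter);
        }
        return type.canonicalField + " " + m.group(2);
    }

    public static void main(String[] args) {
        // All three spellings normalize to the same filter for OWNER.
        System.out.println(normalize(FilterType.OWNER, "owner like \".*test.*\""));
        System.out.println(normalize(FilterType.OWNER, "key like \".*test.*\""));
        System.out.println(normalize(FilterType.OWNER, "field like \".*test.*\""));
    }
}
```

Because the written field name is discarded, the filter type alone decides which attribute is compared, which is why the three filters listed above behave identically.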


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
   line 1203
  https://reviews.apache.org/r/910/diff/1/?file=21193#file21193line1203
 
  Style issue, { should be on same line as if

Will fix.


 On 2011-06-17 00:13:20, Paul Yang wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java,
   lines 189-197
  https://reviews.apache.org/r/910/diff/1/?file=21198#file21198line189
 
  JDO-174 looks like it was fixed a while back - is this still an issue? 
   and  may be useful operators for the parameters field. (e.g. if 
  retention were stored there instead of the member field)

I agree that the  and  could be pretty useful operators here.  The issue was 
said to be resolved over 5 years ago, but surprisingly I think it just closed a 
few days ago 
https://issues.apache.org/jira/browse/JDO-174?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel#issue-tabs.
  

In any case, comparing the result of a map.get() with anything other than 
equality throws the same error as the one described in JDO-174.  Trying to cast 
the result of map.get() to a String with ((String) 
this.parameters.get(keyName)  value throws the following seemingly 
unresolvable error:

org.datanucleus.store.exceptions.NoTableManagedException: Persistent class 
java.lang.String has no table in the database, but the operation requires it. 
Please check the specification of the MetaData for this class.


- Sohan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/910/#review855
---


On 2011-06-16 03:13:24, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/910/
 ---
 
 (Updated 2011-06-16 03:13:24)
 
 
 Review request for hive and Paul Yang.
 
 
 Summary
 ---
 
 Create a function listTableNamesByFilter that returns a list of names for 
 tables in a database that match a certain filter. The syntax of the filter is 
 similar to the one created by HIVE-1609. You can filter the table list based 
 on owner, retention, or table parameter key/values. The filtering takes place 
 at the JDO level for efficiency/speed.  Added a QueryFilterType enum to 
 easily add new filters and separate logic for filtering.  
 
 Example filter statements include: 
 filterType = QueryFilterType.OWNER; filter = owner like .*test_user.*
 filterType = QueryFilterType.RETENTION; filter = retention  90 and 
 retention  30
 filterType = QueryFilterType.PARAMS; filter = numPartitions = \2\ and  
 retention_days = \30\
 
 The filter can currently parse string or integer values, where values 
 interpreted as strings must be in quotes.
 
 
 This addresses bug HIVE-2226.
 https://issues.apache.org/jira/browse/HIVE-2226
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1135227 
   
 

[jira] [Commented] (HIVE-2226) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050830#comment-13050830
 ] 

jirapos...@reviews.apache.org commented on HIVE-2226:
-



bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   trunk/metastore/if/hive_metastore.thrift, line 46
bq.   https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line46
bq.  
bq.   Can we rename this to TableQueryFilterType so that it's clear that 
it's only used for tables?

Sure thing.


bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   trunk/metastore/if/hive_metastore.thrift, line 53
bq.   https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line53
bq.  
bq.   Where is this used?

Ah, that's not supposed to be there; I will remove it.


bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   trunk/metastore/if/hive_metastore.thrift, lines 267-270
bq.   https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line267
bq.  
bq.   Hive doesn't really use the retention field. Can you remove 
operations on this field from the rest of the diff?

Yep.


bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   trunk/metastore/if/hive_metastore.thrift, line 277
bq.   https://reviews.apache.org/r/910/diff/1/?file=21192#file21192line277
bq.  
bq.   The interface is a little odd because we have to use names like 
'owner' or 'retention' in addition to specifying the QueryFilterType. Maybe we 
should make the field that the QueryFilterType references be called 'field', so 
you'd have a filter like 'field = .*test_user.*' (for owner) or 'field  90' 
(for retention)

Actually, with the current implementation, the field in the filter string can 
be arbitrarily named; the field name is parsed out by the antlr grammar and 
renamed according to TableQueryFilterType.  So if TableQueryFilterType is 
OWNER, then the following filters are equivalent: 
owner like \.*test.*\, 
key like \.*test.*\, 
field like \.*test.*\,
etc.


bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, 
line 1203
bq.   https://reviews.apache.org/r/910/diff/1/?file=21193#file21193line1203
bq.  
bq.   Style issue, { should be on same line as if

Will fix.


bq.  On 2011-06-17 00:13:20, Paul Yang wrote:
bq.   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java, lines 189-197
bq.   https://reviews.apache.org/r/910/diff/1/?file=21198#file21198line189
bq.  
bq.   JDO-174 looks like it was fixed a while back - is this still an 
issue? < and > may be useful operators for the parameters field (e.g. if 
retention were stored there instead of the member field).

I agree that < and > could be pretty useful operators here.  The issue was 
said to be resolved over 5 years ago, but surprisingly I think it just closed a 
few days ago 
https://issues.apache.org/jira/browse/JDO-174?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel#issue-tabs.
  

In any case, comparing the result of a map.get() with anything other than 
equality throws the same error as the one described in JDO-174.  Trying to cast 
the result of map.get() to a String with ((String) 
this.parameters.get(keyName)) > value throws the following seemingly 
unresolvable error:

org.datanucleus.store.exceptions.NoTableManagedException: Persistent class 
java.lang.String has no table in the database, but the operation requires it. 
Please check the specification of the MetaData for this class.


- Sohan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/910/#review855
---


On 2011-06-16 03:13:24, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/910/
bq.  ---
bq.  
bq.  (Updated 2011-06-16 03:13:24)
bq.  
bq.  
bq.  Review request for hive and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Create a function listTableNamesByFilter that returns a list of names for 
tables in a database that match a certain filter. The syntax of the filter is 
similar to the one created by HIVE-1609. You can filter the table list based on 
owner, retention, or table parameter key/values. The filtering takes place at 
the JDO level for efficiency/speed.  Added a QueryFilterType enum to easily add 
new filters and separate logic for filtering.  
bq.  
bq.  Example filter statements include: 
bq.  filterType = QueryFilterType.OWNER; filter = owner like .*test_user.*
bq.  filterType = QueryFilterType.RETENTION; filter = retention  90 and 
retention  30
bq.  filterType = QueryFilterType.PARAMS; filter = numPartitions = \2\ and  

Re: Review Request: HIVE-2213: Optimize get_partition_names_ps()

2011-06-16 Thread Paul Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/#review858
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
https://reviews.apache.org/r/878/#comment1877

Can we make this method parameterized to reduce the number of casts 
required? E.g.

private <T> Collection<T> getPartition...

We might have to do something like <String>getPartition... when making the 
call though.
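
A minimal Java sketch of the suggestion above, assuming a helper that currently 
hands back untyped results. The class GenericLookup and the method 
getPartitionField are hypothetical stand-ins for the elided getPartition... 
method in ObjectStore; the point is only the generic-method declaration and the 
explicit type argument at the call site.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in; not the ObjectStore code under review.
final class GenericLookup {
    // Declaring the type parameter on the method confines the unchecked cast
    // to one place, so callers no longer cast each element themselves.
    @SuppressWarnings("unchecked")
    private <T> Collection<T> getPartitionField(List<Object> rawResults) {
        Collection<T> typed = new ArrayList<T>();
        for (Object o : rawResults) {
            typed.add((T) o);
        }
        return typed;
    }

    Collection<String> partitionNames(List<Object> rawResults) {
        // Explicit type argument; required where the compiler cannot infer T.
        return this.<String>getPartitionField(rawResults);
    }

    public static void main(String[] args) {
        List<Object> raw = Arrays.<Object>asList("ds=2011-06-16", "ds=2011-06-17");
        System.out.println(new GenericLookup().partitionNames(raw));
    }
}
```

Note that Java requires a qualifier (this., a class name, or an object) before 
an explicit type argument, so the call is written this.<String>name(...).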


- Paul


On 2011-06-16 23:30:02, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/878/
 ---
 
 (Updated 2011-06-16 23:30:02)
 
 
 Review request for hive and Paul Yang.
 
 
 Summary
 ---
 
 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database. This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.
 
 
 This addresses bug HIVE-2213.
 https://issues.apache.org/jira/browse/HIVE-2213
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
 1135227 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
 1135227 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1135227 
 
 Diff: https://reviews.apache.org/r/878/diff
 
 
 Testing
 ---
 
 Passes previous test cases for get_partition_names_ps() in TestHiveMetaStore.
 
 
 Thanks,
 
 Sohan
 




[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050841#comment-13050841
 ] 

jirapos...@reviews.apache.org commented on HIVE-2213:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/#review858
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
https://reviews.apache.org/r/878/#comment1877

Can we make this method parameterized to reduce the number of casts 
required? E.g.

private <T> Collection<T> getPartition...

We might have to do something like <String>getPartition... when making the 
call though.


- Paul


On 2011-06-16 23:30:02, Sohan Jain wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/878/
bq.  ---
bq.  
bq.  (Updated 2011-06-16 23:30:02)
bq.  
bq.  
bq.  Review request for hive and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If a table has a large number of partitions, get_partition_names_ps() may 
take a long time to execute, because we get all of the partition names from the 
database. This is not very memory efficient, and the operation can be pushed 
down to the JDO layer without getting all of the names first.
bq.  
bq.  
bq.  This addresses bug HIVE-2213.
bq.  https://issues.apache.org/jira/browse/HIVE-2213
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1135227 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1135227 
bq.    trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1135227 
bq.  
bq.  Diff: https://reviews.apache.org/r/878/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Passes previous test cases for get_partition_names_ps() in 
TestHiveMetaStore.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.



 Optimize get_partition_names_ps()
 -

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: HIVE-2227 Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/931/
---

Review request for hive and Yongqiang He.


Summary
---

After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. We 
can use string directly, for the enum is only a hack for hadoop 0.17. The 
string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
C1000.
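
A minimal sketch of the idea, using a plain in-memory map rather than Hadoop's 
counter API. The class CounterSketch, the key format, and the operator and 
counter names are assumptions for illustration; none of them are taken from the 
patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; does not use any Hadoop or Hive classes.
final class CounterSketch {
    private final Map<String, Long> counters = new LinkedHashMap<String, Long>();

    // Counters are keyed by a readable string, so no name-to-enum table is needed.
    void increment(String operatorId, String counterName, long delta) {
        String key = operatorId + "_" + counterName;
        Long current = counters.get(key);
        counters.put(key, (current == null ? 0L : current) + delta);
    }

    // Returns a copy so callers cannot mutate the live counters.
    Map<String, Long> snapshot() {
        return new LinkedHashMap<String, Long>(counters);
    }

    public static void main(String[] args) {
        CounterSketch c = new CounterSketch();
        c.increment("FS_3", "NUM_INPUT_ROWS", 10);
        c.increment("FS_3", "NUM_INPUT_ROWS", 5);
        System.out.println(c.snapshot());
    }
}
```

With string keys the name shown to the user is the key itself, which is what 
makes the counters readable without a separate mapping step.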


This addresses bug 2227.
https://issues.apache.org/jira/browse/2227


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
 1136381 

Diff: https://reviews.apache.org/r/931/diff


Testing
---


Thanks,

Zhuoluo



Sets(Collection) Support

2011-06-16 Thread Sophia Cui
Hi,

I've been working with hive lately.  There's currently support for Maps and
Lists as collections.  I was wondering how I would go about building support
for Sets?  Will I need to build ObjectInspectors? Or at the very least have
the parser treat sets as Lists.

Thanks!
  Sophia


[jira] [Commented] (HIVE-2219) Make alter table drop partition more efficient

2011-06-16 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050847#comment-13050847
 ] 

Paul Yang commented on HIVE-2219:
-

Can you make a reviewboard instance?

 Make alter table drop partition more efficient
 

 Key: HIVE-2219
 URL: https://issues.apache.org/jira/browse/HIVE-2219
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2219.1.patch


 The current function dropTable() that handles dropping multiple partitions is 
 somewhat inefficient.  For each partition you want to drop, it loops through 
 each partition in the table to see if the partition exists.  This is an 
 _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is 
 the number of partitions in the table.  The running time of this function can 
 be improved, which is useful for tables with many partitions.
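
 One way to avoid the nested scan is to index the table's partitions once and 
 then look up each requested partition by key. The sketch below is an 
 assumption about how that could look, not the HIVE-2219 patch: the class 
 PartitionDropPlanner, the use of partition-name strings as keys, and the 
 restriction to full (not partial) partition specs are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; handles full partition names only.
final class PartitionDropPlanner {
    // Builds the index in O(n), then performs m membership checks in O(1)
    // expected time each, for O(m + n) overall instead of O(mn).
    static List<String> existingToDrop(List<String> tablePartitions, List<String> requested) {
        Set<String> index = new HashSet<String>(tablePartitions);
        List<String> result = new ArrayList<String>();
        for (String name : requested) {
            if (index.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("ds=2011-06-14", "ds=2011-06-15", "ds=2011-06-16");
        List<String> requested = Arrays.asList("ds=2011-06-15", "ds=2011-06-30");
        System.out.println(existingToDrop(table, requested));
    }
}
```

 A partial partition spec can match several partitions, so an exact-match set 
 like this one would not be enough for that case.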

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-2227 Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/931/
---

(Updated 2011-06-17 02:49:48.793536)


Review request for hive and Yongqiang He.


Changes
---

Update bug number


Summary
---

After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. We 
can use string directly, for the enum is only a hack for hadoop 0.17. The 
string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
C1000.


This addresses bug HIVE-2227.
https://issues.apache.org/jira/browse/HIVE-2227


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
 1136381 

Diff: https://reviews.apache.org/r/931/diff


Testing
---


Thanks,

Zhuoluo



[jira] [Commented] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050876#comment-13050876
 ] 

jirapos...@reviews.apache.org commented on HIVE-2227:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/931/
---

(Updated 2011-06-17 02:49:48.793536)


Review request for hive and Yongqiang He.


Changes
---

Update bug number


Summary
---

After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. We 
can use string directly, for the enum is only a hack for hadoop 0.17. The 
string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
C1000.


This addresses bug HIVE-2227.
https://issues.apache.org/jira/browse/HIVE-2227


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
 1136381 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
 1136381 

Diff: https://reviews.apache.org/r/931/diff


Testing
---


Thanks,

Zhuoluo



 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2227-1.patch


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2227) Remove ProgressCounter enum in Operator

2011-06-16 Thread Zhuoluo (Clark) Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050875#comment-13050875
 ] 

Zhuoluo (Clark) Yang commented on HIVE-2227:


Review board
https://reviews.apache.org/r/931/

 Remove ProgressCounter enum in Operator
 ---

 Key: HIVE-2227
 URL: https://issues.apache.org/jira/browse/HIVE-2227
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Zhuoluo (Clark) Yang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2227-1.patch


 After HIVE-1701, it is of no use to keep a heavy counterNameToEnum hashmap. 
 We can use string directly, for the enum is only a hack for hadoop 0.17. The 
 string will be human readable in the jobdetails.jsp instead of C1, C2, ... 
 C1000.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2219) Make alter table drop partition more efficient

2011-06-16 Thread Sohan Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sohan Jain updated HIVE-2219:
-

Status: Open  (was: Patch Available)

 Make alter table drop partition more efficient
 

 Key: HIVE-2219
 URL: https://issues.apache.org/jira/browse/HIVE-2219
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2219.1.patch


 The current function dropTable() that handles dropping multiple partitions is 
 somewhat inefficient.  For each partition you want to drop, it loops through 
 each partition in the table to see if the partition exists.  This is an 
 _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is 
 the number of partitions in the table.  The running time of this function can 
 be improved, which is useful for tables with many partitions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2219) Make alter table drop partition more efficient

2011-06-16 Thread Sohan Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050874#comment-13050874
 ] 

Sohan Jain commented on HIVE-2219:
--

Ah sorry, after another round of testing, I realized this doesn't work 
correctly at all for partial partition specs!  I will re-implement it and test 
again for speed / full correctness.

 Make alter table drop partition more efficient
 

 Key: HIVE-2219
 URL: https://issues.apache.org/jira/browse/HIVE-2219
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2219.1.patch


 The current function dropTable() that handles dropping multiple partitions is 
 somewhat inefficient.  For each partition you want to drop, it loops through 
 each partition in the table to see if the partition exists.  This is an 
 _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is 
 the number of partitions in the table.  The running time of this function can 
 be improved, which is useful for tables with many partitions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira