[jira] [Assigned] (HIVE-2125) alter table concatenate fails and deletes data

2011-04-22 Thread Joydeep Sen Sarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma reassigned HIVE-2125:
---

Assignee: He Yongqiang

 alter table concatenate fails and deletes data
 --

 Key: HIVE-2125
 URL: https://issues.apache.org/jira/browse/HIVE-2125
 Project: Hive
  Issue Type: Bug
Reporter: Joydeep Sen Sarma
Assignee: He Yongqiang
Priority: Critical

 This command does not set the number of reducers (unlike other Hive 
 queries). Since mapred.reduce.tasks=-1 (to let Hive infer the value 
 automatically), the JobTracker fails the job because the number of reducers 
 cannot be negative.
 hive> alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
 alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
 Starting Job = job_201103101203_453180, Tracking URL = 
 http://curium.data.facebook.com:50030/jobdetails.jsp?jobid=job_201103101203_453180
 Kill Command = /mnt/vol/hive/sites/curium/hadoop/bin/../bin/hadoop job  
 -Dmapred.job.tracker=curium.data.facebook.com:50029 -kill 
 job_201103101203_453180
 Hadoop job information for null: number of mappers: 0; number of reducers: 0
 2011-04-22 10:21:24,046 null map = 100%,  reduce = 100%
 Ended Job = job_201103101203_453180 with errors
 Moved to trash: /user/facebook/warehouse/ad_imps_2/_backup.ds=2009-06-16
 After the job fails, the partition is deleted.
 Thankfully it's still in the trash.
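
The failure comes from submitting the job with the literal value -1. Below is a minimal sketch of the kind of resolution step that has to happen before submission, written in Python for brevity (Hive itself is Java). The function name `resolve_reducer_count` and the default values are hypothetical. The sizing rule follows the idea behind the `hive.exec.reducers.bytes.per.reducer` and `hive.exec.reducers.max` settings and is not a copy of Hive's code. How the actual fix for this command picks the value is not stated in this thread.

```python
import math


def resolve_reducer_count(configured, input_bytes,
                          bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Return a non-negative reducer count that is safe to submit with a job.

    configured        -- value of mapred.reduce.tasks (-1 means "infer it")
    input_bytes       -- total size of the job input
    bytes_per_reducer -- assumed average load per reducer (illustrative default)
    max_reducers      -- assumed upper bound (illustrative default)
    """
    if configured >= 0:
        # An explicit setting wins.
        return configured
    # Negative means "not set": estimate from the input size, clamped to
    # the range [1, max_reducers] so the job never carries a negative value.
    estimated = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, estimated))
```

With these assumed defaults, a 2.5 GB input and `configured=-1` resolves to 3 reducers, so the value sent to the JobTracker is never negative.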

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-2125) alter table concatenate fails and deletes data

2011-04-22 Thread Joydeep Sen Sarma (JIRA)
alter table concatenate fails and deletes data
--

 Key: HIVE-2125
 URL: https://issues.apache.org/jira/browse/HIVE-2125
 Project: Hive
  Issue Type: Bug
Reporter: Joydeep Sen Sarma
Priority: Critical


This command does not set the number of reducers (unlike other Hive queries). 
Since mapred.reduce.tasks=-1 (to let Hive infer the value automatically), the 
JobTracker fails the job because the number of reducers cannot be negative.

hive> alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
Starting Job = job_201103101203_453180, Tracking URL = 
http://curium.data.facebook.com:50030/jobdetails.jsp?jobid=job_201103101203_453180
Kill Command = /mnt/vol/hive/sites/curium/hadoop/bin/../bin/hadoop job  
-Dmapred.job.tracker=curium.data.facebook.com:50029 -kill 
job_201103101203_453180
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2011-04-22 10:21:24,046 null map = 100%,  reduce = 100%
Ended Job = job_201103101203_453180 with errors
Moved to trash: /user/facebook/warehouse/ad_imps_2/_backup.ds=2009-06-16
After the job fails, the partition is deleted.

Thankfully it's still in the trash.
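
The log suggests that the `_backup.` copy is discarded even though the merge job failed. Below is a minimal sketch of a safer ordering, in Python on a local directory for brevity. `concatenate_partition` and `merge_job` are hypothetical names, and this is not Hive's actual code path. It only illustrates restoring the backup when the job fails and dropping it only on success.

```python
import os
import shutil


def concatenate_partition(partition_dir, merge_job):
    """Replace partition_dir with merge_job's output, restoring it on failure.

    merge_job(src_dir, dst_dir) is a caller-supplied function (hypothetical)
    that reads the original files from src_dir and writes merged files to
    dst_dir. It is expected to raise an exception if the job fails.
    """
    parent, name = os.path.split(partition_dir.rstrip(os.sep))
    backup_dir = os.path.join(parent, "_backup." + name)

    os.rename(partition_dir, backup_dir)        # set the original data aside
    try:
        os.mkdir(partition_dir)
        merge_job(backup_dir, partition_dir)    # may raise if the job fails
    except Exception:
        # Failure: discard partial output and put the original data back.
        shutil.rmtree(partition_dir, ignore_errors=True)
        os.rename(backup_dir, partition_dir)
        raise
    # Success: only now is it safe to drop the backup.
    shutil.rmtree(backup_dir)
```

The design choice is that the backup is deleted in exactly one place, after the job has reported success.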



[jira] [Commented] (HIVE-2038) Metastore listener

2011-04-22 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023320#comment-13023320
 ] 

Carl Steinbach commented on HIVE-2038:
--

I think the meaning of finalizing a partition is actually defined by the 
metastore client, since it's the client that has to call finalizePartition(). 
But when is the client supposed to call this? What happens if you have multiple 
listeners registered, each with a different idea of what it means to 
finalize a partition? I think the main problem with this is that the name of 
the method gives the impression that this is somehow well defined, when in fact 
the definition is left completely up to the application.

It sounds like what you actually want is a mechanism that allows the metastore 
client to send application specific events to metastore listeners. Is this 
accurate?



 Metastore listener
 --

 Key: HIVE-2038
 URL: https://issues.apache.org/jira/browse/HIVE-2038
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.8.0

 Attachments: hive-2038.patch, metastore_listener.patch, 
 metastore_listener.patch, metastore_listener.patch


 Provide a way to observe changes happening on the Metastore



[jira] [Created] (HIVE-2126) Hive's symlink text input format should be able to work with CombineHiveInputFormat

2011-04-22 Thread He Yongqiang (JIRA)
Hive's symlink text input format should be able to work with 
CombineHiveInputFormat
--

 Key: HIVE-2126
 URL: https://issues.apache.org/jira/browse/HIVE-2126
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: He Yongqiang


At compile time, if a partition's file format is SymlinkTextInputFormat, Hive 
will replace the symlink path with the paths listed in the symlink file. This 
way, it will work with Hive's HiveCombineFileInputFormat.

The reasons for doing this at compile time are:
1) At run time, the input path is used not only to get the record reader but 
also by Hive to look up aliases and thus the operator tree. But 
CombineHiveInputFormat can have multiple paths for each split, and when 
switching paths it also sets the new input file name on the job. So it always 
requires a real input path name; it cannot be faked.
2) Writing a new input format would require duplicating a lot of the work in 
the existing CombineHiveInputFormat.
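
The compile-time substitution described above can be sketched as follows, in Python for brevity. This assumes each symlink file is a plain-text file listing one target path per line. The function name `resolve_symlink_paths` is hypothetical, and the sketch works on a local directory rather than through Hive's planner or the Hadoop FileSystem API.

```python
import os


def resolve_symlink_paths(partition_dir):
    """Return the real input paths listed in the symlink files of a partition.

    Assumption: every regular file in partition_dir is a symlink file that
    contains one target path per line. Blank lines are ignored.
    """
    targets = []
    for name in sorted(os.listdir(partition_dir)):
        link_file = os.path.join(partition_dir, name)
        if not os.path.isfile(link_file):
            continue
        with open(link_file) as f:
            for line in f:
                path = line.strip()
                if path:
                    targets.append(path)
    return targets
```

The planner would then register the returned paths as the job's inputs in place of `partition_dir`, so that a combining input format only ever sees real file paths.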




Build failed in Jenkins: Hive-0.7.0-h0.20 #83

2011-04-22 Thread Apache Jenkins Server
See https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/83/

--
[...truncated 26989 lines...]
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 INTO TABLE src
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.src
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 INTO TABLE src
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt'
 INTO TABLE src1
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table default.src1
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt'
 INTO TABLE src1
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq'
 INTO TABLE src_sequencefile
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table default.src_sequencefile
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq'
 INTO TABLE src_sequencefile
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq'
 INTO TABLE src_thrift
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq
[junit] Loading data to table default.src_thrift
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq'
 INTO TABLE src_thrift
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt'
 INTO TABLE src_json
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt
[junit] Loading data to table default.src_json
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt'
 INTO TABLE src_json
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/test/logs/negative/wrong_distinct1.q.out
 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ql/src/test/results/compiler/errors/wrong_distinct1.q.out
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/tmp/hive_job_log_hudson_201104221207_784944570.txt
[junit] Done query: wrong_distinct1.q
[junit] Begin query: wrong_distinct2.q
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/tmp/hive_job_log_hudson_201104221207_731004933.txt
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='11')
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Copying file: 
https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.srcpart partition (ds=2008-04-08, 
hr=11)
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 

Build failed in Jenkins: Hive-trunk-h0.20 #686

2011-04-22 Thread Apache Jenkins Server
See https://builds.apache.org/hudson/job/Hive-trunk-h0.20/686/

--
[...truncated 30060 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-22_12-28-43_332_4721842587008106825/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-04-22 12:28:46,419 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-22_12-28-43_332_4721842587008106825/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104221228_1306966870.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-22_12-28-47_967_3588446909744167948/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-22_12-28-47_967_3588446909744167948/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104221228_594334426.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE

[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023355#comment-13023355
 ] 

John Sichi commented on HIVE-1803:
--

Meh, I'm still getting numRows failures myself.  I noticed that your patch 
includes some changes to existing test outputs (e.g. bucketmapjoin1.q.out) 
where it is setting the expected numRows to 0; you should have reverted those 
before generating the patch.  But the failure I got was in another existing 
test (filter_join_breaktask).  I'm trying again after reverting the ones you 
changed (in case the failure I saw was a side effect), but I'm pessimistic; I'm 
wondering if something innocuous about the change is somehow exposing some 
existing non-determinism.


 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, 
 HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, 
 HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, 
 bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, 
 unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.



Re: Review Request: HIVE-1644 Use filter pushdown for automatically accessing indexes

2011-04-22 Thread John Sichi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/558/#review530
---



ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
https://reviews.apache.org/r/558/#comment1106

Create a followup task for dealing with jobs which access multiple tables.  
For that, we need to associate the index formats/files with specific tables, 
and that requires modifying the way the index input format works.




ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
https://reviews.apache.org/r/558/#comment1105

Create a followup task for displaying these in the plan (to indicate that a 
table scan's input is being filtered by the intermediate file).  We only want 
to do that when they are non-null (to avoid upsetting all the existing test 
reference files).





ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java
https://reviews.apache.org/r/558/#comment1099

spacing



ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java
https://reviews.apache.org/r/558/#comment1100

spacing




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1102

spacing




ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1101

When logging errors being propagated, use the two-arg version of the method 
and pass e as the second arg.  Same thing in a few other places.
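
For illustration, the two-argument form keeps the stack trace in the log instead of only the message text. In Java logging APIs such as commons-logging this is `LOG.error(message, e)`. The sketch below shows the same idea with Python's standard `logging` module, where the exception is passed through `exc_info`. The logger name, function name, and message are hypothetical and are not taken from the patch under review.

```python
import logging

LOG = logging.getLogger("index_where_sketch")  # hypothetical logger name


def rewrite_with_index(rewrite):
    """Run rewrite(); on failure log message plus traceback, then re-raise."""
    try:
        return rewrite()
    except Exception as e:
        # Logging only the message would lose where the error came from.
        # Passing the exception as well records the full stack trace.
        LOG.error("Index rewrite failed", exc_info=e)
        raise
```

The exception is still propagated to the caller. The log entry just carries enough context to debug it later.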



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1103

curly bracket placement



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1104

create a followup for this one



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1098

This is not an error, just a condition that prevents usage of the index, so 
it should be logged as info rather than error.


- John


On 2011-04-22 03:50:54, Russell Melick wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/558/
 ---
 
 (Updated 2011-04-22 03:50:54)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 Review request for HIVE-1644.12.patch
 
 
 This addresses bug HIVE-1644.
 https://issues.apache.org/jira/browse/HIVE-1644
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2cdaeb6 
   conf/hive-default.xml 79ea477 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java ca337a8 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 69ee03b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java c02d90b 
   ql/src/java/org/apache/hadoop/hive/ql/index/AbstractIndexHandler.java 
 dd0186d 
   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 411b78f 
   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexQueryContext.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
 1f01446 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 50db44c 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 
 6162676 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/IndexWhereResolver.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java
  0ae9fa2 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcCtx.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 374e123 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c41bb32 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 73391e9 
   ql/src/test/queries/clientpositive/index_auto.q PRE-CREATION 
   ql/src/test/queries/clientpositive/index_auto_file_format.q PRE-CREATION 
   ql/src/test/queries/clientpositive/index_auto_multiple.q PRE-CREATION 
   ql/src/test/queries/clientpositive/index_auto_partitioned.q PRE-CREATION 
   ql/src/test/queries/clientpositive/index_auto_unused.q PRE-CREATION 
   ql/src/test/results/clientpositive/index_auto.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/index_auto_file_format.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/index_auto_multiple.q.out PRE-CREATION 
   

[jira] [Commented] (HIVE-1644) use filter pushdown for automatically accessing indexes

2011-04-22 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023362#comment-13023362
 ] 

John Sichi commented on HIVE-1644:
--

Looks good, I added a few minor comments and requests for followup creation.


 use filter pushdown for automatically accessing indexes
 ---

 Key: HIVE-1644
 URL: https://issues.apache.org/jira/browse/HIVE-1644
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: John Sichi
Assignee: Russell Melick
 Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch, 
 HIVE-1644.11.patch, HIVE-1644.12.patch, HIVE-1644.13.patch, 
 HIVE-1644.14.patch, HIVE-1644.15.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, 
 HIVE-1644.4.patch, HIVE-1644.5.patch, HIVE-1644.6.patch, HIVE-1644.7.patch, 
 HIVE-1644.8.patch, HIVE-1644.9.patch


 HIVE-1226 provides utilities for analyzing filters which have been pushed 
 down to a table scan.  The next step is to use these for selecting available 
 indexes and generating access plans for those indexes.



[jira] [Commented] (HIVE-1644) use filter pushdown for automatically accessing indexes

2011-04-22 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023365#comment-13023365
 ] 

jirapos...@reviews.apache.org commented on HIVE-1644:
-



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-22 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1803:
-

Status: Open  (was: Patch Available)

OK, I dug into it and found out that it was a problem with HADOOP_CLASSPATH 
preventing derby.jar getting loaded (so stats couldn't be written from Hadoop 
tasks, hence numRows=0).

The existing HADOOP_CLASSPATH was already incorrect, but the problem was only 
exposed by the addition of the javaewah-0.2.jar.  It was using commas for 
separators instead of colons (and it should not have been using file: at all!).

Here's the correct format with which I was able to pass a few failing tests I 
tried individually:

{noformat}
  <env key="HADOOP_CLASSPATH" value="${test.src.data.dir}/conf:${build.dir.hive}/dist/lib/derby.jar:${build.dir.hive}/dist/lib/javaewah-0.2.jar"/>
{noformat}
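
As an illustration of the separator problem, here is a small Python sketch that turns a comma-separated list with `file:` prefixes into a colon-separated classpath. The thread does not show the exact broken value, so the input in the usage note is hypothetical, as is the helper name `normalize_classpath`. It assumes prefixes of the simple `file:/path` form.

```python
def normalize_classpath(raw):
    """Convert comma-separated, possibly file:-prefixed entries to a classpath.

    Entries are split on commas, stripped of a leading "file:" scheme, and
    joined with colons, which is the separator HADOOP_CLASSPATH expects on
    Unix-like systems.
    """
    entries = []
    for item in raw.split(","):
        item = item.strip()
        if item.startswith("file:"):
            item = item[len("file:"):]
        if item:
            entries.append(item)
    return ":".join(entries)
```

For example, `normalize_classpath("file:/conf,file:/lib/derby.jar,/lib/javaewah-0.2.jar")` yields `/conf:/lib/derby.jar:/lib/javaewah-0.2.jar`.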

Can you give me another patch which fixes this and omits all .q.out updates for 
existing tests unless they need it?  Fingers crossed that will be the last one.




[jira] [Commented] (HIVE-2038) Metastore listener

2011-04-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023429#comment-13023429
 ] 

Ashutosh Chauhan commented on HIVE-2038:


Yes, that is accurate: a mechanism for the metastore client to send 
application-specific events to metastore listeners. I agree finalize may not be 
an appropriate name here, but I can't think of anything better. Any suggestions? :)



[jira] [Commented] (HIVE-2038) Metastore listener

2011-04-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023430#comment-13023430
 ] 

Ashutosh Chauhan commented on HIVE-2038:


Shall we call it takeActionOnPartition()? Too verbose?



[jira] [Commented] (HIVE-2038) Metastore listener

2011-04-22 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023439#comment-13023439
 ] 

Carl Steinbach commented on HIVE-2038:
--

If you do it this way I think you can only support registering a single 
listener. This follows from the fact that the meaning of a 
takeActionOnPartition() event is specific to a particular application, but the 
listener has no way of knowing which application fired the event. I don't think 
this is an acceptable limitation.

You can get around this by defining a ListenerEvent base class that third-party 
applications are allowed to extend. Applications can then fire this event from 
the client side, and listeners can register events that they are interested in 
listening for using an event type registry. Getting this to work is further 
complicated by the fact that you have to support serialization of the event 
objects over the Thrift interface.

I think it's appropriate to tackle this problem in a separate JIRA. I'd like to 
see some concrete use cases and discuss alternatives. I'm not convinced that 
the MetastoreClient/MetastoreListener should support the ability to fire 
arbitrary events.
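
The ListenerEvent idea described above could be sketched roughly as follows. This is only an illustration of an event type registry, not code from any of the attached patches: apart from the ListenerEvent name, every class and method here (PartitionFinalizedEvent, AuditEvent, EventListener, EventRegistry, register, fire) is hypothetical, and serialization over the Thrift interface is not addressed at all.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Base class that third-party applications would extend.
abstract class ListenerEvent {
}

// Hypothetical application-specific event.
class PartitionFinalizedEvent extends ListenerEvent {
    final String partitionName;

    PartitionFinalizedEvent(String partitionName) {
        this.partitionName = partitionName;
    }
}

// Hypothetical event belonging to a different application.
class AuditEvent extends ListenerEvent {
}

// Hypothetical listener interface, typed by the event it understands.
interface EventListener<E extends ListenerEvent> {
    void onEvent(E event);
}

// Hypothetical registry: listeners subscribe to the event types they care about,
// so several listeners can coexist without seeing each other's events.
class EventRegistry {
    private final Map<Class<?>, List<EventListener<?>>> listeners = new HashMap<>();

    <E extends ListenerEvent> void register(Class<E> type, EventListener<E> listener) {
        listeners.computeIfAbsent(type, k -> new ArrayList<>()).add(listener);
    }

    // Delivers the event to listeners registered for its exact class and
    // returns how many listeners were notified.
    @SuppressWarnings("unchecked")
    <E extends ListenerEvent> int fire(E event) {
        List<EventListener<?>> interested =
            listeners.getOrDefault(event.getClass(), new ArrayList<>());
        for (EventListener<?> listener : interested) {
            // Safe by construction: register() only stores listeners under their own event type.
            ((EventListener<E>) listener).onEvent(event);
        }
        return interested.size();
    }
}
```

In this sketch, matching is by exact event class, so a listener registered for a base type would not receive subclass events; a real design would have to decide that question as well.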

 Metastore listener
 --

 Key: HIVE-2038
 URL: https://issues.apache.org/jira/browse/HIVE-2038
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.8.0

 Attachments: hive-2038.patch, metastore_listener.patch, 
 metastore_listener.patch, metastore_listener.patch


 Provide a way to observe changes happening on Metastore



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-22 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Status: Patch Available  (was: Open)

 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, 
 HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, 
 HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, 
 JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, 
 javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-22 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Attachment: HIVE-1803.13.patch

New patch that updates HADOOP_CLASSPATH and doesn't change tests except adding 
new tests and show_functions.q. Fingers crossed for this one passing. I'm 
optimistic.

 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, 
 HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, 
 HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, 
 JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, 
 javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.



[jira] [Commented] (HIVE-2123) CommandNeedRetryException needs release locks

2011-04-22 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023474#comment-13023474
 ] 

Siying Dong commented on HIVE-2123:
---

@Carl?

 CommandNeedRetryException needs release locks
 -

 Key: HIVE-2123
 URL: https://issues.apache.org/jira/browse/HIVE-2123
 Project: Hive
  Issue Type: Bug
Reporter: Siying Dong
Assignee: Siying Dong
 Attachments: HIVE-2123.1.patch, HIVE-2123.2.patch


 Currently, when CommandNeedRetryException is thrown, locks are not released. It 
 is not clear whether this causes a problem, since the same locks will be acquired 
 again on retry, but it is something we need to fix anyway. We can also do some 
 small code cleanup to make future mistakes less likely.
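
One way to make the "locks are always released" property hard to break is to put the release in a finally block around each attempt. The sketch below is not taken from the attached patches: it uses a java.util.concurrent ReentrantLock as a stand-in for Hive's lock manager, and the CommandNeedRetryException class, Command interface, and RetryingRunner class defined here are hypothetical stand-ins written only for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for Hive's exception of the same name.
class CommandNeedRetryException extends Exception {
}

// Hypothetical unit of work that may ask to be retried.
interface Command {
    void execute() throws CommandNeedRetryException;
}

class RetryingRunner {
    // Stand-in for whatever locks the real command would acquire.
    private final ReentrantLock lock = new ReentrantLock();

    // Runs the command, retrying up to maxAttempts times, and returns the
    // number of attempts used. The lock is released after every attempt,
    // whether it succeeded, asked for a retry, or failed for good.
    int run(Command command, int maxAttempts) throws CommandNeedRetryException {
        for (int attempt = 1; ; attempt++) {
            lock.lock();
            try {
                command.execute();
                return attempt;
            } catch (CommandNeedRetryException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                // Otherwise fall through to finally, then loop and re-acquire.
            } finally {
                lock.unlock();
            }
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }
}
```

With this shape, a retry always starts from a state in which no lock is held, so the retry path and the first attempt acquire locks in the same way.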



[jira] [Commented] (HIVE-2123) CommandNeedRetryException needs release locks

2011-04-22 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023489#comment-13023489
 ] 

Carl Steinbach commented on HIVE-2123:
--

@Siying: Thanks for making the change.

+1 on this patch, but I'm -1 on the overall approach of using status return 
codes instead of exceptions. Hopefully we can replace the status codes with 
exceptions during some future cleanup effort.

 CommandNeedRetryException needs release locks
 -

 Key: HIVE-2123
 URL: https://issues.apache.org/jira/browse/HIVE-2123
 Project: Hive
  Issue Type: Bug
Reporter: Siying Dong
Assignee: Siying Dong
 Attachments: HIVE-2123.1.patch, HIVE-2123.2.patch


 Currently, when CommandNeedRetryException is thrown, locks are not released. It 
 is not clear whether this causes a problem, since the same locks will be acquired 
 again on retry, but it is something we need to fix anyway. We can also do some 
 small code cleanup to make future mistakes less likely.



[jira] [Created] (HIVE-2127) Improve stats gathering reliability by retries on failures

2011-04-22 Thread Ning Zhang (JIRA)
Improve stats gathering reliability by retries on failures
--

 Key: HIVE-2127
 URL: https://issues.apache.org/jira/browse/HIVE-2127
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang


Stats publishing and aggregation try only once; if there is any exception, they 
fail and return. If many mappers/reducers are updating stats at the same time, 
it is very common to get a lock timeout. We should make stats more reliable by 
retrying when there is an SQLException.
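
A minimal sketch of what such a retry could look like is given below. It is an assumption about the shape of the fix, not the eventual implementation: the StatsOperation interface, the StatsRetry class, the withRetries method, and the linearly growing wait between attempts are all hypothetical.

```java
import java.sql.SQLException;

// Hypothetical wrapper for a single stats publish or aggregate call.
interface StatsOperation<T> {
    T run() throws SQLException;
}

class StatsRetry {
    // Runs op, retrying up to maxRetries additional times when it throws
    // SQLException. Waits baseWaitMillis * (attempt number) between attempts
    // so that concurrent tasks are less likely to collide again.
    static <T> T withRetries(StatsOperation<T> op, int maxRetries, long baseWaitMillis)
            throws SQLException, InterruptedException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        SQLException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.run();
            } catch (SQLException e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(baseWaitMillis * (attempt + 1));
                }
            }
        }
        // All attempts failed: surface the most recent failure.
        throw last;
    }
}
```

Only SQLException triggers a retry here; any other exception propagates immediately, which keeps genuine programming errors from being masked by the retry loop.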



[jira] [Created] (HIVE-2129) Display indexing information for TableScanOperator in plan

2011-04-22 Thread Russell Melick (JIRA)
Display indexing information for TableScanOperator in plan
--

 Key: HIVE-2129
 URL: https://issues.apache.org/jira/browse/HIVE-2129
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick


Show the indexInputFormat and indexIntermediateFile in the plan to indicate 
that the table scan's input is being filtered by the intermediate file. However, 
we only want to do this when these values are non-null (could use the 
usesIndex() function), so that the expected output of the existing tests is unchanged.
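
The conditional display could look something like the sketch below. It does not use Hive's real plan classes: ScanDescription and planEntries are hypothetical, and the body of usesIndex() shown here (both values non-null) is an assumption about what that function checks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified description of a table scan in a plan.
class ScanDescription {
    private final String alias;
    private final String indexInputFormat;       // null when no index is used
    private final String indexIntermediateFile;  // null when no index is used

    ScanDescription(String alias, String indexInputFormat, String indexIntermediateFile) {
        this.alias = alias;
        this.indexInputFormat = indexInputFormat;
        this.indexIntermediateFile = indexIntermediateFile;
    }

    // Assumed meaning: the scan is index-filtered only if both values are set.
    boolean usesIndex() {
        return indexInputFormat != null && indexIntermediateFile != null;
    }

    // Entries to print in the plan. Index entries are added only when an index
    // is in use, so plans for non-indexed scans look exactly as before.
    Map<String, String> planEntries() {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("alias", alias);
        if (usesIndex()) {
            entries.put("indexInputFormat", indexInputFormat);
            entries.put("indexIntermediateFile", indexIntermediateFile);
        }
        return entries;
    }
}
```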



[jira] [Created] (HIVE-2130) Cost based choice for rewrite during Automatic Indexing

2011-04-22 Thread Russell Melick (JIRA)
Cost based choice for rewrite during Automatic Indexing
---

 Key: HIVE-2130
 URL: https://issues.apache.org/jira/browse/HIVE-2130
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick


After processing a predicate, there are potentially multiple index rewrites 
possible.  Currently, we just choose the first one.  However, there are 
probably heuristics for choosing certain rewrites over others, based on 
potential time savings. See IndexWhereProcessor for a good place to do this.
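
As a rough illustration of replacing "choose the first one" with a cost-based choice, the sketch below picks the candidate with the lowest estimated cost. Nothing here reflects the actual IndexWhereProcessor code: the IndexRewrite and RewriteChooser classes and the estimatedCost field are hypothetical, and how such a cost would be estimated is left open.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical candidate rewrite with an estimated cost (lower is better).
class IndexRewrite {
    final String indexName;
    final double estimatedCost;

    IndexRewrite(String indexName, double estimatedCost) {
        this.indexName = indexName;
        this.estimatedCost = estimatedCost;
    }
}

class RewriteChooser {
    // Returns the candidate with the smallest estimated cost, or an empty
    // Optional when there are no candidates (meaning: do not rewrite).
    static Optional<IndexRewrite> choose(List<IndexRewrite> candidates) {
        return candidates.stream()
            .min(Comparator.comparingDouble((IndexRewrite r) -> r.estimatedCost));
    }
}
```

Returning an Optional keeps the "no usable index" case explicit for the caller instead of encoding it as null.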
