[jira] [Assigned] (HIVE-2125) alter table concatenate fails and deletes data
[ https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma reassigned HIVE-2125:
---
Assignee: He Yongqiang

alter table concatenate fails and deletes data
--
Key: HIVE-2125
URL: https://issues.apache.org/jira/browse/HIVE-2125
Project: Hive
Issue Type: Bug
Reporter: Joydeep Sen Sarma
Assignee: He Yongqiang
Priority: Critical

The number of reducers is not set by this command (unlike other Hive queries). Since mapred.reduce.tasks=-1 (to let Hive infer this automatically), the JobTracker fails the job (the number of reducers cannot be negative).

hive> alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
Starting Job = job_201103101203_453180, Tracking URL = http://curium.data.facebook.com:50030/jobdetails.jsp?jobid=job_201103101203_453180
Kill Command = /mnt/vol/hive/sites/curium/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=curium.data.facebook.com:50029 -kill job_201103101203_453180
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2011-04-22 10:21:24,046 null map = 100%, reduce = 100%
Ended Job = job_201103101203_453180 with errors
Moved to trash: /user/facebook/warehouse/ad_imps_2/_backup.ds=2009-06-16

After the job fails, the partition is deleted. Thankfully, it's still in the trash.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2125) alter table concatenate fails and deletes data
alter table concatenate fails and deletes data
--
Key: HIVE-2125
URL: https://issues.apache.org/jira/browse/HIVE-2125
Project: Hive
Issue Type: Bug
Reporter: Joydeep Sen Sarma
Priority: Critical

The number of reducers is not set by this command (unlike other Hive queries). Since mapred.reduce.tasks=-1 (to let Hive infer this automatically), the JobTracker fails the job (the number of reducers cannot be negative).

hive> alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
Starting Job = job_201103101203_453180, Tracking URL = http://curium.data.facebook.com:50030/jobdetails.jsp?jobid=job_201103101203_453180
Kill Command = /mnt/vol/hive/sites/curium/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=curium.data.facebook.com:50029 -kill job_201103101203_453180
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2011-04-22 10:21:24,046 null map = 100%, reduce = 100%
Ended Job = job_201103101203_453180 with errors
Moved to trash: /user/facebook/warehouse/ad_imps_2/_backup.ds=2009-06-16

After the job fails, the partition is deleted. Thankfully, it's still in the trash.
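A minimal sketch of the guard this report implies, not Hive's actual fix: when mapred.reduce.tasks is -1, the job has to resolve a concrete, non-negative reducer count before it is submitted. The names `ReducerCount` and `resolve` are hypothetical, and the idea that the command supplies its own inferred count (for example 0 for a map-only merge) is an assumption, not something stated in the report.

```java
// Hypothetical helper for illustration; not part of Hive or Hadoop.
final class ReducerCount {
    /** Sentinel meaning "let Hive infer the number of reducers". */
    static final int INFER = -1;

    /**
     * Returns the reducer count to submit with the job.
     *
     * @param configured value of mapred.reduce.tasks as read from the configuration
     * @param inferred   count the command works out for itself (assumed input)
     */
    static int resolve(int configured, int inferred) {
        if (configured >= 0) {
            return configured; // an explicit setting wins
        }
        if (configured != INFER) {
            throw new IllegalArgumentException("invalid reducer count: " + configured);
        }
        if (inferred < 0) {
            throw new IllegalArgumentException("inferred reducer count must be >= 0");
        }
        return inferred; // the sentinel -1 is never passed on to the JobTracker
    }
}
```

With this guard, `ReducerCount.resolve(-1, 0)` yields 0 and `ReducerCount.resolve(4, 0)` yields 4, so the sentinel never reaches job submission.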
[jira] [Commented] (HIVE-2038) Metastore listener
[ https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023320#comment-13023320 ]

Carl Steinbach commented on HIVE-2038:
--
I think the meaning of finalizing a partition is actually defined by the metastore client, since it's the client that has to call finalizePartition(). But when is the client supposed to call this? What happens if you have multiple listeners registered, each with a different idea of what it means to finalize a partition?

I think the main problem with this is that the name of the method gives the impression that this is somehow well defined, when in fact the definition is left completely up to the application. It sounds like what you actually want is a mechanism that allows the metastore client to send application-specific events to metastore listeners. Is this accurate?

Metastore listener
--
Key: HIVE-2038
URL: https://issues.apache.org/jira/browse/HIVE-2038
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.8.0
Attachments: hive-2038.patch, metastore_listener.patch, metastore_listener.patch, metastore_listener.patch

Provide a way to observe changes happening on the Metastore.
[jira] [Created] (HIVE-2126) Hive's symlink text input format should be able to work with CombineHiveInputFormat
Hive's symlink text input format should be able to work with CombineHiveInputFormat
--
Key: HIVE-2126
URL: https://issues.apache.org/jira/browse/HIVE-2126
Project: Hive
Issue Type: Improvement
Reporter: He Yongqiang
Assignee: He Yongqiang

At compile time, if a partition's file format is SymlinkTextInputFormat, Hive will replace the symlink path with the paths listed in the symlink file. This way, it will work with Hive's CombineHiveInputFormat.

The reasons for doing this at compile time are:

1) At run time, the input path is used not only to get the record reader but also for Hive to get the aliases and thus the operator tree. But CombineHiveInputFormat can have multiple paths for each split, and when switching paths it also sets the new input file name on the job. So it always requires a real input path name; it cannot be faked.

2) Writing a new input format would require duplicating a lot of the work already done in the existing CombineHiveInputFormat.
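The compile-time substitution described above can be sketched roughly as follows. This is an illustration only, not the patch: `SymlinkTargets` and `targetsOf` are hypothetical names, the symlink file is assumed to list one target path per line, and reading the file from the file system is left out so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Hive class.
final class SymlinkTargets {
    /**
     * Given the lines of one symlink file, returns the real input paths
     * that should replace the symlink path in the query plan.
     * Blank lines are skipped and surrounding whitespace is trimmed.
     */
    static List<String> targetsOf(List<String> symlinkFileLines) {
        List<String> targets = new ArrayList<>();
        for (String line : symlinkFileLines) {
            String target = line.trim();
            if (!target.isEmpty()) {
                targets.add(target);
            }
        }
        return targets;
    }
}
```

Because the plan then holds only real paths, a combining input format never has to map a split back to a symlink, which is the constraint described in point 1.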
Build failed in Jenkins: Hive-0.7.0-h0.20 #83
See https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/83/

--
[...truncated 26989 lines...]
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' INTO TABLE src
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.src
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' INTO TABLE src
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt' INTO TABLE src1
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table default.src1
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv3.txt' INTO TABLE src1
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq' INTO TABLE src_sequencefile
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table default.src_sequencefile
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.seq' INTO TABLE src_sequencefile
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq' INTO TABLE src_thrift
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq
[junit] Loading data to table default.src_thrift
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/complex.seq' INTO TABLE src_thrift
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt' INTO TABLE src_json
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt
[junit] Loading data to table default.src_json
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/json.txt' INTO TABLE src_json
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/test/logs/negative/wrong_distinct1.q.out https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ql/src/test/results/compiler/errors/wrong_distinct1.q.out
[junit] Hive history file=https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/tmp/hive_job_log_hudson_201104221207_784944570.txt
[junit] Done query: wrong_distinct1.q
[junit] Begin query: wrong_distinct2.q
[junit] Hive history file=https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ql/tmp/hive_job_log_hudson_201104221207_731004933.txt
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='11')
[junit] PREHOOK: type: LOAD
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Copying file: https://builds.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH
Build failed in Jenkins: Hive-trunk-h0.20 #686
See https://builds.apache.org/hudson/job/Hive-trunk-h0.20/686/

--
[...truncated 30060 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-22_12-28-43_332_4721842587008106825/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=<number>
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=<number>
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=<number>
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0
[junit] 2011-04-22 12:28:46,419 null map = 100%, reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-04-22_12-28-43_332_4721842587008106825/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104221228_1306966870.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-22_12-28-47_967_3588446909744167948/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-04-22_12-28-47_967_3588446909744167948/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104221228_594334426.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023355#comment-13023355 ]

John Sichi commented on HIVE-1803:
--
Meh, I'm still getting numRows failures myself. I noticed that your patch includes some changes to existing test outputs (e.g. bucketmapjoin1.q.out) where it is setting the expected numRows to 0; you should have reverted those before generating the patch. But the failure I got was in another existing test (filter_join_breaktask).

I'm trying again after reverting the ones you changed (in case the failure I saw was a side effect), but I'm pessimistic; I'm wondering if something innocuous about the change is somehow exposing some existing non-determinism.

Implement bitmap indexing in Hive
-
Key: HIVE-1803
URL: https://issues.apache.org/jira/browse/HIVE-1803
Project: Hive
Issue Type: New Feature
Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch

Implement bitmap index handler to complement compact indexing.
Re: Review Request: HIVE-1644 Use filter pushdown for automatically accessing indexes
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/558/#review530
---

ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
https://reviews.apache.org/r/558/#comment1106

Create a followup task for dealing with jobs which access multiple tables. For that, we need to associate the index formats/files with specific tables, and that requires modifying the way the index input format works.

ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
https://reviews.apache.org/r/558/#comment1105

Create a followup task for displaying these in the plan (to indicate that a table scan's input is being filtered by the intermediate file). We only want to do that when they are non-null (to avoid upsetting all the existing test reference files).

ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java
https://reviews.apache.org/r/558/#comment1099

spacing

ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java
https://reviews.apache.org/r/558/#comment1100

spacing

ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1102

spacing

ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1101

When logging errors being propagated, use the two-arg version of the method and pass e as the second arg. Same thing in a few other places.

ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1103

curly bracket placement

ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1104

create a followup for this one

ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
https://reviews.apache.org/r/558/#comment1098

This is not an error, just a condition that prevents usage of the index, so it should be logged as info rather than error.

- John

On 2011-04-22 03:50:54, Russell Melick wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/558/
---

(Updated 2011-04-22 03:50:54)

Review request for hive.

Summary
---
Review request for HIVE-1644.12.patch

This addresses bug HIVE-1644.
https://issues.apache.org/jira/browse/HIVE-1644

Diffs
-
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2cdaeb6
  conf/hive-default.xml 79ea477
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java ca337a8
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 69ee03b
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java c02d90b
  ql/src/java/org/apache/hadoop/hive/ql/index/AbstractIndexHandler.java dd0186d
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 411b78f
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexQueryContext.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 1f01446
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 50db44c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 6162676
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/IndexWhereResolver.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java 0ae9fa2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcCtx.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 374e123
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c41bb32
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 73391e9
  ql/src/test/queries/clientpositive/index_auto.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_auto_file_format.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_auto_multiple.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_auto_partitioned.q PRE-CREATION
  ql/src/test/queries/clientpositive/index_auto_unused.q PRE-CREATION
  ql/src/test/results/clientpositive/index_auto.q.out PRE-CREATION
  ql/src/test/results/clientpositive/index_auto_file_format.q.out PRE-CREATION
  ql/src/test/results/clientpositive/index_auto_multiple.q.out PRE-CREATION
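The two logging comments in the review above (pass the exception as a second argument; log a non-error condition at info level) can be illustrated with a small sketch. It uses java.util.logging as a stand-in so that it is self-contained; the reviewed code may use a different logging API, and the class name, method names and messages here are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical class for illustration; not the reviewed IndexWhereProcessor.
final class IndexLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(IndexLoggingSketch.class.getName());

    /** Logs a propagated failure and keeps the stack trace by passing the throwable. */
    static void logFailure(Exception e) {
        // Message plus throwable: the exception is attached to the log record
        // instead of being flattened into the message string.
        LOG.log(Level.SEVERE, "Failed to rewrite query to use index", e);
    }

    /** Logs a condition that merely prevents index usage; this is not an error. */
    static void logIndexNotUsed(String reason) {
        LOG.info("Index not used: " + reason);
    }
}
```

Keeping the throwable separate from the message lets the logging backend print the full stack trace, which is lost when only `e.getMessage()` is concatenated into the text.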
[jira] [Commented] (HIVE-1644) use filter pushdown for automatically accessing indexes
[ https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023362#comment-13023362 ]

John Sichi commented on HIVE-1644:
--
Looks good, I added a few minor comments and requests for followup creation.

use filter pushdown for automatically accessing indexes
---
Key: HIVE-1644
URL: https://issues.apache.org/jira/browse/HIVE-1644
Project: Hive
Issue Type: Improvement
Components: Indexing
Affects Versions: 0.8.0
Reporter: John Sichi
Assignee: Russell Melick
Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch, HIVE-1644.11.patch, HIVE-1644.12.patch, HIVE-1644.13.patch, HIVE-1644.14.patch, HIVE-1644.15.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, HIVE-1644.4.patch, HIVE-1644.5.patch, HIVE-1644.6.patch, HIVE-1644.7.patch, HIVE-1644.8.patch, HIVE-1644.9.patch

HIVE-1226 provides utilities for analyzing filters which have been pushed down to a table scan. The next step is to use these for selecting available indexes and generating access plans for those indexes.
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1803:
-
Status: Open (was: Patch Available)

OK, I dug into it and found out that it was a problem with HADOOP_CLASSPATH preventing derby.jar from getting loaded (so stats couldn't be written from Hadoop tasks, hence numRows=0).

The existing HADOOP_CLASSPATH was already incorrect, but the problem was only exposed by the addition of the javaewah-0.2.jar. It was using commas for separators instead of colons (and it should not have been using file: at all!).

Here's the correct format, with which I was able to pass a few failing tests I tried individually:

{noformat}
<env key="HADOOP_CLASSPATH" value="${test.src.data.dir}/conf:${build.dir.hive}/dist/lib/derby.jar:${build.dir.hive}/dist/lib/javaewah-0.2.jar"/>
{noformat}

Can you give me another patch which fixes this and omits all .q.out updates for existing tests unless they need it? Fingers crossed that will be the last one.
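To make the separator problem concrete, here is a small sketch that turns a comma-separated list of file: entries into the colon-separated form described in the comment. It is only an illustration of the format difference: `ClasspathFormat` and `normalize` are hypothetical names, the example paths are made up, and the actual fix was a change to the build file, not code like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Hive build.
final class ClasspathFormat {
    /**
     * Converts "file:/a,file:/b" style input into "/a:/b".
     * Entries are split on commas, a leading "file:" scheme is dropped,
     * and a run of leading slashes is collapsed to a single slash.
     */
    static String normalize(String commaSeparated) {
        List<String> entries = new ArrayList<>();
        for (String raw : commaSeparated.split(",")) {
            String entry = raw.trim();
            if (entry.startsWith("file:")) {
                entry = entry.substring("file:".length()).replaceFirst("^/+", "/");
            }
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return String.join(":", entries);
    }
}
```

For example, `ClasspathFormat.normalize("file:///build/conf,file:/build/lib/derby.jar")` returns `/build/conf:/build/lib/derby.jar`.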
[jira] [Commented] (HIVE-2038) Metastore listener
[ https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023429#comment-13023429 ]

Ashutosh Chauhan commented on HIVE-2038:

Yes, that's accurate: a mechanism for the metastore client to send application-specific events to metastore listeners. I agree that "finalize" may not be an appropriate name here, but I can't think of anything better. Any suggestions? :)
[jira] [Commented] (HIVE-2038) Metastore listener
[ https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023430#comment-13023430 ]

Ashutosh Chauhan commented on HIVE-2038:

Shall we call it takeActionOnPartition()? Too verbose?
[jira] [Commented] (HIVE-2038) Metastore listener
[ https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023439#comment-13023439 ]

Carl Steinbach commented on HIVE-2038:
--
If you do it this way I think you can only support registering a single listener. This follows from the fact that the meaning of a takeActionOnPartition() event is specific to a particular application, but the listener has no way of knowing which application fired the event. I don't think this is an acceptable limitation.

You can get around this by defining a ListenerEvent base class that third-party applications are allowed to extend. Applications can then fire this event from the client side, and listeners can register events that they are interested in listening for using an event type registry. Getting this to work is further complicated by the fact that you have to support serialization of the event objects over the Thrift interface.

I think it's appropriate to tackle this problem in a separate JIRA. I'd like to see some concrete use cases and discuss alternatives. I'm not convinced that the MetastoreClient/MetastoreListener should support the ability to fire arbitrary events.
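The alternative outlined in this comment (an extensible ListenerEvent base class plus an event type registry) might look roughly like the sketch below. Only the name ListenerEvent comes from the comment; `PartitionDoneEvent`, `AuditEvent`, `TypedListener` and `EventTypeRegistry` are hypothetical, dispatch is by exact event class, and the Thrift serialization the comment mentions is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Base class that applications would extend with their own event types. */
abstract class ListenerEvent {
}

/** Hypothetical application-specific event. */
final class PartitionDoneEvent extends ListenerEvent {
    final String partitionName;
    PartitionDoneEvent(String partitionName) { this.partitionName = partitionName; }
}

/** A second hypothetical event type, owned by a different application. */
final class AuditEvent extends ListenerEvent {
    final String user;
    AuditEvent(String user) { this.user = user; }
}

/** A listener that only receives the event type it registered for. */
interface TypedListener<E extends ListenerEvent> {
    void onEvent(E event);
}

/** Maps each event class to the listeners interested in it. */
final class EventTypeRegistry {
    private final Map<Class<? extends ListenerEvent>, List<TypedListener<?>>> listeners =
        new HashMap<>();

    <E extends ListenerEvent> void register(Class<E> type, TypedListener<E> listener) {
        listeners.computeIfAbsent(type, k -> new ArrayList<>()).add(listener);
    }

    @SuppressWarnings("unchecked")
    <E extends ListenerEvent> void fire(E event) {
        // Dispatch on the exact runtime class of the event.
        List<TypedListener<?>> registered = listeners.get(event.getClass());
        if (registered == null) {
            return; // nobody registered for this event type
        }
        for (TypedListener<?> listener : registered) {
            // Safe: register() only stores listeners under their own event class.
            ((TypedListener<E>) listener).onEvent(event);
        }
    }
}
```

Because each listener is keyed by event class, several applications can register listeners side by side without one receiving another application's events, which is the limitation raised against a single takeActionOnPartition() hook.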
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marquis Wang updated HIVE-1803: --- Status: Patch Available (was: Open) Implement bitmap indexing in Hive - Key: HIVE-1803 URL: https://issues.apache.org/jira/browse/HIVE-1803 Project: Hive Issue Type: New Feature Components: Indexing Reporter: Marquis Wang Assignee: Marquis Wang Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch Implement bitmap index handler to complement compact indexing.
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marquis Wang updated HIVE-1803: --- Attachment: HIVE-1803.13.patch New patch that updates HADOOP_CLASSPATH and doesn't change tests except adding new tests and show_functions.q. Fingers crossed for this one passing. I'm optimistic. Implement bitmap indexing in Hive - Key: HIVE-1803 URL: https://issues.apache.org/jira/browse/HIVE-1803 Project: Hive Issue Type: New Feature Components: Indexing Reporter: Marquis Wang Assignee: Marquis Wang Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.12.patch, HIVE-1803.13.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.2.patch, unit-tests.3.patch, unit-tests.patch Implement bitmap index handler to complement compact indexing.
[jira] [Commented] (HIVE-2123) CommandNeedRetryException needs release locks
[ https://issues.apache.org/jira/browse/HIVE-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13023474#comment-13023474 ] Siying Dong commented on HIVE-2123: --- @Carl? CommandNeedRetryException needs release locks - Key: HIVE-2123 URL: https://issues.apache.org/jira/browse/HIVE-2123 Project: Hive Issue Type: Bug Reporter: Siying Dong Assignee: Siying Dong Attachments: HIVE-2123.1.patch, HIVE-2123.2.patch Now, when CommandNeedRetryException is thrown, locks are not released. Not sure whether it will cause a problem, since the same locks will be acquired when retrying. It is something we need to fix anyway. We can also do a little code cleanup to make future mistakes less likely.
[jira] [Commented] (HIVE-2123) CommandNeedRetryException needs release locks
[ https://issues.apache.org/jira/browse/HIVE-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13023489#comment-13023489 ] Carl Steinbach commented on HIVE-2123: -- @Siying: Thanks for making the change. +1 on this patch, but I'm -1 on the overall approach of using status return codes instead of exceptions. Hopefully we can replace the status codes with exceptions during some future cleanup effort. CommandNeedRetryException needs release locks - Key: HIVE-2123 URL: https://issues.apache.org/jira/browse/HIVE-2123 Project: Hive Issue Type: Bug Reporter: Siying Dong Assignee: Siying Dong Attachments: HIVE-2123.1.patch, HIVE-2123.2.patch Now, when CommandNeedRetryException is thrown, locks are not released. Not sure whether it will cause a problem, since the same locks will be acquired when retrying. It is something we need to fix anyway. We can also do a little code cleanup to make future mistakes less likely.
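The fix discussed in this issue amounts to releasing locks on every exit path, including the retry path. A minimal sketch of that pattern follows, assuming stand-in types: RetryLockSketch, LockManager and Command are hypothetical, and the CommandNeedRetryException declared here is a local placeholder for the one named in the issue, not the Hive class or the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the names mirror the discussion, not Hive's actual Driver code.
public class RetryLockSketch {

    /** Local stand-in for the exception named in the issue. */
    public static class CommandNeedRetryException extends Exception {}

    /** Stand-in lock manager that only records which locks are held. */
    public static class LockManager {
        private final List<String> held = new ArrayList<>();
        public void acquire(String name) { held.add(name); }
        public void releaseAll() { held.clear(); }
        public int heldCount() { return held.size(); }
    }

    /** A command that may ask to be retried. */
    public interface Command {
        void execute(int attempt) throws CommandNeedRetryException;
    }

    /**
     * Runs the command, retrying up to maxRetries times.
     * Returns the number of retries that were needed.
     */
    public static int run(Command cmd, LockManager locks, int maxRetries)
            throws CommandNeedRetryException {
        for (int attempt = 0; ; attempt++) {
            locks.acquire("table-lock");
            try {
                cmd.execute(attempt);
                return attempt;
            } catch (CommandNeedRetryException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up; the finally block still releases the locks
                }
            } finally {
                locks.releaseAll(); // runs on success, on retry, and on failure
            }
        }
    }
}
```

Signalling the retry with an exception and cleaning up in finally also matches the preference for exceptions over status codes voiced in the comment above.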
[jira] [Created] (HIVE-2127) Improve stats gathering reliability by retries on failures
Improve stats gathering reliability by retries on failures -- Key: HIVE-2127 URL: https://issues.apache.org/jira/browse/HIVE-2127 Project: Hive Issue Type: Improvement Reporter: Ning Zhang Assignee: Ning Zhang Stats publishing and aggregation are attempted only once; if any exception occurs, they fail and return. If many mappers/reducers are updating stats at the same time, lock timeouts are very common. We should make stats gathering more reliable by retrying when there is an SQLException.
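The retry behaviour proposed here could be sketched as below. This is only an illustration under assumptions: StatsRetrySketch, SqlAction and withRetries are hypothetical names, and the linear wait between attempts is an added choice that the issue does not specify.

```java
import java.sql.SQLException;

// Hypothetical sketch of retry-on-SQLException; not the code from the Hive patch.
public class StatsRetrySketch {

    /** A unit of JDBC work, for example publishing one stats row. */
    public interface SqlAction<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the action and retries when it throws SQLException.
     * At most maxRetries retries are made after the first attempt.
     * The wait grows linearly (an assumption, not from the issue).
     */
    public static <T> T withRetries(SqlAction<T> action, int maxRetries, long baseWaitMs)
            throws SQLException, InterruptedException {
        for (int failures = 0; ; failures++) {
            try {
                return action.run();
            } catch (SQLException e) {
                if (failures >= maxRetries) {
                    throw e; // out of retries: surface the last error
                }
                Thread.sleep(baseWaitMs * (failures + 1));
            }
        }
    }
}
```

Spacing the attempts out is meant to reduce the chance that many mappers/reducers hit the same lock again at the same moment.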
[jira] [Created] (HIVE-2129) Display indexing information for TableScanOperator in plan
Display indexing information for TableScanOperator in plan -- Key: HIVE-2129 URL: https://issues.apache.org/jira/browse/HIVE-2129 Project: Hive Issue Type: Improvement Components: Indexing Affects Versions: 0.8.0 Reporter: Russell Melick Show the indexInputFormat and indexIntermediateFile in the plan, to indicate that the table scan's input is being filtered by the intermediate file. But, we only want to do this when these values are non-null (could use the usesIndex() function), so that all the old tests aren't messed up.
[jira] [Created] (HIVE-2130) Cost based choice for rewrite during Automatic Indexing
Cost based choice for rewrite during Automatic Indexing --- Key: HIVE-2130 URL: https://issues.apache.org/jira/browse/HIVE-2130 Project: Hive Issue Type: Improvement Components: Indexing Affects Versions: 0.8.0 Reporter: Russell Melick After processing a predicate, there are potentially multiple index rewrites possible. Currently, we just choose the first one. However, there are probably heuristics for choosing certain rewrites over others, based on potential time savings. See IndexWhereProcessor for a good place to do this.
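The difference between the current behaviour (take the first rewrite) and a cost-based choice can be sketched as follows. All names are hypothetical: RewriteChoiceSketch and Candidate are not part of IndexWhereProcessor, and the estimatedCost field stands in for whatever heuristic is eventually chosen.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not taken from IndexWhereProcessor.
public class RewriteChoiceSketch {

    /** One possible index rewrite with an estimated cost (lower is better). */
    public static final class Candidate {
        public final String indexName;
        public final double estimatedCost;
        public Candidate(String indexName, double estimatedCost) {
            this.indexName = indexName;
            this.estimatedCost = estimatedCost;
        }
    }

    /** Current behaviour as described in the issue: take the first candidate. */
    public static Optional<Candidate> chooseFirst(List<Candidate> candidates) {
        return candidates.stream().findFirst();
    }

    /** Cost-based alternative: take the candidate with the lowest estimated cost. */
    public static Optional<Candidate> chooseCheapest(List<Candidate> candidates) {
        return candidates.stream()
                .min(Comparator.comparingDouble((Candidate c) -> c.estimatedCost));
    }
}
```

Both methods return an empty Optional when no rewrite applies, so the caller can fall back to the unmodified plan.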