[jira] [Resolved] (HIVE-9589) config topology.acker.executors default value should not be null
[ https://issues.apache.org/jira/browse/HIVE-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun resolved HIVE-9589.
Resolution: Won't Fix

config topology.acker.executors default value should not be null

Key: HIVE-9589
URL: https://issues.apache.org/jira/browse/HIVE-9589
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: executors num is wrong.png, is null.png

See Code: https://github.com/caofangkun/apache-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/common.clj#L315

The default value of topology.acker.executors is null, in which case the number of acker executors is the same as topology.workers.

$ storm jar storm-starter-0.10.0-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology

Executors show up as 18:
executors = 10 (word) + 3 (exclaim1) + 2 (exclaim2) + 3 (acker bolt)
But the 3 acker bolt executors will not be used.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9589) config topology.acker.executors default value should not be null
[ https://issues.apache.org/jira/browse/HIVE-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-9589:
Attachment: executors num is wrong.png, is null.png

config topology.acker.executors default value should not be null

Key: HIVE-9589
URL: https://issues.apache.org/jira/browse/HIVE-9589
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: executors num is wrong.png, is null.png

See Code: https://github.com/caofangkun/apache-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/common.clj#L315

The default value of topology.acker.executors is null, in which case the number of acker executors is the same as topology.workers.

$ storm jar storm-starter-0.10.0-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology

Executors show up as 18:
executors = 10 (word) + 3 (exclaim1) + 2 (exclaim2) + 3 (acker bolt)
But the 3 acker bolt executors will not be used.
[jira] [Created] (HIVE-9589) config topology.acker.executors default value should not be null
caofangkun created HIVE-9589:

Summary: config topology.acker.executors default value should not be null
Key: HIVE-9589
URL: https://issues.apache.org/jira/browse/HIVE-9589
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor

See Code: https://github.com/caofangkun/apache-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/common.clj#L315

The default value of topology.acker.executors is null, in which case the number of acker executors is the same as topology.workers.

$ storm jar storm-starter-0.10.0-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology

Executors show up as 18:
executors = 10 (word) + 3 (exclaim1) + 2 (exclaim2) + 3 (acker bolt)
But the 3 acker bolt executors will not be used.
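The arithmetic in the report above can be sketched in a few lines of Python. This is only an illustration of the fallback the reporter describes (a null topology.acker.executors resolving to topology.workers), not Storm's actual code; the function names are hypothetical, and topology.workers = 3 is an assumption inferred from the "3 (acker bolt)" term in the report.

```python
def resolve_acker_executors(conf):
    """Return the acker executor count, falling back to topology.workers
    when topology.acker.executors is unset (None), as the report describes."""
    ackers = conf.get("topology.acker.executors")
    if ackers is None:
        return conf["topology.workers"]
    return ackers


def total_executors(component_parallelism, conf):
    """Sum the per-component executors plus the resolved acker executors."""
    return sum(component_parallelism.values()) + resolve_acker_executors(conf)


# Parallelism figures taken from the report; topology.workers=3 is assumed.
parallelism = {"word": 10, "exclaim1": 3, "exclaim2": 2}
conf = {"topology.workers": 3, "topology.acker.executors": None}

print(total_executors(parallelism, conf))
```

Setting "topology.acker.executors" to an explicit number in the conf dictionary bypasses the fallback, which is the behaviour the issue title asks to make the default.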
[jira] [Created] (HIVE-4905) In strict mode, predicate pushdown does not work on partition columns with statements using left/right join
caofangkun created HIVE-4905:

Summary: In strict mode, predicate pushdown does not work on partition columns with statements using left/right join
Key: HIVE-4905
URL: https://issues.apache.org/jira/browse/HIVE-4905
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor

set hive.mapred.mode=strict;
drop table mpt3;
create table mpt3 (s1 string , s2 string) partitioned by (dt string, time string);
alter table mpt3 add partition (dt='1',time='2');
drop table mpt4;
create table mpt4 (s1 string , s2 string) partitioned by (dt string, time string);
alter table mpt4 add partition (dt='1',time='2');

Query One: works well
explain select * from mpt3 a join mpt4 b on (a.s1 = b.s1) where a.dt='1' and a.time='2' and b.dt='1';

Query Two: failed
hive (default)> explain select a.* from mpt3 a right outer join mpt4 b on (a.s1 = b.s1) where a.dt='1' and a.time='2' and b.dt='1';
FAILED: SemanticException [Error 10041]: No partition predicate found for Alias a Table mpt3

Query Three: failed
hive (default)> explain select a.* from mpt3 a left outer join mpt4 b on (a.s1 = b.s1) where a.dt='1' and a.time='2' and b.dt='1';
FAILED: SemanticException [Error 10041]: No partition predicate found for Alias b Table mpt4

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2827) Implement nullsafe equi-join
[ https://issues.apache.org/jira/browse/HIVE-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685091#comment-13685091 ]

caofangkun commented on HIVE-2827:

In file /hive/trunk/ql/src/test/queries/clientpositive/join_nullsafe.q there is a command that seems useless:
set hive.nullsafe.equijoin=true;

Implement nullsafe equi-join

Key: HIVE-2827
URL: https://issues.apache.org/jira/browse/HIVE-2827
Project: Hive
Issue Type: Improvement
Components: Query Processor
Environment: ubuntu 10.04
Reporter: Navis
Assignee: Navis
Priority: Minor
Fix For: 0.9.0
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.5.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.6.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2827.D1971.7.patch

This was part of HIVE-2810, but was separated because it affected more classes than expected.
{noformat}
SELECT * FROM a JOIN b ON a.key <=> b.key
{noformat}
[jira] [Commented] (HIVE-4742) A useless CAST makes Hive fail to create a VIEW based on an UNION
[ https://issues.apache.org/jira/browse/HIVE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685117#comment-13685117 ]

caofangkun commented on HIVE-4742:

Simplifying the query generates the following SemanticException:
hive (default)> select CAST(`d` AS STRING) from dual limit 1;
FAILED: SemanticException [Error 10004]: Line 1:12 Invalid table alias or column reference '`d`': (possible column names are: dummy)
I don't think this is a bug. We'd better follow the grammar rules.

A useless CAST makes Hive fail to create a VIEW based on an UNION

Key: HIVE-4742
URL: https://issues.apache.org/jira/browse/HIVE-4742
Project: Hive
Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Nicolas Lalevée

I programmatically create a script that creates a view which is a union of all the kinds of event I have. To keep things simple, data are just blindly cast as STRING.
It used to work with Hive 0.10, but no longer does with 0.11.
I tried to narrow it down to the simplest script. It seems that it only occurs if at least a view and a union are involved.

Here is a failing script:
{noformat}
CREATE TABLE Event1 (d STRING, userid BIGINT, eventData1 STRING);
CREATE TABLE Event2 (d STRING, userid BIGINT, eventData2 STRING);
CREATE VIEW AllEventsTest AS
SELECT * FROM (
SELECT 'Event1' AS eventType, map('d', CAST(`d` AS STRING)) AS eventData FROM Event1
UNION ALL
SELECT 'Event2' AS eventType, map('d', CAST(`d` AS STRING)) AS eventData FROM Event2
) d;
{noformat}

There are warnings in the logs:
{noformat}
o.a.h.h.q.parse.TypeCheckProcFactory - Invalid type entry TOK_STRING=null
o.a.h.h.q.parse.TypeCheckProcFactory - Invalid type entry TOK_STRING=null
{noformat}

And the error is:
{noformat}
FAILED: IllegalArgumentException replace op boundaries of ReplaceOp@[@46,103:105='`d`',26,3:52]..[@46,103:105='`d`',26,3:52]:`event1`.`d` overlap with previous ReplaceOp@[@44,98:101='CAST',48,3:47]..[@51,116:116=')',276,3:65]:`event1`.`d`
10:52:51.024 [scoopMapredScheduler_Worker-9 ] [ERROR] org.apache.hadoop.hive.ql.Driver - FAILED: IllegalArgumentException replace op boundaries of ReplaceOp@[@46,103:105='`d`',26,3:52]..[@46,103:105='`d`',26,3:52]:`event1`.`d` overlap with previous ReplaceOp@[@44,98:101='CAST',48,3:47]..[@51,116:116=')',276,3:65]:`event1`.`d`
java.lang.IllegalArgumentException: replace op boundaries of ReplaceOp@[@46,103:105='`d`',26,3:52]..[@46,103:105='`d`',26,3:52]:`event1`.`d` overlap with previous ReplaceOp@[@44,98:101='CAST',48,3:47]..[@51,116:116=')',276,3:65]:`event1`.`d`
    at org.antlr.runtime.TokenRewriteStream.reduceToSingleOperationPerIndex(TokenRewriteStream.java:504)
    at org.antlr.runtime.TokenRewriteStream.toString(TokenRewriteStream.java:374)
    at org.antlr.runtime.TokenRewriteStream.toString(TokenRewriteStream.java:358)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.saveViewDefinition(SemanticAnalyzer.java:8781)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8689)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
    at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:192)
{noformat}

Here is a working script:
{noformat}
CREATE TABLE Event1 (d STRING, userid BIGINT, eventData1 STRING);
CREATE TABLE Event2 (d STRING, userid BIGINT, eventData2 STRING);
CREATE VIEW AllEventsTest AS
SELECT * FROM (
SELECT 'Event1' AS eventType, map('d', `d`, 'userid', CAST(`userid` AS STRING)) AS eventData FROM Event1
UNION ALL
SELECT 'Event2' AS eventType, map('d', `d`, 'userid', CAST(`userid` AS STRING)) AS eventData FROM Event2
) d;
{noformat}
[jira] [Updated] (HIVE-4735) table's lifecycle should be controlled in Hive
[ https://issues.apache.org/jira/browse/HIVE-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4735:

Description:
Example as:
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='50d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');

was:
Example as:
-- create table src and set its lifecycle to 100 days
-- after 100 days the table will be dropped automatically
CREATE TABLE src (key string) LIFECYCLE 100d;
-- alter table src and reset its lifecycle to 50 days
ALTER TABLE src SET LIFECYCLE 50d;
-- alter a specific table partition's lifecycle
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET LIFECYCLE 50d;

table's lifecycle should be controlled in Hive

Key: HIVE-4735
URL: https://issues.apache.org/jira/browse/HIVE-4735
Project: Hive
Issue Type: Improvement
Reporter: caofangkun
Priority: Minor

Example as:
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='50d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');
[jira] [Updated] (HIVE-4735) table's lifecycle should be controlled in Hive
[ https://issues.apache.org/jira/browse/HIVE-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4735:

Description:
Example as:
-- create table src and set its lifecycle to 100 days
-- after 100 days the table will be dropped automatically
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='100d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');

was:
Example as:
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='50d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');

table's lifecycle should be controlled in Hive

Key: HIVE-4735
URL: https://issues.apache.org/jira/browse/HIVE-4735
Project: Hive
Issue Type: Improvement
Reporter: caofangkun
Priority: Minor

Example as:
-- create table src and set its lifecycle to 100 days
-- after 100 days the table will be dropped automatically
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='100d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');
[jira] [Commented] (HIVE-4735) table's lifecycle should be controlled in Hive
[ https://issues.apache.org/jira/browse/HIVE-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683147#comment-13683147 ]

caofangkun commented on HIVE-4735:

Hi [~cwsteinbach]
I think you are indeed right. I've changed the description.
Thanks

table's lifecycle should be controlled in Hive

Key: HIVE-4735
URL: https://issues.apache.org/jira/browse/HIVE-4735
Project: Hive
Issue Type: Improvement
Reporter: caofangkun
Priority: Minor

Example as:
-- create table src and set its lifecycle to 100 days
-- after 100 days the table will be dropped automatically
CREATE TABLE src (key string, value string) TBLPROPERTIES ('LIFECYCLE'='100d');
-- alter table src and reset its lifecycle to 50 days
-- after 50 days the table will be dropped automatically
ALTER TABLE src SET TBLPROPERTIES('LIFECYCLE'='50d');
-- alter a specific table partition's lifecycle
-- we have DBPROPERTIES and TBLPROPERTIES, but it seems we do not have 'PRTPROPERTIES'
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET PRTPROPERTIES('LIFECYCLE'='50d');
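To make the proposal concrete, here is a minimal Python sketch of how a 'LIFECYCLE' value such as '50d' could be interpreted. It is a hypothetical illustration, not Hive behaviour: the proposal only shows day-based values, so only the "Nd" form is accepted, and counting the lifecycle from the table's creation date is an assumption the issue does not spell out.

```python
import re
from datetime import date, timedelta


def parse_lifecycle(value):
    """Turn a value such as '50d' into a timedelta.
    Only the day unit shown in the proposal is supported."""
    match = re.fullmatch(r"(\d+)d", value.strip())
    if match is None:
        raise ValueError(f"unsupported LIFECYCLE value: {value!r}")
    return timedelta(days=int(match.group(1)))


def is_expired(created, lifecycle, today):
    """True once the lifecycle has fully elapsed since `created` (assumed start)."""
    return today >= created + parse_lifecycle(lifecycle)


# A table created on 2013-06-14 with LIFECYCLE=50d would become
# eligible for dropping on 2013-08-03 under these assumptions.
print(date(2013, 6, 14) + parse_lifecycle("50d"))
```

A background job could call is_expired for each table or partition and issue the drop; where that job would live in Hive is outside what the issue describes.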
[jira] [Created] (HIVE-4729) when subquery is a star selection with a group by ,the query will generate Confusing SemanticException
caofangkun created HIVE-4729:

Summary: when subquery is a star selection with a group by ,the query will generate Confusing SemanticException
Key: HIVE-4729
URL: https://issues.apache.org/jira/browse/HIVE-4729
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.8.1, 0.12.0
Environment: hive-trunk hadoop-0.20.1
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor

hive (default)> desc src;
OK
key string None
value string None
Time taken: 2.887 seconds, Fetched: 2 row(s)

hive (default)> desc src_1;
OK
key string None
value string None
Time taken: 0.146 seconds, Fetched: 2 row(s)

SubQuery with 'group by' will generate semantic exception:
explain select a.key as a_key, b.key as b_key from (select * from src group by key, value ) a left outer join src_1 b on a.key = b.key;
FAILED: Error in semantic analysis: Line 8:6 Invalid column reference 'key'

However the following query will work well:
explain select a.key as a_key, b.key as b_key from (select key,value from src group by key, value ) a left outer join src_1 b on a.key = b.key;

In debug mode:
13/06/13 16:09:04 WARN parse.TypeCheckProcFactory: Invalid type entry TOK_TABLE_OR_COL=null
FAILED: SemanticException [Error 10002]: Line 8:5 Invalid column reference 'key'
13/06/13 16:09:04 ERROR ql.Driver: FAILED: SemanticException [Error 10002]: Line 8:5 Invalid column reference 'key'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 8:5 Invalid column reference 'key'
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:9002)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:8950)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:8921)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinReduceSinkChild(SemanticAnalyzer.java:5982)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinOperator(SemanticAnalyzer.java:6079)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinPlan(SemanticAnalyzer.java:6240)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8057)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8720)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:277)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:277)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:122)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
13/06/13 16:09:04 INFO ql.Driver: </PERFLOG method=compile start=1371110941624 end=1371110944658 duration=3034>
[jira] [Created] (HIVE-4735) table's lifecycle should be controlled in Hive
caofangkun created HIVE-4735:

Summary: table's lifecycle should be controlled in Hive
Key: HIVE-4735
URL: https://issues.apache.org/jira/browse/HIVE-4735
Project: Hive
Issue Type: Improvement
Reporter: caofangkun
Priority: Minor

Example as:
-- create table src and set its lifecycle to 100 days
-- after 100 days the table will be dropped automatically
CREATE TABLE src (key string) LIFECYCLE 100d;
-- alter table src and reset its lifecycle to 50 days
ALTER TABLE src SET LIFECYCLE 50d;
-- alter a specific table partition's lifecycle
ALTER TABLE srcpart PARTITION (dt='20130614',hr='10') SET LIFECYCLE 50d;
[jira] [Updated] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4064:
Attachment: HIVE-4064-1.patch

https://reviews.apache.org/r/11755/
In HiveLexer.g, Identifier should allow '.' so that it can match a full (db-qualified) table name.
It works now for 'alter table' statements.

Handle db qualified names consistently across all HiveQL statements

Key: HIVE-4064
URL: https://issues.apache.org/jira/browse/HIVE-4064
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: caofangkun
Attachments: HIVE-4064-1.patch

Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, others such as CREATE INDEX don't.
[jira] [Commented] (HIVE-3589) describe/show partition/show tblproperties command should accept database name
[ https://issues.apache.org/jira/browse/HIVE-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678935#comment-13678935 ]

caofangkun commented on HIVE-3589:

Hi [~navis]
The desc statements have been fixed in trunk, but the 'show partition' and 'show tblproperties' statements have not been fixed yet.
I find that simply modifying the HiveParser.g file fixes this issue.
Please have a look at this rb: https://reviews.apache.org/r/11753/
Thank you

describe/show partition/show tblproperties command should accept database name

Key: HIVE-3589
URL: https://issues.apache.org/jira/browse/HIVE-3589
Project: Hive
Issue Type: Bug
Components: Metastore, Query Processor
Affects Versions: 0.8.1
Reporter: Sujesh Chirackkal
Assignee: Navis
Priority: Minor
Attachments: HIVE-3589.D6075.1.patch, HIVE-3589.D6075.2.patch

The describe command does not give the details when called as describe dbname.tablename. It throws the error: Table dbname not found.
Ex: hive -e "describe masterdb.table1" will throw the error: Table masterdb not found
[jira] [Work started] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-4346 started by caofangkun.

when writing data into filesystem from queries ,the output files could contain a line of column names

Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch

For example:
hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src ;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata'
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
[jira] [Updated] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4346:
Attachment: HIVE-4346-4.patch

https://reviews.apache.org/r/10474/
Could anybody review this patch?

when writing data into filesystem from queries ,the output files could contain a line of column names

Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch, HIVE-4346-4.patch

For example:
hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src ;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata'
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
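As an illustration of why a separate _metadata file is useful to consumers, here is a small Python sketch that pairs the column names with the data rows from the example in this issue. It assumes the layout shown above (one ^A-separated header line in _metadata, ^A-separated rows in the data file); the function name is hypothetical, and the file contents are passed in as strings so the sketch runs without a Hive installation.

```python
CTRL_A = "\x01"  # Hive's default field separator, shown as ^A in the issue


def rows_with_header(metadata_text, data_text, sep=CTRL_A):
    """Pair the column names from the _metadata text with each data row."""
    columns = metadata_text.strip("\n").split(sep)
    return [
        dict(zip(columns, line.split(sep)))
        for line in data_text.splitlines()
        if line
    ]


# Contents copied from the example in the issue description.
metadata = "key\x01value\n"
data = "1\x0110\n2\x0120\n"

for row in rows_with_header(metadata, data):
    print(row)
```

Keeping the header in its own file, as the patch proposes, means the data files can still be concatenated or reloaded without first stripping a header line.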
[jira] [Updated] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4064:
Assignee: caofangkun

Handle db qualified names consistently across all HiveQL statements

Key: HIVE-4064
URL: https://issues.apache.org/jira/browse/HIVE-4064
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: caofangkun

Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, others such as CREATE INDEX don't.
[jira] [Created] (HIVE-4671) When HiveQL WHERE clause is a constant value or a single column name , it should be handled in reason
caofangkun created HIVE-4671:

Summary: When HiveQL WHERE clause is a constant value or a single column name , it should be handled in reason
Key: HIVE-4671
URL: https://issues.apache.org/jira/browse/HIVE-4671
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor

Users can specify any condition in the WHERE clause, which is used to compare a given value with the field values in a Hive table.
Generally speaking, a WHERE condition should consist of key-value pairs like `column_name = 'const value' and `.
But the following three kinds of statement should also be handled reasonably.

Statement One: the WHERE condition is a boolean value
The following two queries work well but should be optimized into non-MR fetches:
SELECT * FROM src WHERE true;
SELECT * FROM src WHERE false;

Statement Two: the WHERE condition is a single constant value
The following queries generate a runtime ClassCastException.
Should this be optimized so that an Integer not equal to 0 is treated as TRUE, and otherwise FALSE?
SELECT * FROM src WHERE 1;
SELECT * FROM src WHERE 0;
SELECT * FROM src WHERE -1;

Statement Three: the WHERE condition is a single column name
The following query generates a runtime ClassCastException too.
Should this be reported as a SemanticException instead?
SELECT * FROM src WHERE key;
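The handling the reporter asks about can be written down as a small decision function. The Python sketch below only restates the questions raised in this issue (booleans used directly, integers treated as TRUE when non-zero, anything else rejected at analysis time); it is not how Hive evaluates predicates, and the function name and the choice of exception are assumptions made for illustration.

```python
def fold_constant_predicate(value):
    """Reduce a constant WHERE condition to a boolean, following the
    behaviour proposed (as questions) in the issue."""
    # bool must be checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value != 0
    # Stand-in for the SemanticException the reporter suggests for
    # conditions that are not boolean-like, e.g. a bare string column.
    raise TypeError(f"WHERE condition is not boolean-like: {value!r}")


for constant in (True, False, 1, 0, -1):
    print(constant, "->", fold_constant_predicate(constant))
```

With such folding done at compile time, WHERE false could return an empty result without launching a job, which matches the "non-MR fetching" optimization mentioned under Statement One.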
[jira] [Commented] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677784#comment-13677784 ]

caofangkun commented on HIVE-3682:

Hi [~navis]
Could you please put this into the wiki?
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintofilesystemfromqueries

when output hive table to file,users should could have a separator of their own choice

Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
java version 1.6.0_25
hadoop-0.20.2-cdh3u0
hive-0.8.1
Reporter: caofangkun
Assignee: Sushanth Sowmyan
Fix For: 0.11.0
Attachments: HIVE-3682-1.patch, HIVE-3682.D10275.1.patch, HIVE-3682.D10275.2.patch, HIVE-3682.D10275.3.patch, HIVE-3682.D10275.4.patch, HIVE-3682.D10275.4.patch.for.0.11, HIVE-3682.with.serde.patch

By default, when a Hive table is output to a file, its columns are separated by the ^A character (that is, \001).
But users should be able to set a separator of their own choice.

Usage Example:
create table for_test (key string, value string);
load data local inpath './in1.txt' into table for_test
select * from for_test;

UT-01: default separator is \001, line separator is \n
insert overwrite local directory './test-01' select * from src ;

create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
load data local inpath "../hive/examples/files/arraytest.txt" overwrite into table table2;
CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: define the field separator as ':'
insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-03: the line separator is NOT ALLOWED to be defined as another separator
insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-04: define map separators
insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
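For anyone consuming the files written by the statements above, the following Python sketch shows how the separators from UT-01, UT-02 and UT-04 would be applied when reading a line back. The sample rows (238, val_238, k1:v1 and so on) are invented for illustration and do not come from the issue; the function names are hypothetical.

```python
def split_fields(line, field_sep="\x01"):
    """Split one output line into column values (default separator is \\001)."""
    return line.rstrip("\n").split(field_sep)


def split_map(text, item_sep=",", key_sep=":"):
    """Split a serialized map column into a dict, using the
    COLLECTION ITEMS and MAP KEYS separators."""
    return dict(item.split(key_sep, 1) for item in text.split(item_sep) if item)


# UT-01: default ^A separator.
default_row = split_fields("238\x01val_238\n")

# UT-02: FIELDS TERMINATED BY ':'.
colon_row = split_fields("238:val_238\n", field_sep=":")

# UT-04: tab between fields, ',' between map entries, ':' between key and value.
map_row = split_fields("foo\tk1:v1,k2:v2\n", field_sep="\t")
bar = split_map(map_row[1])

print(default_row, colon_row, map_row[0], bar)
```

Note that with ':' as the field separator (UT-02), a map column that also uses ':' for its keys would be ambiguous to split, which is one reason to pick separators that cannot occur in the data.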
[jira] [Assigned] (HIVE-4674) add jar for a jar on hdfs doesn't copy jars referenced in manfiest's classpath attribute
[ https://issues.apache.org/jira/browse/HIVE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun reassigned HIVE-4674:
Assignee: caofangkun

add jar for a jar on hdfs doesn't copy jars referenced in manfiest's classpath attribute

Key: HIVE-4674
URL: https://issues.apache.org/jira/browse/HIVE-4674
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Ian Robertson
Assignee: caofangkun

The ability to load UDFs via add jar when the jar is in HDFS was added in HIVE-1157. Unfortunately, if the jar's manifest.mf file has a Class-Path attribute referencing jars in the same directory, these jars are not copied from HDFS to the local file system.
It would be great if, after copying the specified jar from HDFS, that jar's manifest was inspected for a Class-Path attribute, and any mentioned jars were recursively copied in the same fashion.
If this is seen as a desirable feature, I'd be happy to contribute a patch.
[jira] [Updated] (HIVE-4674) add jar for a jar on hdfs doesn't copy jars referenced in manfiest's classpath attribute
[ https://issues.apache.org/jira/browse/HIVE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4674:
Assignee: (was: caofangkun)
add jar for a jar on hdfs doesn't copy jars referenced in manfiest's classpath attribute
Key: HIVE-4674
URL: https://issues.apache.org/jira/browse/HIVE-4674
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Ian Robertson
The ability to load UDFs via add jar when the jar is in HDFS was added in HIVE-1157. Unfortunately, if the jar's manifest.mf file has a Class-Path attribute referencing jars in the same directory, these jars are not copied from HDFS to the local file system. It would be great if, after copying the specified jar from HDFS, that jar's manifest were inspected for a Class-Path attribute, and any jars it mentions were recursively copied in the same fashion. If this is seen as a desirable feature, I'd be happy to contribute a patch.
[jira] [Updated] (HIVE-4659) while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate Incomplete results
[ https://issues.apache.org/jira/browse/HIVE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4659:
Attachment: HIVE-4659-1.patch
https://reviews.apache.org/r/11652/
while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate Incomplete results
Key: HIVE-4659
URL: https://issues.apache.org/jira/browse/HIVE-4659
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4659-1.patch

drop view if exists v_test;
CREATE VIEW v_test AS
select
key,-- start by \t\t
value, -- start by \t\t
dt from -- start by \t\t
( select key, value, dt from tmp_v_t1 where dt='20130122' union all select key,value, dt from tmp_v_t1 where dt='20130123' ) t;

$ hive -e "show create table v_test"
UT-One: the three lines that start with \t are lost in the create statement!
Logging initialized using configuration in file:/home/zongren/hive-conf/hive-log4j.properties
Hive history file=/tmp/zongren/hive_job_log_zongren_24155@hd17-vm5_201306051125_94165790.txt
OK
CREATE VIEW v_test AS select ( select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122' union all select `tmp_v_t1`.`key`,`tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123' ) `t`
Time taken: 2.767 seconds, Fetched: 9 row(s)
UT-Two:
[jira] [Assigned] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun reassigned HIVE-4346:
Assignee: caofangkun
when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch

For example:
hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata'
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
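A minimal sketch of how a downstream script might consume the output shown in the example above, assuming the proposed feature writes the column names to a `_metadata` file using the same ^A (\x01) separator as the data files. The function name is hypothetical, and the file layout is taken only from the listing above, not from the attached patches.

```python
import os


def read_with_header(directory, data_file, sep="\x01"):
    """Pair each data row with the column names stored in _metadata.

    Hypothetical reader: assumes _metadata holds one line of column
    names separated by `sep`, as in the example listing.
    """
    with open(os.path.join(directory, "_metadata"), encoding="utf-8") as f:
        columns = f.readline().rstrip("\n").split(sep)
    rows = []
    with open(os.path.join(directory, data_file), encoding="utf-8") as f:
        for line in f:
            rows.append(dict(zip(columns, line.rstrip("\n").split(sep))))
    return rows
```

For the example directory, `read_with_header("./test1", "00_0")` would pair the `key` and `value` column names with each data row.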
[jira] [Assigned] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun reassigned HIVE-4367:
Assignee: caofangkun
enhance TRUNCATE syntex to drop data of external table
Key: HIVE-4367
URL: https://issues.apache.org/jira/browse/HIVE-4367
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4367-1.patch

In my use case, I sometimes have to remove the data of external tables to free up storage space on the cluster. So it is necessary to enhance the syntax, for example
TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;
to remove data from an EXTERNAL table.
I also added a configuration property that controls whether the removed data is moved to the Trash:
<property>
<name>hive.truncate.skiptrash</name>
<value>false</value>
<description>if true will remove data to trash, else false drop data immediately</description>
</property>

For example:
hive (default)> TRUNCATE TABLE external1 partition (ds='11');
FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
[2013-04-16 17:15:52]: Compile Start
[2013-04-16 17:15:52]: Compile End
[2013-04-16 17:15:52]: OK
[2013-04-16 17:15:52]: Time taken: 0.413 seconds
hive (default)> set hive.truncate.skiptrash;
hive.truncate.skiptrash=false
hive (default)> set hive.truncate.skiptrash=true;
hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
[2013-04-16 17:16:21]: Compile Start
[2013-04-16 17:16:21]: Compile End
[2013-04-16 17:16:21]: OK
[2013-04-16 17:16:21]: Time taken: 0.143 seconds
hive (default)> dfs -ls /user/test/.Trash/Current/;
Found 1 items
drwxr-xr-x - test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11
[jira] [Created] (HIVE-4659) while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate Incomplete results
caofangkun created HIVE-4659:
Summary: while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate Incomplete results
Key: HIVE-4659
URL: https://issues.apache.org/jira/browse/HIVE-4659
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor

drop view if exists v_test;
CREATE VIEW v_test AS
select
key,-- start by \t\t
value, -- start by \t\t
dt from -- start by \t\t
( select key, value, dt from tmp_v_t1 where dt='20130122' union all select key,value, dt from tmp_v_t1 where dt='20130123' ) t;

$ hive -e "show create table v_test"
UT-One: the three lines that start with \t are lost in the create statement!
Logging initialized using configuration in file:/home/zongren/hive-conf/hive-log4j.properties
Hive history file=/tmp/zongren/hive_job_log_zongren_24155@hd17-vm5_201306051125_94165790.txt
OK
CREATE VIEW v_test AS select ( select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122' union all select `tmp_v_t1`.`key`,`tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123' ) `t`
Time taken: 2.767 seconds, Fetched: 9 row(s)
UT-Two:
[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4032:
Assignee: caofangkun
Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4032-1.patch

Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also expands to the virtual partition columns.

drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';

Query One works well, but Query Two fails with the following message:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.
P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0.
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659381#comment-13659381 ]
caofangkun commented on HIVE-4568:
env:* and system:* variables cannot be set. Other variables can already be set in Beeline.
Beeline needs to support resolving variables
Key: HIVE-4568
URL: https://issues.apache.org/jira/browse/HIVE-4568
Project: Hive
Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Priority: Minor
Beeline currently doesn't support variable (system, env, etc.) substitution as the Hive client does. Supporting this feature will certainly make it more usable.
[jira] [Created] (HIVE-4562) Some jars of Hive are required to be deployed on every salve of hadoop cluster,we'd better separate these jars from common client-side-jars
caofangkun created HIVE-4562:
Summary: Some jars of Hive are required to be deployed on every salve of hadoop cluster,we'd better separate these jars from common client-side-jars
Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Clients
Reporter: caofangkun
Priority: Minor

Some Hive jars are required not only by the client but also on the server side (every Hadoop slave). We could use the 'add jar' command to put these jars in the distributed cache, but the common approach is to add them to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.
For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should have mysql-connector-java-.jar in $HADOOP_HOME/lib/.
For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar
These jars should be separated from the other common client-side jars.
[jira] [Updated] (HIVE-4562) Some jars of Hive are required to be deployed on every salve of hadoop cluster,we'd better separate these jars from common client-side-jars
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4562:
Description:
Some Hive jars are required not only by the client but also on the server side (every Hadoop slave). We could use the 'add jar' command to put these jars in the distributed cache, but the common approach is to add them to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.
For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should have mysql-connector-java-.jar in $HADOOP_HOME/lib/.
For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar
These jars should be separated from the other common client-side jars.

Some jars of Hive are required to be deployed on every salve of hadoop cluster,we'd better separate these jars from common client-side-jars
Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Clients
Reporter: caofangkun
Priority: Minor
[jira] [Updated] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4562:
Attachment: HIVE-4562-1.patch
Add patch
HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Clients
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4562-1.patch

Some Hive jars are required not only by the client but also on the server side (every Hadoop slave). We could use the 'add jar' command to put these jars in the distributed cache, but the common approach is to add them to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.
For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should have mysql-connector-java-.jar in $HADOOP_HOME/lib/.
For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar
These jars should be separated from the other common client-side jars.
[jira] [Updated] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4562:
Attachment: HIVE-4562-2.patch
Only jackson-core-asl-1.8.8.jar and jackson-mapper-asl-1.8.8.jar need to be packed in.
HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Clients
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch

Some Hive jars are required not only by the client but also on the server side (every Hadoop slave). We could use the 'add jar' command to put these jars in the distributed cache, but the common approach is to add them to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.
For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should have mysql-connector-java-.jar in $HADOOP_HOME/lib/.
For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar
These jars should be separated from the other common client-side jars.
[jira] [Updated] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4562:
Component/s: (was: Clients) Build Infrastructure
HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch

Some Hive jars are required not only by the client but also on the server side (every Hadoop slave). We could use the 'add jar' command to put these jars in the distributed cache, but the common approach is to add them to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.
For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should have mysql-connector-java-.jar in $HADOOP_HOME/lib/.
For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar
These jars should be separated from the other common client-side jars.
[jira] [Commented] (HIVE-4557) Beeline should support 'quit' to exit beeline shell
[ https://issues.apache.org/jira/browse/HIVE-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658194#comment-13658194 ]
caofangkun commented on HIVE-4557:
Hi [~xuefu.w...@kodak.com], just try:
beeline> !q
or
beeline> !quit
Beeline should support 'quit' to exit beeline shell
Key: HIVE-4557
URL: https://issues.apache.org/jira/browse/HIVE-4557
Project: Hive
Issue Type: Improvement
Components: Clients
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Right now, ctrl+d can exit the beeline shell, which is not friendly. It would be better to support quit or quit; to exit the shell.
[jira] [Commented] (HIVE-4536) INSERT OVERWRITE LOCAL DIRECTORY sum() function will write error result to local file
[ https://issues.apache.org/jira/browse/HIVE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656914#comment-13656914 ]
caofangkun commented on HIVE-4536:
This is a Java double-rounding (floating-point precision) issue, not a Hive bug. Other languages exhibit similar behaviour, not only Java.
INSERT OVERWRITE LOCAL DIRECTORY sum() function will write error result to local file
Key: HIVE-4536
URL: https://issues.apache.org/jira/browse/HIVE-4536
Project: Hive
Issue Type: Bug
Affects Versions: 0.8.1
Reporter: sutao bian

When I use insert overwrite local directory with sum(), it sometimes writes an incorrect result, 729.40001; the correct result is 729.4. Sometimes it writes the correct result and sometimes the wrong one.

hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/opt/data/data1' select concat(year,'-',month,'-',day),flog_appid,case when flog_country ='null' then 'UNKNOW' else flog_country end,sum(flog_expense) from tbl_ad_finance_log where year='${YEAR}' and month='${MONTH}' and day='${DAY}' group by flog_appid,case when flog_country ='null' then 'UNKNOW' else flog_country end,concat(year,'-',month,'-',day)"
2013-05-09^A72665@5136b689dc317bc00303^AIN^A729.40001

hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/opt/data/data2' select concat(year,'-',month,'-',day),flog_appid,case when flog_country ='null' then 'UNKNOW' else flog_country end,sum(flog_expense) from tbl_ad_finance_log where year='${YEAR}' and month='${MONTH}' and day='${DAY}' group by flog_appid,case when flog_country ='null' then 'UNKNOW' else flog_country end,concat(year,'-',month,'-',day)"
2013-05-09^A72665@5136b689dc317bc00303^AIN^A729.4

Thanks in advance.
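The comment above can be illustrated with a small sketch. It assumes that flog_expense is a binary floating-point column and that the rows reach sum() in a different order from run to run; neither is confirmed in the thread. Floating-point addition is not associative, so the same values added in a different order can differ in the last digits. The values below are illustrative, not taken from the reporter's data.

```python
import math

# The same three values, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                   # last digit differs from the next line
print(right_to_left)
print(left_to_right == right_to_left)  # False: addition order matters

# Common workarounds: round for presentation, or use a compensated sum.
print(round(left_to_right, 4) == round(right_to_left, 4))
print(math.fsum([0.1, 0.2, 0.3]))
```

When an exact decimal result is required, summing in a decimal type or rounding the final value is the usual remedy, rather than relying on a particular order of addition.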
[jira] [Created] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
caofangkun created HIVE-4561:
Summary: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
Key: HIVE-4561
URL: https://issues.apache.org/jira/browse/HIVE-4561
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Priority: Minor

If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.

hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;

mysql> select * from TAB_COL_STATS \G;
*** 1. row ***
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0. # Wrong Result ! Expected is 1.
DOUBLE_HIGH_VALUE: 3.
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
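The symptom above is what one would see if the low/high accumulators started at 0.0 instead of at a neutral value. That cause is an assumption made for illustration; the report does not show the relevant Hive code, and the function names below are hypothetical.

```python
def low_value_buggy(values):
    """Assumed bug pattern: the running minimum starts at 0.0."""
    low = 0.0
    for v in values:
        low = min(low, v)
    return low  # never exceeds 0.0, even if every value is positive


def low_value_fixed(values):
    """Start from +infinity so the first value always replaces it."""
    low = float("inf")
    for v in values:
        low = min(low, v)
    return low if values else None  # no rows: no meaningful minimum


if __name__ == "__main__":
    prices = [1.0, 2.0, 3.0]  # the three rows loaded into src_test
    print(low_value_buggy(prices), low_value_fixed(prices))
```

The high value is symmetric: starting the running maximum at 0.0 caps it from below, so a column of only negative values would report 0.0, while starting at negative infinity would not.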
[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4561:
Description:
If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.

hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;

mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0. # Wrong Result ! Expected is 1.
DOUBLE_HIGH_VALUE: 3.
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)

Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
Key: HIVE-4561
URL: https://issues.apache.org/jira/browse/HIVE-4561
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Priority: Minor
[jira] [Updated] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caofangkun updated HIVE-4346:
Attachment: HIVE-4346-3.patch
when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch

For example:
hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata'
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
[jira] [Commented] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651545#comment-13651545 ]
caofangkun commented on HIVE-4392:
hive> create table summary as select *, sum(key), count(value) from src;
hive> select * from summary;
130091.0500 130091.0500
Does this kind of query generate only one row of results? And are the first two column values incorrect?
Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
Key: HIVE-4392
URL: https://issues.apache.org/jira/browse/HIVE-4392
Project: Hive
Issue Type: Bug
Components: Query Processor
Environment: Apache Hadoop 0.20.1, Apache Hive Trunk
Reporter: caofangkun
Assignee: Navis
Priority: Minor
Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch, HIVE-4392.D10431.5.patch

For example:
hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100%
2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0003
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
MapReduce Jobs Launched:
Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100%
2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0004
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
MapReduce Jobs Launched:
Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

But the following two queries work:
hive (default)> create table liza_1 as select * from new_src;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0006
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to:
[jira] [Updated] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caofangkun updated HIVE-4392: - Description: For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following tow Queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0006 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. 
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 Table default.liza_1 stats: [num_partitions: 0, num_files: 0, num_rows: 0, total_size: 0, raw_data_size: 0] MapReduce Jobs Launched: Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec OK Time taken: 9.576 seconds hive (default) create table liza_1 as select sum (key), sum(value) from new_test; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0008, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0008 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0008 Hadoop job information for Stage-1: number
[jira] [Commented] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13651575#comment-13651575 ] caofangkun commented on HIVE-4392: -- hive (default) select * from src_for_test; 35 48 100 100 hive (default) select *, sum(key), count(value) from src_for_test group by key, value ; 35 0.0 1 0.0 1 100 100 100.0 1 100.0 1 48 48.01 48.01 two more columns generated! Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns -- Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Environment: Apache Hadoop 0.20.1 Apache Hive Trunk Reporter: caofangkun Assignee: Navis Priority: Minor Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch, HIVE-4392.D10431.5.patch For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a 
valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following tow Queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006 Hadoop 
job information for Stage-1: number of mappers: 0; number of reducers: 0 2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0006 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. Moving data to:
[jira] [Created] (HIVE-4522) Confusing result generated when use mulit aggregate functions with star columns
caofangkun created HIVE-4522:

Summary: Confusing result generated when use mulit aggregate functions with star columns
Key: HIVE-4522
URL: https://issues.apache.org/jira/browse/HIVE-4522
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Priority: Minor

hive (default)> set hive.cli.print.header=true;
hive (default)> select * from src;
OK
key value
35 48 100 100

Table src has two columns: key and value.
But guess how many columns the following query will generate? Three? No, it's two.

hive (default)> select * , count(key) as cnt from src;
OK
(tok_function count (tok_table_or_col key)) cnt
3 3

And what about this query?
hive (default)> select * , count(key), sum(value) as cnt from src group by key, value;
Four columns? No, it's six!

hive (default)> select * , count(key) as cnt , sum(value) as sum_value from src group by key, value ;
OK
(tok_table_or_col key) (tok_table_or_col value) (tok_function count (tok_table_or_col key)) (tok_function sum (tok_table_or_col value)) cnt sum_value
35 1 35.01 35.0 100 100 1 100.0 1 100.0 48 1 0.0 1 0.0

The column names do not match and the result is Confusing.

Have a look at how such kind of queries work in MySQL:

mysql> select *, sum(id),count(data) from example ;
+----+------+---------+-------------+
| id | data | sum(id) | count(data) |
+----+------+---------+-------------+
|  1 |    2 |       6 |           3 |
+----+------+---------+-------------+
1 row in set (0.03 sec)

mysql> select *, sum(id) from example ;
+----+------+---------+
| id | data | sum(id) |
+----+------+---------+
|  1 |    2 |       6 |
+----+------+---------+
1 row in set (0.09 sec)

mysql> select *, sum(id),count(data) from example group by id, data ;
+----+------+---------+-------------+
| id | data | sum(id) | count(data) |
+----+------+---------+-------------+
|  1 |    2 |       1 |           1 |
|  2 |    2 |       2 |           1 |
|  3 |    3 |       3 |           1 |
+----+------+---------+-------------+
3 rows in set (0.00 sec)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
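The column counts the report expects (table columns plus one column per aggregate) can be checked on any engine that accepts bare columns next to aggregates. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in; it is neither Hive nor the MySQL session quoted above, and the rows inserted into src are made up for illustration.

```python
import sqlite3


def result_shape(conn, query):
    """Return (list of column names, number of rows) for a query."""
    cur = conn.execute(query)
    names = [d[0] for d in cur.description]
    rows = cur.fetchall()
    return names, len(rows)


# Hypothetical data: these rows are illustrative, not the reporter's table.
conn = sqlite3.connect(":memory:")
conn.execute("create table src (key integer, value integer)")
conn.executemany("insert into src values (?, ?)", [(35, 1), (48, 2), (100, 100)])

# Star plus one aggregate: two table columns + one aggregate column.
names_one, rows_one = result_shape(conn, "select *, count(key) as cnt from src")

# Star plus two aggregates with group by: two table columns + two aggregates,
# one output row per (key, value) group.
names_two, rows_two = result_shape(
    conn,
    "select *, count(key) as cnt, sum(value) as sum_value "
    "from src group by key, value",
)

print(names_one, rows_one)
print(names_two, rows_two)
```

With this stand-in the first query reports three columns and the second four, which is the shape the report argues Hive should produce as well.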
[jira] [Commented] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640697#comment-13640697 ] caofangkun commented on HIVE-4367: -- Thanks [~teddy.choi] I have read the cwiki , it's helpful for me. enhance TRUNCATE syntex to drop data of external table Key: HIVE-4367 URL: https://issues.apache.org/jira/browse/HIVE-4367 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: caofangkun Priority: Minor Attachments: HIVE-4367-1.patch In my use case , sometimes I have to remove data of external tables to free up storage space of the cluster . So it's necessary for to enhance the syntax like TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE; to remove data from EXTERNAL table. And I add a configuration property to enable remove data to Trash property namehive.truncate.skiptrash/name valuefalse/value description if true will remove data to trash, else false drop data immediately /description /property For example : hive (default) TRUNCATE TABLE external1 partition (ds='11'); FAILED: Error in semantic analysis: Cannot truncate non-managed table external1 hive (default) TRUNCATE TABLE external1 partition (ds='11') FORCE; [2013-04-16 17:15:52]: Compile Start [2013-04-16 17:15:52]: Compile End [2013-04-16 17:15:52]: OK [2013-04-16 17:15:52]: Time taken: 0.413 seconds hive (default) set hive.truncate.skiptrash; hive.truncate.skiptrash=false hive (default) set hive.truncate.skiptrash=true; hive (default) TRUNCATE TABLE external1 partition (ds='12') FORCE; [2013-04-16 17:16:21]: Compile Start [2013-04-16 17:16:21]: Compile End [2013-04-16 17:16:21]: OK [2013-04-16 17:16:21]: Time taken: 0.143 seconds hive (default) dfs -ls /user/test/.Trash/Current/; Found 1 items drwxr-xr-x -test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11 -- This message is automatically generated by JIRA. 
[jira] [Commented] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640700#comment-13640700 ]

caofangkun commented on HIVE-3682:

Hi [~gangtimliu], I'm not a committer yet, so I could not assign this issue to myself. Please feel free to assign this issue. Thanks

when output hive table to file,users should could have a separator of their own choice

Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
java version 1.6.0_25
hadoop-0.20.2-cdh3u0
hive-0.8.1
Reporter: caofangkun
Assignee: Gang Tim Liu
Attachments: HIVE-3682-1.patch, HIVE-3682.D10275.1.patch, HIVE-3682.with.serde.patch

By default, when a Hive table is written out to a file, its columns are separated by the ^A character (that is, \001). But users should be able to set a separator of their own choice.

Usage Example:
create table for_test (key string, value string);
load data local inpath './in1.txt' into table for_test
select * from for_test;

UT-01: the default field separator is \001 and the line separator is \n
insert overwrite local directory './test-01' select * from src ;

create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
load data local inpath ../hive/examples/files/arraytest.txt overwrite into table table2;
CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: define the field separator as ':'
insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-03: the line separator is NOT ALLOWED to be defined as another separator
insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-04: define map separators
insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
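To make the motivation concrete, here is roughly what a consumer of the exported files has to do when only the default separator is available: split every line on \001 and re-join it with the separator it actually wants. This is a minimal sketch in plain Python, not part of any patch attached to this issue; the sample row and the helper name reseparate are made up.

```python
DEFAULT_FIELD_SEP = "\x01"  # ^A, the default field separator described above


def reseparate(line, new_sep, old_sep=DEFAULT_FIELD_SEP):
    """Re-join the fields of one exported line with a different separator."""
    # Drop the trailing line separator, split on the old field separator,
    # then join the fields with the requested one.
    return new_sep.join(line.rstrip("\n").split(old_sep))


# Hypothetical exported row with two columns (key, value).
exported = "238\x01val_238\n"
print(reseparate(exported, ":"))
```

A ROW FORMAT clause on the insert statement, as in UT-02 and UT-04, would make this post-processing step unnecessary.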
[jira] [Created] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
caofangkun created HIVE-4392: Summary: Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Reporter: caofangkun Priority: Minor For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following tow Queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0006 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. 
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 Table default.liza_1 stats: [num_partitions: 0, num_files: 0, num_rows: 0, total_size: 0, raw_data_size: 0] MapReduce Jobs Launched: Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec OK Time taken: 9.576 seconds hive (default) create table liza_1 as select sum (key), sum(value) from new_test; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job =
[jira] [Updated] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caofangkun updated HIVE-4367: - Attachment: HIVE-4367-1.patch https://reviews.apache.org/r/10600/ enhance TRUNCATE syntex to drop data of external table Key: HIVE-4367 URL: https://issues.apache.org/jira/browse/HIVE-4367 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: caofangkun Assignee: Teddy Choi Priority: Minor Attachments: HIVE-4367-1.patch In my use case , sometimes I have to remove data of external tables to free up storage space of the cluster . So it's necessary for to enhance the syntax like TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE; to remove data from EXTERNAL table. And I add a configuration property to enable remove data to Trash property namehive.truncate.skiptrash/name valuefalse/value description if true will remove data to trash, else false drop data immediately /description /property For example : hive (default) TRUNCATE TABLE external1 partition (ds='11'); FAILED: Error in semantic analysis: Cannot truncate non-managed table external1 hive (default) TRUNCATE TABLE external1 partition (ds='11') FORCE; [2013-04-16 17:15:52]: Compile Start [2013-04-16 17:15:52]: Compile End [2013-04-16 17:15:52]: OK [2013-04-16 17:15:52]: Time taken: 0.413 seconds hive (default) set hive.truncate.skiptrash; hive.truncate.skiptrash=false hive (default) set hive.truncate.skiptrash=true; hive (default) TRUNCATE TABLE external1 partition (ds='12') FORCE; [2013-04-16 17:16:21]: Compile Start [2013-04-16 17:16:21]: Compile End [2013-04-16 17:16:21]: OK [2013-04-16 17:16:21]: Time taken: 0.143 seconds hive (default) dfs -ls /user/test/.Trash/Current/; Found 1 items drwxr-xr-x -test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11 -- This message is automatically generated by JIRA. 
[jira] [Commented] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13634911#comment-13634911 ] caofangkun commented on HIVE-4367: -- Hi [~teddy.choi] ,Sorry for that I did not notice you have assigned this issue when I upload the patch. I'm not a committer yet,so please feel free and assign this issue. Thank you. enhance TRUNCATE syntex to drop data of external table Key: HIVE-4367 URL: https://issues.apache.org/jira/browse/HIVE-4367 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: caofangkun Priority: Minor Attachments: HIVE-4367-1.patch In my use case , sometimes I have to remove data of external tables to free up storage space of the cluster . So it's necessary for to enhance the syntax like TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE; to remove data from EXTERNAL table. And I add a configuration property to enable remove data to Trash property namehive.truncate.skiptrash/name valuefalse/value description if true will remove data to trash, else false drop data immediately /description /property For example : hive (default) TRUNCATE TABLE external1 partition (ds='11'); FAILED: Error in semantic analysis: Cannot truncate non-managed table external1 hive (default) TRUNCATE TABLE external1 partition (ds='11') FORCE; [2013-04-16 17:15:52]: Compile Start [2013-04-16 17:15:52]: Compile End [2013-04-16 17:15:52]: OK [2013-04-16 17:15:52]: Time taken: 0.413 seconds hive (default) set hive.truncate.skiptrash; hive.truncate.skiptrash=false hive (default) set hive.truncate.skiptrash=true; hive (default) TRUNCATE TABLE external1 partition (ds='12') FORCE; [2013-04-16 17:16:21]: Compile Start [2013-04-16 17:16:21]: Compile End [2013-04-16 17:16:21]: OK [2013-04-16 17:16:21]: Time taken: 0.143 seconds hive (default) dfs -ls /user/test/.Trash/Current/; Found 1 items drwxr-xr-x -test supergroup 0 2013-04-16 17:06 
/user/test/.Trash/Current/ds=11
[jira] [Created] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
caofangkun created HIVE-4367:

Summary: enhance TRUNCATE syntex to drop data of external table
Key: HIVE-4367
URL: https://issues.apache.org/jira/browse/HIVE-4367
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Priority: Minor

In my use case, I sometimes have to remove the data of external tables to free up storage space on the cluster.
So it's necessary to enhance the syntax, like
TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;
to remove data from an EXTERNAL table.

I also add a configuration property to enable removing data to Trash:
<property>
  <name>hive.truncate.skiptrash</name>
  <value>false</value>
  <description>
    if true will remove data to trash, else false drop data immediately
  </description>
</property>

For example:
hive (default)> TRUNCATE TABLE external1 partition (ds='11');
FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
[2013-04-16 17:15:52]: Compile Start
[2013-04-16 17:15:52]: Compile End
[2013-04-16 17:15:52]: OK
[2013-04-16 17:15:52]: Time taken: 0.413 seconds
hive (default)> set hive.truncate.skiptrash;
hive.truncate.skiptrash=false
hive (default)> set hive.truncate.skiptrash=true;
hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
[2013-04-16 17:16:21]: Compile Start
[2013-04-16 17:16:21]: Compile End
[2013-04-16 17:16:21]: OK
[2013-04-16 17:16:21]: Time taken: 0.143 seconds
hive (default)> dfs -ls /user/test/.Trash/Current/;
Found 1 items
drwxr-xr-x - kun.cao supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11
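The property proposed above follows the usual Hadoop-style configuration layout. As a quick well-formedness check, here is a minimal sketch that reads the name and value back with Python's standard xml.etree.ElementTree. Note that hive.truncate.skiptrash is only proposed in this issue, and read_property is a made-up helper, not a Hive or Hadoop API.

```python
import xml.etree.ElementTree as ET

# The property exactly as proposed in this issue (not an existing Hive setting).
PROPERTY_XML = """
<property>
  <name>hive.truncate.skiptrash</name>
  <value>false</value>
  <description>if true will remove data to trash, else false drop data immediately</description>
</property>
"""


def read_property(xml_text):
    """Return (name, value) from a single Hadoop-style <property> element."""
    prop = ET.fromstring(xml_text.strip())
    return prop.findtext("name").strip(), prop.findtext("value").strip()


name, value = read_property(PROPERTY_XML)
print(name, value)
```

The value read here matches what the CLI session reports for `set hive.truncate.skiptrash;` before it is changed.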
[jira] [Updated] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4367:

Description:

In my use case, I sometimes have to remove the data of external tables to free up storage space on the cluster. So it is necessary to enhance the syntax, for example

    TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;

to remove data from an EXTERNAL table. I also added a configuration property that controls whether the removed data goes to the Trash:

    <property>
      <name>hive.truncate.skiptrash</name>
      <value>false</value>
      <description>if true, drop the data immediately; if false, move the removed data to the trash</description>
    </property>

For example:

    hive (default)> TRUNCATE TABLE external1 partition (ds='11');
    FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
    hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
    [2013-04-16 17:15:52]: Compile Start
    [2013-04-16 17:15:52]: Compile End
    [2013-04-16 17:15:52]: OK
    [2013-04-16 17:15:52]: Time taken: 0.413 seconds
    hive (default)> set hive.truncate.skiptrash;
    hive.truncate.skiptrash=false
    hive (default)> set hive.truncate.skiptrash=true;
    hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
    [2013-04-16 17:16:21]: Compile Start
    [2013-04-16 17:16:21]: Compile End
    [2013-04-16 17:16:21]: OK
    [2013-04-16 17:16:21]: Time taken: 0.143 seconds
    hive (default)> dfs -ls /user/test/.Trash/Current/;
    Found 1 items
    drwxr-xr-x - test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11

was:

In my use case, I sometimes have to remove the data of external tables to free up storage space on the cluster. So it is necessary to enhance the syntax, for example

    TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;

to remove data from an EXTERNAL table. I also added a configuration property that controls whether the removed data goes to the Trash:

    <property>
      <name>hive.truncate.skiptrash</name>
      <value>false</value>
      <description>if true, drop the data immediately; if false, move the removed data to the trash</description>
    </property>

For example:

    hive (default)> TRUNCATE TABLE external1 partition (ds='11');
    FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
    hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
    [2013-04-16 17:15:52]: Compile Start
    [2013-04-16 17:15:52]: Compile End
    [2013-04-16 17:15:52]: OK
    [2013-04-16 17:15:52]: Time taken: 0.413 seconds
    hive (default)> set hive.truncate.skiptrash;
    hive.truncate.skiptrash=false
    hive (default)> set hive.truncate.skiptrash=true;
    hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
    [2013-04-16 17:16:21]: Compile Start
    [2013-04-16 17:16:21]: Compile End
    [2013-04-16 17:16:21]: OK
    [2013-04-16 17:16:21]: Time taken: 0.143 seconds
    hive (default)> dfs -ls /user/test/.Trash/Current/;
    Found 1 items
    drwxr-xr-x - kun.cao supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11

enhance TRUNCATE syntex to drop data of external table
Key: HIVE-4367
URL: https://issues.apache.org/jira/browse/HIVE-4367
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Priority: Minor
[jira] [Commented] (HIVE-446) Implement TRUNCATE
[ https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632717#comment-13632717 ]

caofangkun commented on HIVE-446:

Thank you [~gangtimliu]. I often have to remove data from external tables to free up storage space on the cluster, so I need a statement like "truncate ... force" to remove the data. I submitted an issue, https://issues.apache.org/jira/browse/HIVE-4367, in case it is helpful for people with a similar need.

Implement TRUNCATE
Key: HIVE-446
URL: https://issues.apache.org/jira/browse/HIVE-446
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
Fix For: 0.11.0
Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch

truncate the data but leave the table and metadata intact.
[jira] [Updated] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4346:

Attachment: HIVE-4346-1.patch

https://reviews.apache.org/r/10474/

when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.contain.columnnames=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !ls -l './test1';
    ./test1/_metadata
    ./test1/00_0
    hive> !cat './test1/_metadata'
    key^Avalue
    hive> !cat './test1/00_0';
    1^A10
    2^A20
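The layout shown in this example, a `_metadata` file holding the column names next to the data file, can be modelled outside Hive. The following Python sketch only illustrates that layout under stated assumptions: the helper name `write_with_metadata` is hypothetical, the data file name `00_0` is copied from the transcript, and `\x01` stands for the ^A separator.

```python
import os
import tempfile

FIELD_SEP = "\x01"  # the ^A character used as the default field separator


def write_with_metadata(out_dir, columns, rows, data_file="00_0"):
    """Write rows to a data file and the column names to a _metadata file.

    Hypothetical helper: it only mimics the directory layout in the example.
    """
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "_metadata"), "w", encoding="utf-8") as f:
        f.write(FIELD_SEP.join(columns) + "\n")
    with open(os.path.join(out_dir, data_file), "w", encoding="utf-8") as f:
        for row in rows:
            f.write(FIELD_SEP.join(str(v) for v in row) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_with_metadata(tmp, ["key", "value"], [(1, 10), (2, 20)])
        print(sorted(os.listdir(tmp)))
```

Keeping the names in a separate file leaves the data file unchanged for consumers that do not expect a header row.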
[jira] [Updated] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4346:

Description:

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.markschema=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !ls -l './test1';
    ./test1/_metadata
    ./test1/00_0
    hive> !cat './test1/_metadata'
    key^Avalue
    hive> !cat './test1/00_0';
    1^A10
    2^A20

was:

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.contain.columnnames=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !ls -l './test1';
    ./test1/_metadata
    ./test1/00_0
    hive> !cat './test1/_metadata'
    key^Avalue
    hive> !cat './test1/00_0';
    1^A10
    2^A20

when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch
[jira] [Updated] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4346:

Description:

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.contain.columnnames=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !ls -l './test1';
    ./test1/_metadata
    ./test1/00_0
    hive> !cat './test1/_metadata'
    key^Avalue
    hive> !cat './test1/00_0';
    1^A10
    2^A20

was:

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.contain.columnnames=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !cat './test1/00_0';
    key^Avalue
    1^A10
    2^A20

when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Priority: Minor
[jira] [Created] (HIVE-4346) when writing data into filesystem from queries ,the output files could contain a line of column names
caofangkun created HIVE-4346:

Summary: when writing data into filesystem from queries ,the output files could contain a line of column names
Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Priority: Minor

For example:

    hive> desc src;
    key string
    value string
    hive> select * from src;
    1 10
    2 20
    hive> set hive.output.contain.columnnames=true;
    hive> insert overwrite local directory './test1' select * from src ;
    hive> !cat './test1/00_0';
    key^Avalue
    1^A10
    2^A20
[jira] [Commented] (HIVE-446) Implement TRUNCATE
[ https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629960#comment-13629960 ]

caofangkun commented on HIVE-446:

Hi all: would it be worthwhile to enhance the syntax, for example

    TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;

to remove data from an EXTERNAL table?

Implement TRUNCATE
Key: HIVE-446
URL: https://issues.apache.org/jira/browse/HIVE-446
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Prasad Chakka
Assignee: Navis
Fix For: 0.11.0
Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch

truncate the data but leave the table and metadata intact.
[jira] [Commented] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629787#comment-13629787 ]

caofangkun commented on HIVE-3682:

Thanks [~sushanth], the STORED AS feature is very useful for me too.

when output hive table to file,users should could have a separator of their own choice
Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux java version 1.6.0_25 hadoop-0.20.2-cdh3u0 hive-0.8.1
Reporter: caofangkun
Assignee: Gang Tim Liu
Priority: Minor
Attachments: HIVE-3682-1.patch, HIVE-3682.with.serde.patch

By default, when a Hive table is written to a file, its columns are separated by the ^A character (that is, \001). Users should be able to set a separator of their own choice.

Usage Example:

    create table for_test (key string, value string);
    load data local inpath './in1.txt' into table for_test
    select * from for_test;

UT-01: default separator is \001, line separator is \n

    insert overwrite local directory './test-01' select * from src ;

    create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
    load data local inpath ../hive/examples/files/arraytest.txt overwrite into table table2;
    CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: field separator defined as ':'

    insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-03: the line separator is NOT ALLOWED to be redefined as another separator

    insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-04: define map separators

    insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
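To make the three separator levels in the usage example concrete (fields, collection items, map keys), here is a minimal Python sketch. It is not Hive's SerDe code: the function names `format_value` and `format_row` are hypothetical, and the defaults `\x01`, `\x02` and `\x03` are assumed to correspond to Hive's default delimiters for delimited text.

```python
from typing import Any, Sequence


def format_value(value: Any, item_sep: str, key_sep: str) -> str:
    """Serialize one column value; maps and arrays use the nested separators."""
    if isinstance(value, dict):
        return item_sep.join(f"{k}{key_sep}{v}" for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return item_sep.join(str(v) for v in value)
    return str(value)


def format_row(row: Sequence[Any], field_sep: str = "\x01",
               item_sep: str = "\x02", key_sep: str = "\x03") -> str:
    """Join the serialized columns of one row with the field separator."""
    return field_sep.join(format_value(v, item_sep, key_sep) for v in row)


if __name__ == "__main__":
    # UT-02 style: only the field separator is overridden.
    print(format_row(["k1", "v1"], field_sep=":"))
    # UT-04 style: field, collection-item and map-key separators all set.
    print(format_row(["foo", {"a": "1", "b": "2"}],
                     field_sep="\t", item_sep=",", key_sep=":"))
```

The line separator is deliberately not a parameter, which matches UT-03: rows stay newline-terminated whatever the other separators are.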
[jira] [Updated] (HIVE-4307) For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
[ https://issues.apache.org/jira/browse/HIVE-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4307:

Description:

For example:

    create table srcpart (key string, value string) partitioned by (ds string, hr string);
    hive> explain select * from srcpart where 'test'='test2';
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For views there is a potential danger too. For example:

    create view v_test_table ( id, key, type, dt) as select id, key ,type dt from ( select id ,key , 'apple' as type , dt from src_test_table where dt='20130326′ ) t;

The following three query statements work well:

    explain select * from v_test_table where id='123' ;
    explain select * from v_test_table where key='123' ;
    explain select * from v_test_table where dt='20130326' ;

But the following query fails:

    explain select * from v_test_table where type='orange' ;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

was:

For example:

    create table srcpart (key string, value string) partitioned by (ds string, hr string);
    hive> explain select * from srcpart where 'test'='test2′;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For views there is a potential danger too. For example:

    create view v_test_table ( id, key, type, dt) as select id, key ,type dt from ( select id ,key , 'apple' as type , dt from src_test_table where dt='20130326′ ) t;

The following three query statements work well:

    explain select * from v_test_table where id='123′ ;
    explain select * from v_test_table where key='123′ ;
    explain select * from v_test_table where dt='20130326′ ;

But the following query fails:

    explain select * from v_test_table where type='orange' ;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
Key: HIVE-4307
URL: https://issues.apache.org/jira/browse/HIVE-4307
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4307-1.patch
[jira] [Commented] (HIVE-4307) For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
[ https://issues.apache.org/jira/browse/HIVE-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625193#comment-13625193 ]

caofangkun commented on HIVE-4307:

[~navis] If the quotation marks are wrong, then a Parse Error should be thrown:

    hive (default)> explain select * from srcpart where 'test' = 'test’;
    [2013-04-08 14:38:58]: Compile Start
    FAILED: Parse Error: line 1:51 character 'EOF' not supported here
    line 1:50 character '’' not supported here
    [2013-04-08 14:38:58]: Compile End

For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
Key: HIVE-4307
URL: https://issues.apache.org/jira/browse/HIVE-4307
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4307-1.patch

For example:

    create table srcpart (key string, value string) partitioned by (ds string, hr string);
    hive> explain select * from srcpart where 'test'='test2';
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For views there is a potential danger too. For example:

    create view v_test_table ( id, key, type, dt) as select id, key ,type dt from ( select id ,key , 'apple' as type , dt from src_test_table where dt='20130326′ ) t;

The following three query statements work well:

    explain select * from v_test_table where id='123' ;
    explain select * from v_test_table where key='123' ;
    explain select * from v_test_table where dt='20130326' ;

But the following query fails:

    explain select * from v_test_table where type='orange' ;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))
[jira] [Updated] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3682:

Description:

By default, when a Hive table is written to a file, its columns are separated by the ^A character (that is, \001). Users should be able to set a separator of their own choice.

Usage Example:

    create table for_test (key string, value string);
    load data local inpath './in1.txt' into table for_test
    select * from for_test;

UT-01: default separator is \001, line separator is \n

    insert overwrite local directory './test-01' select * from src ;

    create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
    load data local inpath ../hive/examples/files/arraytest.txt overwrite into table table2;
    CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: field separator defined as ':'

    insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-03: the line separator is NOT ALLOWED to be redefined as another separator

    insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src ;

UT-04: define map separators

    insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;

was:

By default, when a Hive table is written to a file, its columns are separated by the ^A character (that is, \001). Users should be able to set a separator of their own choice.

when output hive table to file,users should could have a separator of their own choice
Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux java version 1.6.0_25 hadoop-0.20.2-cdh3u0 hive-0.8.1
Reporter: caofangkun
Assignee: Gang Tim Liu
Priority: Minor
Attachments: HIVE-3682-1.patch
[jira] [Commented] (HIVE-4307) For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
[ https://issues.apache.org/jira/browse/HIVE-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625238#comment-13625238 ]

caofangkun commented on HIVE-4307:

[~navis] Well, this is indeed fixed in trunk.

For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
Key: HIVE-4307
URL: https://issues.apache.org/jira/browse/HIVE-4307
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4307-1.patch

For example:

    create table srcpart (key string, value string) partitioned by (ds string, hr string);
    hive> explain select * from srcpart where 'test'='test2';
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For views there is a potential danger too. For example:

    create view v_test_table ( id, key, type, dt) as select id, key ,type dt from ( select id ,key , 'apple' as type , dt from src_test_table where dt='20130326′ ) t;

The following three query statements work well:

    explain select * from v_test_table where id='123' ;
    explain select * from v_test_table where key='123' ;
    explain select * from v_test_table where dt='20130326' ;

But the following query fails:

    explain select * from v_test_table where type='orange' ;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))
[jira] [Created] (HIVE-4307) For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
caofangkun created HIVE-4307:

Summary: For partitioned table , if where statement is 'const string equals const string', the query will throw MismatchedTokenException
Key: HIVE-4307
URL: https://issues.apache.org/jira/browse/HIVE-4307
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor

For example:

    create table srcpart (key string, value string) partitioned by (ds string, hr string);
    hive> explain select * from srcpart where 'test'='test2′;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))

For views there is a potential danger too. For example:

    create view v_test_table ( id, key, type, dt) as select id, key ,type dt from ( select id ,key , 'apple' as type , dt from src_test_table where dt='20130326′ ) t;

The following three query statements work well:

    explain select * from v_test_table where id='123′ ;
    explain select * from v_test_table where key='123′ ;
    explain select * from v_test_table where dt='20130326′ ;

But the following query fails:

    explain select * from v_test_table where type='orange' ;
    BR.recoverFromMismatchedToken
    FAILED: Error in semantic analysis: MetaException(message:Error parsing partition filter : MismatchedTokenException(9!=8))
[jira] [Resolved] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun resolved HIVE-4032.

Resolution: Won't Fix

Not a bug.

Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4032-1.patch

Inserting data into a Hive table from a query of the form "select * from a_partitioned_table" throws a SemanticException. It seems that * contains the virtual partition columns.

    drop table if exists zr_test;
    create table if not exists zr_test (key string, value string) partitioned by (dt string);
    drop table if exists zr_test_1;
    create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
    --Query One
    explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
    --Query Two
    explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';

Query One works well, but Query Two fails with the following information:

    FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.

P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0
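The column-count mismatch reported in HIVE-4032 can be reproduced with a few lines of Python. This is a simplified model rather than Hive's semantic analyzer: both function names are hypothetical, and it assumes that `select *` on a partitioned table returns the data columns followed by the partition columns, while an insert with a fully specified partition (`dt='20130217'`) expects only the data columns.

```python
def select_star_columns(data_columns, partition_columns):
    """Columns produced by `select *`: data columns, then partition columns."""
    return list(data_columns) + list(partition_columns)


def check_static_partition_insert(target_data_columns, query_columns):
    """Raise ValueError when the column counts differ, as in Error 10044."""
    if len(target_data_columns) != len(query_columns):
        raise ValueError(
            f"Table has {len(target_data_columns)} columns, "
            f"but query has {len(query_columns)} columns"
        )


if __name__ == "__main__":
    data, parts = ["key", "value"], ["dt"]
    # Query One: explicit column list, the counts match.
    check_static_partition_insert(data, ["key", "value"])
    # Query Two: select * also brings along the partition column dt.
    try:
        check_static_partition_insert(data, select_star_columns(data, parts))
    except ValueError as err:
        print(err)
```

Under these assumptions Query Two fails simply because `dt` is counted twice: once in the partition clause and once in the `select *` output.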
[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4032:

Attachment: HIVE-4032-1.patch

https://reviews.apache.org/r/10234/

Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4032-1.patch

Inserting data into a Hive table from a query of the form "select * from a_partitioned_table" throws a SemanticException. It seems that * contains the virtual partition columns.

    drop table if exists zr_test;
    create table if not exists zr_test (key string, value string) partitioned by (dt string);
    drop table if exists zr_test_1;
    create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
    --Query One
    explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
    --Query Two
    explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';

Query One works well, but Query Two fails with the following information:

    FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.

P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0
[jira] [Commented] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612464#comment-13612464 ]

caofangkun commented on HIVE-3682:
https://reviews.apache.org/r/10115/

when output hive table to file,users should could have a separator of their own choice
Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux; java version 1.6.0_25; hadoop-0.20.2-cdh3u0; hive-0.8.1
Reporter: caofangkun
Assignee: Gang Tim Liu
Priority: Minor
Attachments: HIVE-3682-1.patch

By default, when a Hive table is written to a file, its columns are separated by the ^A character (that is, \001). Users should be able to set a separator of their own choice.
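The request above can be illustrated with a minimal Java sketch that joins the columns of one row with a caller-chosen separator and falls back to ^A (\001) when none is given. The class and method names are hypothetical; this is not the code in HIVE-3682-1.patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not taken from HIVE-3682-1.patch.
final class RowFormatter {
    // Default field delimiter described in the issue: ^A, i.e. \001.
    static final String DEFAULT_SEPARATOR = "\u0001";

    // Joins the column values of one row; a null separator selects the default.
    static String formatRow(List<String> columns, String separator) {
        String sep = (separator == null) ? DEFAULT_SEPARATOR : separator;
        return String.join(sep, columns);
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a1", "b1");
        System.out.println(formatRow(row, ","));
        System.out.println(formatRow(row, "\t"));
    }
}
```

Keeping the default in one constant means existing consumers that split on \001 are unaffected unless a separator is explicitly supplied.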
[jira] [Updated] (HIVE-4230) when export data to local files ,users could decide whether the output contains corresponding column names
[ https://issues.apache.org/jira/browse/HIVE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4230:
Priority: Minor (was: Major)

when export data to local files ,users could decide whether the output contains corresponding column names
Key: HIVE-4230
URL: https://issues.apache.org/jira/browse/HIVE-4230
Project: Hive
Issue Type: New Feature
Reporter: caofangkun
Priority: Minor

Example:
hive> desc src;
key string
value string
hive> insert overwrite local directory './src_table' select * from src;
hive> !cat ./src_table/00_0;
a1 b1
a2 b2
a3 b3

When the user sets a property's value to true, the output should be:
hive> !cat ./src_table/00_0;
key value --- the first line holds the corresponding column names
a1 b1
a2 b2
a3 b3
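A rough Java sketch of the behavior requested above: a boolean switch decides whether a line of column names precedes the data rows. The flag, class, and method names are assumptions made for illustration; the issue does not name the property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed switch; the names here are assumptions.
final class ExportWithHeader {
    // Returns the output lines, optionally preceded by a line of column names.
    static List<String> render(List<String> columnNames, List<List<String>> rows,
                               boolean printHeader, String separator) {
        List<String> lines = new ArrayList<>();
        if (printHeader) {
            lines.add(String.join(separator, columnNames));
        }
        for (List<String> row : rows) {
            lines.add(String.join(separator, row));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("a1", "b1"),
                Arrays.asList("a2", "b2"),
                Arrays.asList("a3", "b3"));
        for (String line : render(Arrays.asList("key", "value"), rows, true, " ")) {
            System.out.println(line);
        }
    }
}
```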
[jira] [Updated] (HIVE-3914) use Chinese in hive column comment and table comment
[ https://issues.apache.org/jira/browse/HIVE-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3914:
Attachment: HIVE-3914-2.patch
for hive 0.9.0

use Chinese in hive column comment and table comment
Key: HIVE-3914
URL: https://issues.apache.org/jira/browse/HIVE-3914
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-3914-1.patch, HIVE-3914-2.patch

Chinese text is used in Hive column comments and table comments. The metadata in MySQL is stored correctly: the charset of the 'COMMENT' column in the 'columns_v2' table and of the 'PARAM_VALUE' column in the 'table_params' table is 'utf8' in both cases. When I run 'select * from columns_v2' with the mysql client, the Chinese comments display normally. But when I run 'describe table' with the Hive CLI, the Chinese characters are garbled.
[jira] [Created] (HIVE-4210) output of show create table shoud contain a Semicolon at the end of the query string
caofangkun created HIVE-4210:
Summary: output of show create table shoud contain a Semicolon at the end of the query string
Key: HIVE-4210
URL: https://issues.apache.org/jira/browse/HIVE-4210
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor

Before:
hive (default)> SHOW CREATE TABLE v1 ;
[2013-03-21 11:31:57]: Compile Start
[2013-03-21 11:31:59]: Compile End
[2013-03-21 11:31:59]: OK
CREATE VIEW v1 AS SELECT `src`.`key`, `src`.`value` from `default`.`src`
[2013-03-21 11:31:59]: Time taken: 2.528 seconds

After Fix:
hive (default)> SHOW CREATE TABLE v1;
[2013-03-21 13:48:31]: Compile Start
[2013-03-21 13:48:34]: Compile End
[2013-03-21 13:48:34]: OK
CREATE VIEW v1 AS SELECT `src`.`key`, `src`.`value` from `default`.`src`;
[2013-03-21 13:48:34]: Time taken: 2.462 seconds
[jira] [Updated] (HIVE-4210) output of show create table shoud contain a Semicolon at the end of the query string
[ https://issues.apache.org/jira/browse/HIVE-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4210:
Attachment: HIVE-4210-1.patch

output of show create table shoud contain a Semicolon at the end of the query string
Key: HIVE-4210
URL: https://issues.apache.org/jira/browse/HIVE-4210
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4210-1.patch

Before:
hive (default)> SHOW CREATE TABLE v1 ;
[2013-03-21 11:31:57]: Compile Start
[2013-03-21 11:31:59]: Compile End
[2013-03-21 11:31:59]: OK
CREATE VIEW v1 AS SELECT `src`.`key`, `src`.`value` from `default`.`src`
[2013-03-21 11:31:59]: Time taken: 2.528 seconds

After Fix:
hive (default)> SHOW CREATE TABLE v1;
[2013-03-21 13:48:31]: Compile Start
[2013-03-21 13:48:34]: Compile End
[2013-03-21 13:48:34]: OK
CREATE VIEW v1 AS SELECT `src`.`key`, `src`.`value` from `default`.`src`;
[2013-03-21 13:48:34]: Time taken: 2.462 seconds
[jira] [Updated] (HIVE-4135) When using PostgreSQL as stats database ,there's a type cast bug
[ https://issues.apache.org/jira/browse/HIVE-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4135:
Attachment: HIVE-4135-1.patch
A quick fix

When using PostgreSQL as stats database ,there's a type cast bug
Key: HIVE-4135
URL: https://issues.apache.org/jira/browse/HIVE-4135
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4135-1.patch

When using PostgreSQL as the stats database, there is a type cast bug.

TaskTracker log:
2013-03-01 16:03:08,973 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 17598 rows
2013-03-01 16:03:09,040 INFO org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Stats publishing for key dim_pub_date/00
2013-03-01 16:03:09,045 ERROR org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during publishing statistics.
org.postgresql.util.PSQLException: ERROR: column row_count is of type bigint but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 126
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2101)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1834)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:510)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:332)
at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:136)
at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:133)
at org.apache.hadoop.hive.ql.exec.Utilities.executeWithRetry(Utilities.java:2093)
at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.publishStat(JDBCStatsPublisher.java:149)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:260)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:198)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:373)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child.main(Child.java:167)
2013-03-01 16:03:09,046 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: publishing : dim_pub_date/00 : {numRows=17598, rawDataSize=2626518}
[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4032:
Description:
Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.

was:
Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.

Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor

Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.
[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4032:
Description:
Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.
P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0.

was:
Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.

Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0

Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0
Reporter: caofangkun
Priority: Minor

Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.
drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';
Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.
P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0.
[jira] [Commented] (HIVE-3999) Mysql metastore upgrade script will end up with different schema than the full schema load
[ https://issues.apache.org/jira/browse/HIVE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580452#comment-13580452 ]

caofangkun commented on HIVE-3999:
We had better set a default value for the column IS_STOREDASSUBDIRECTORIES:
ALTER TABLE `SDS` ADD `IS_STOREDASSUBDIRECTORIES` BIT(1) NOT NULL;
ALTER TABLE `SDS` ALTER `IS_STOREDASSUBDIRECTORIES` SET DEFAULT 0;
Otherwise a query may throw:
Failed with exception javax.jdo.JDODataStoreException: Insert of object org.apache.hadoop.hive.metastore.model.MStorageDescriptor@c3c44 using statement INSERT INTO `SDS` (`SD_ID`,`OUTPUT_FORMAT`,`CD_ID`,`NUM_BUCKETS`,`INPUT_FORMAT`,`SERDE_ID`,`IS_COMPRESSED`,`LOCATION`) VALUES (?,?,?,?,?,?,?,?) failed : Field 'IS_STOREDASSUBDIRECTORIES' doesn't have a default value
NestedThrowables:
java.sql.SQLException: Field 'IS_STOREDASSUBDIRECTORIES' doesn't have a default value

Mysql metastore upgrade script will end up with different schema than the full schema load
Key: HIVE-3999
URL: https://issues.apache.org/jira/browse/HIVE-3999
Project: Hive
Issue Type: Bug
Components: Metastore
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 0.11.0
Attachments: mysql_upgrade_issue.patch

I've noticed that the file {{hive-schema-0.10.0.mysql.sql}} is creating table SDS with following column:
{code}
`IS_STOREDASSUBDIRECTORIES` bit(1) NOT NULL,
{code}
However the upgrade script {{011-HIVE-3649.mysql.sql}} will create the column differently:
{code}
ALTER TABLE `SDS` ADD `IS_STOREDASSUBDIRECTORIES` bit(1) ;
{code}
Thus user will get slightly different schema each time - once with NOT NULL and secondly with NULL definition.
[jira] [Commented] (HIVE-3999) Mysql metastore upgrade script will end up with different schema than the full schema load
[ https://issues.apache.org/jira/browse/HIVE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580469#comment-13580469 ]

caofangkun commented on HIVE-3999:
Hi Jarek Jarcec Cecho, I agree with you now. Your solution is the most suitable way.

Mysql metastore upgrade script will end up with different schema than the full schema load
Key: HIVE-3999
URL: https://issues.apache.org/jira/browse/HIVE-3999
Project: Hive
Issue Type: Bug
Components: Metastore
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 0.11.0
Attachments: mysql_upgrade_issue.patch

I've noticed that the file {{hive-schema-0.10.0.mysql.sql}} is creating table SDS with following column:
{code}
`IS_STOREDASSUBDIRECTORIES` bit(1) NOT NULL,
{code}
However the upgrade script {{011-HIVE-3649.mysql.sql}} will create the column differently:
{code}
ALTER TABLE `SDS` ADD `IS_STOREDASSUBDIRECTORIES` bit(1) ;
{code}
Thus user will get slightly different schema each time - once with NOT NULL and secondly with NULL definition.
[jira] [Created] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
caofangkun created HIVE-4032:
Summary: Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
Key: HIVE-4032
URL: https://issues.apache.org/jira/browse/HIVE-4032
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor

Inserting data into a Hive table from a query of the form select * from a_partitioned_table throws a SemanticException. It seems that * also includes the virtual partition columns.

drop table if exists zr_test;
create table if not exists zr_test (key string, value string) partitioned by (dt string);
drop table if exists zr_test_1;
create table if not exists zr_test_1 (key string, value string) partitioned by (dt string);
--Query One
explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217';
--Query Two
explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217';

Query One works well, but Query Two fails with the following information:
FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns.
[jira] [Created] (HIVE-3961) provide new method for get all sorted tables by table create time
caofangkun created HIVE-3961:
Summary: provide new method for get all sorted tables by table create time
Key: HIVE-3961
URL: https://issues.apache.org/jira/browse/HIVE-3961
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

/**
 * @param dbName
 * @param sorted if true then sort by createTime ASC else sort by createTime DESC
 * @return
 * @throws HiveException
 */
public List<String> getAllTables(String dbName, boolean sorted) throws HiveException {
  return getTablesByPattern(dbName, ".*", sorted);
}
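The ordering described in the Javadoc above (ascending by createTime when the flag is true, descending otherwise) might look like the following standalone sketch. TableInfo is a hypothetical stand-in for table metadata, and the integer createTime unit is an assumption; nothing here is taken from Hive's own classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Standalone illustration; does not use Hive's metastore API.
final class TableSorter {
    // Hypothetical stand-in for table metadata.
    static final class TableInfo {
        final String name;
        final int createTime; // assumed to be seconds since the epoch

        TableInfo(String name, int createTime) {
            this.name = name;
            this.createTime = createTime;
        }
    }

    // Returns table names ordered by createTime: ascending if true, descending otherwise.
    static List<String> namesByCreateTime(List<TableInfo> tables, boolean ascending) {
        Comparator<TableInfo> byTime = Comparator.comparingInt(t -> t.createTime);
        List<TableInfo> sorted = new ArrayList<>(tables); // leave the input untouched
        sorted.sort(ascending ? byTime : byTime.reversed());
        List<String> names = new ArrayList<>();
        for (TableInfo t : sorted) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<TableInfo> tables = Arrays.asList(
                new TableInfo("b", 20), new TableInfo("a", 10), new TableInfo("c", 30));
        System.out.println(namesByCreateTime(tables, true));
        System.out.println(namesByCreateTime(tables, false));
    }
}
```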
[jira] [Updated] (HIVE-3941) Implement The Schema Search Path Feature
[ https://issues.apache.org/jira/browse/HIVE-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3941:
Description:
The Schema Search Path
http://www.postgresql.org/docs/current/static/ddl-schemas.html
hive(myschema)> SET search_path TO myschema,default; -- set Schema Search Path
hive(myschema)> SHOW search_path;
myschema
default
hive(default)> show search_path in myschema;
myschema
default
hive(default)> show tables;
de_src;
hive(myschema)> show tables; -- in myschema database there is no table named de_src
src;
src1;
hive(myschema)> select * from de_src; -- this query is equivalent to: select * from default.de_src
The default search_path is: current_db,default
SET search_path TO ... is a session-level command; when the session ends, the search path falls back to current_db,default.

was:
The Schema Search Path
http://www.postgresql.org/docs/current/static/ddl-schemas.html
hive(myschema)> SET search_path TO myschema,default; -- set Schema Search Path
hive(myschema)> SHOW search_path;
myschema
default
hive(default)> show tables;
de_src;
hive(myschema)> show tables; -- in myschema database there is no table named de_src
src;
src1;
hive> select * from de_src; -- this query is equivalent to: select * from default.de_src

Implement The Schema Search Path Feature
Key: HIVE-3941
URL: https://issues.apache.org/jira/browse/HIVE-3941
Project: Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

The Schema Search Path
http://www.postgresql.org/docs/current/static/ddl-schemas.html
hive(myschema)> SET search_path TO myschema,default; -- set Schema Search Path
hive(myschema)> SHOW search_path;
myschema
default
hive(default)> show search_path in myschema;
myschema
default
hive(default)> show tables;
de_src;
hive(myschema)> show tables; -- in myschema database there is no table named de_src
src;
src1;
hive(myschema)> select * from de_src; -- this query is equivalent to: select * from default.de_src
The default search_path is: current_db,default
SET search_path TO ... is a session-level command; when the session ends, the search path falls back to current_db,default.
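The lookup rule proposed above, where an unqualified table name is resolved against each database on the search path in order, can be sketched in a few lines of Java. The class, the method, and the in-memory catalog are hypothetical and only mirror the example in the description; they are not part of Hive.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical resolver illustrating the proposed search_path semantics.
final class SearchPathResolver {
    // Returns "db.table" for the first database on the path that contains the table,
    // or null when no database on the path has it.
    static String resolve(String table, List<String> searchPath,
                          Map<String, Set<String>> catalog) {
        for (String db : searchPath) {
            Set<String> tables = catalog.get(db);
            if (tables != null && tables.contains(table)) {
                return db + "." + table;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> catalog = new HashMap<>();
        catalog.put("default", new HashSet<>(Arrays.asList("de_src")));
        catalog.put("myschema", new HashSet<>(Arrays.asList("src", "src1")));
        List<String> path = Arrays.asList("myschema", "default");
        // de_src is absent from myschema, so the lookup falls through to default.
        System.out.println(resolve("de_src", path, catalog));
    }
}
```

Because the path is walked in order, a table that exists in both databases would resolve to the one listed first, which is how the PostgreSQL feature referenced in the issue behaves.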
[jira] [Commented] (HIVE-3945) union all datatype do not match may result wrong result
[ https://issues.apache.org/jira/browse/HIVE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565054#comment-13565054 ]

caofangkun commented on HIVE-3945:
Hi Frankline Jose S,
the problem is that the query before the UNION ALL clause looks like this:
select 'username' as key, 'age' as value from ...
Note that the column values are fixed strings. Example:
select key, value FROM (
select 'USERNAME' as key, 'AGE' as value -- Hi Frankline Jose S, take a look at this
from src s1 limit 1
UNION ALL
select s2.key as key, sum(s2.value) as value -- datatype: string, double
from src s2 group by s2.key
) unionsrc;

union all datatype do not match may result wrong result
Key: HIVE-3945
URL: https://issues.apache.org/jira/browse/HIVE-3945
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

hive (default)> desc src;
key string
value string
select key, value FROM (
select 'key' as key, 'value' as value -- datatype: string, string
from src s1 limit 1
UNION ALL
select s2.key as key, sum(s2.value) as value -- datatype: string, double
from src s2 group by s2.key
) unionsrc;

This query executes normally but returns a wrong result:
key 2.4081029415476845E-282 -- expected is 'value'
35.0 100 100.0 480.0

Sometimes, when the string title is too long, it may also cause an ArrayIndexOutOfBoundsException:
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:205)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:216)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:197)
at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:125)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
[jira] [Created] (HIVE-3945) union all datatype do not match may result wrong result
caofangkun created HIVE-3945:
Summary: union all datatype do not match may result wrong result
Key: HIVE-3945
URL: https://issues.apache.org/jira/browse/HIVE-3945
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

hive (default)> desc src;
key string
value string
select key, value FROM (
select 'key' as key, 'value' as value -- datatype: string, string
from src s1 limit 1
UNION ALL
select s2.key as key, sum(s2.value) as value -- datatype: string, double
from src s2 group by s2.key
) unionsrc;

This query executes normally but returns a wrong result:
key 2.4081029415476845E-282 -- expected is 'value'
35.0 100 100.0 48 0.0

Sometimes, when the string title is too long, it may also cause an ArrayIndexOutOfBoundsException:
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:205)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:216)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:197)
at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:61)
at org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:125)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
[jira] [Created] (HIVE-3941) Implement The Schema Search Path Feature
caofangkun created HIVE-3941:
Summary: Implement The Schema Search Path Feature
Key: HIVE-3941
URL: https://issues.apache.org/jira/browse/HIVE-3941
Project: Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

The Schema Search Path
http://www.postgresql.org/docs/current/static/ddl-schemas.html
hive(myschema)> SET search_path TO myschema,default; -- set Schema Search Path
hive(myschema)> SHOW search_path;
myschema
default
hive(default)> show tables;
de_src;
hive(myschema)> show tables; -- in myschema database there is no table named de_src
src;
src1;
hive> select * from de_src; -- this query is equivalent to: select * from default.de_src
[jira] [Created] (HIVE-3942) Add UDF month_add and month_sub
caofangkun created HIVE-3942:
Summary: Add UDF month_add and month_sub
Key: HIVE-3942
URL: https://issues.apache.org/jira/browse/HIVE-3942
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-3942-1.patch

hive (default)> desc function extended month_add;
month_add(start_date, num_months) - Returns the date that is num_months after start_date.
Synonyms: month_sub
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_add('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-05-12
SELECT month_add(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-05-12

hive (default)> desc function extended month_sub;
month_sub(start_date, num_months) - Returns the date that is num_months before start_date.
Synonyms: month_add
start_date is a string in the format 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd'. num_months is a number. The time part of start_date is ignored.
Example:
SELECT month_sub('2012-04-12', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub('2012-04-12 11:22:31', 1) FROM src LIMIT 1; --Return 2012-03-12
SELECT month_sub(cast('2012-04-12 11:22:31' as timestamp), 1) FROM src LIMIT 1; --Return 2012-03-12
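For reference, the month arithmetic described above can be reproduced with java.time. This is only a sketch under two assumptions: month_sub is meant to move backward in time, and the input always starts with a yyyy-MM-dd date. The class and method names are hypothetical, and the code in HIVE-3942-1.patch may handle edge cases differently.

```java
import java.time.LocalDate;

// Illustrative only; not the UDF implementation from HIVE-3942-1.patch.
final class MonthShift {
    // Parses the leading yyyy-MM-dd part; any time part after it is ignored.
    static LocalDate parseDate(String startDate) {
        return LocalDate.parse(startDate.substring(0, 10));
    }

    // Date that is numMonths after startDate, formatted as yyyy-MM-dd.
    static String monthAdd(String startDate, int numMonths) {
        return parseDate(startDate).plusMonths(numMonths).toString();
    }

    // Date that is numMonths before startDate, formatted as yyyy-MM-dd.
    static String monthSub(String startDate, int numMonths) {
        return parseDate(startDate).minusMonths(numMonths).toString();
    }

    public static void main(String[] args) {
        System.out.println(monthAdd("2012-04-12", 1));
        System.out.println(monthAdd("2012-04-12 11:22:31", 1));
        System.out.println(monthSub("2012-04-12", 1));
    }
}
```

Note that LocalDate clamps to the last valid day of the target month (for example, one month after 2012-01-31 is 2012-02-29); whether the proposed UDFs do the same is not stated in the issue.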
[jira] [Updated] (HIVE-3942) Add UDF month_add and month_sub
[ https://issues.apache.org/jira/browse/HIVE-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3942:

Attachment: HIVE-3942-1.patch

Summary: Add UDF month_add and month_sub
Key: HIVE-3942
URL: https://issues.apache.org/jira/browse/HIVE-3942
[jira] [Updated] (HIVE-3914) use Chinese in hive column comment and table comment
[ https://issues.apache.org/jira/browse/HIVE-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3914:

Attachment: HIVE-3914-1.patch

Use "outStream.writeUTF" instead of "outStream.writeBytes".

Summary: use Chinese in hive column comment and table comment
Key: HIVE-3914
URL: https://issues.apache.org/jira/browse/HIVE-3914
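Why writeBytes garbles Chinese text can be shown with a small Python emulation of the two java.io.DataOutputStream methods. writeBytes keeps only the low eight bits of each character, while writeUTF writes a two-byte length followed by (modified) UTF-8. The helper names below are hypothetical, and the emulation assumes characters in the Basic Multilingual Plane other than U+0000, where modified UTF-8 and standard UTF-8 produce the same bytes.

```python
import struct


def write_bytes(s: str) -> bytes:
    """Emulate DataOutputStream.writeBytes: the high eight bits of each char are discarded."""
    return bytes(ord(ch) & 0xFF for ch in s)


def write_utf(s: str) -> bytes:
    """Emulate DataOutputStream.writeUTF for BMP text: 2-byte big-endian length, then UTF-8."""
    payload = s.encode("utf-8")
    return struct.pack(">H", len(payload)) + payload


comment = "中文"  # U+4E2D U+6587

lossy = write_bytes(comment)
print(lossy.hex())  # 2d87: one byte per character, the original text cannot be recovered

framed = write_utf(comment)
print(framed.hex())                 # 0006e4b8ade69687
print(framed[2:].decode("utf-8"))   # 中文
```

Note that the writeUTF output starts with a length prefix, so whatever reads the stream has to expect it; writing the plain UTF-8 bytes of the string would be another way to avoid the truncation.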
[jira] [Created] (HIVE-3914) use Chinese in hive column comment and table comment
caofangkun created HIVE-3914:

Summary: use Chinese in hive column comment and table comment
Key: HIVE-3914
URL: https://issues.apache.org/jira/browse/HIVE-3914
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-3914-1.patch

I use Chinese in Hive column comments and table comments. The metadata in MySQL is stored correctly: the charset of the 'COMMENT' column in the 'columns_v2' table and of the 'PARAM_VALUE' column in the 'table_params' table is 'utf8' in both cases.
When I run 'select * from columns_v2' with the MySQL client, the Chinese comments display normally.
But when I run 'describe table' with the Hive CLI, the Chinese characters are garbled.
[jira] [Created] (HIVE-3905) Add the option --version in hive cli to show the version info of hive client
caofangkun created HIVE-3905:

Summary: Add the option --version in hive cli to show the version info of hive client
Key: HIVE-3905
URL: https://issues.apache.org/jira/browse/HIVE-3905
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.9.0, 0.8.1, 0.10.0
Reporter: caofangkun
Priority: Minor

Hadoop provides a way to check the version of the Hadoop client:
$HADOOP_HOME/bin/hadoop version
See: https://issues.apache.org/jira/browse/HADOOP-567
Hive should provide such a feature too.
[jira] [Updated] (HIVE-3905) Add the option --version in hive cli to show the version info of hive client
[ https://issues.apache.org/jira/browse/HIVE-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3905:

Attachment: HIVE-3905-1.patch

[zongren@hd17-vm1 ~]$ hive --version
--- Hadoop Version ---
Hadoop 0.20.1
Subversion -r
Compiled by zongren on Thu Dec 27 17:40:23 CST 2012
--- Hive Version ---
Hive 0.8.1
Subversion -r
Compiled by zongren on Wed Jan 16 09:32:51 CST 2013

Summary: Add the option --version in hive cli to show the version info of hive client
Key: HIVE-3905
URL: https://issues.apache.org/jira/browse/HIVE-3905
[jira] [Updated] (HIVE-3883) user defined partition info does not match the actual partition info defined in the table schema
[ https://issues.apache.org/jira/browse/HIVE-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-3883:

Summary: user defined partition info does not match the actual partition info defined in the table schema (was: user defined parttion info does not match the actual partition info defined in the table schema )

Key: HIVE-3883
URL: https://issues.apache.org/jira/browse/HIVE-3883
[jira] [Created] (HIVE-3883) user defined parttion info does not match the actual partition info defined in the table schema
caofangkun created HIVE-3883:

Summary: user defined parttion info does not match the actual partition info defined in the table schema
Key: HIVE-3883
URL: https://issues.apache.org/jira/browse/HIVE-3883
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

create table src_part (key string, value string) partitioned by (dt string, hr string);
load data local inpath './in1.txt' overwrite into table src_part partition (dt='20121230',hr='11');
load data local inpath './in1.txt' overwrite into table src_part partition (hr='12', dt='20121230');  -- partition columns given in the wrong order (hr before dt)

hive (default)> select * from src_part where dt='20121230' and hr='12' limit 1;
no result.
hive (default)> desc formatted src_part partition (dt='20121230', hr='12') ;
hive (default)> dfs -ls /user/zongren/hive/src_part/hr=12/dt=20121230/;
Found 1 items
-rw-r--r-- 3 zongren supergroup 16 2012-12-04 14:18 /user/zongren/hive/src_part/hr=12/dt=20121230/in1.t  -- the data is here

Should we provide a strict check for this?
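One way to picture the check asked for above is the following Python sketch. It is not Hive code: `partition_dir` is a hypothetical helper that takes the partition columns in schema order and the user's spec in the order it was typed. It either normalizes the spec to schema order or, in strict mode, rejects a spec whose order differs.

```python
from typing import List, Sequence, Tuple


def partition_dir(partition_cols: Sequence[str],
                  user_spec: List[Tuple[str, str]],
                  strict: bool = False) -> str:
    """Build the partition directory in schema order from a user-supplied spec."""
    given = [name for name, _ in user_spec]
    if sorted(given) != sorted(partition_cols):
        raise ValueError(f"spec columns {given} do not match schema {list(partition_cols)}")
    if strict and given != list(partition_cols):
        raise ValueError(f"spec order {given} differs from schema order {list(partition_cols)}")
    values = dict(user_spec)
    # Always emit directories in the order declared by PARTITIONED BY.
    return "/".join(f"{col}={values[col]}" for col in partition_cols)


schema = ["dt", "hr"]
print(partition_dir(schema, [("dt", "20121230"), ("hr", "11")]))  # dt=20121230/hr=11
print(partition_dir(schema, [("hr", "12"), ("dt", "20121230")]))  # dt=20121230/hr=12
```

In the non-strict mode the second load in the report would land in dt=20121230/hr=12, where the later SELECT looks for it; in strict mode it would be rejected before any data is moved.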
[jira] [Created] (HIVE-3867) Whether we need to check the storage format defined in the table schema be the same as the actual storage file format
caofangkun created HIVE-3867:

Summary: Whether we need to check the storage format defined in the table schema be the same as the actual storage file format
Key: HIVE-3867
URL: https://issues.apache.org/jira/browse/HIVE-3867
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.9.0
Reporter: caofangkun
Priority: Minor

For example:
select count(*) from src_test;
The storage format defined in the table schema is SimpleText, but the actual storage file format is SequenceFile.
This HiveQL runs without error, but it processes the entire sequence file as a single line and as a single column.
Should we add a check for this?
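A check of the kind asked about above could start from the file header: Hadoop SequenceFiles begin with the three bytes "SEQ" followed by a version byte. The Python sketch below is a heuristic illustration only; the function names and the format labels "TEXTFILE" and "SEQUENCEFILE" are assumptions made for the example, and a text file that happens to start with "SEQ" would be misclassified.

```python
SEQUENCE_FILE_MAGIC = b"SEQ"  # first three bytes of a Hadoop SequenceFile


def detect_format(header: bytes) -> str:
    """Guess the file format from its first bytes (heuristic)."""
    if len(header) >= 4 and header[:3] == SEQUENCE_FILE_MAGIC:
        return "SEQUENCEFILE"
    return "UNKNOWN"


def check_declared_format(declared: str, header: bytes) -> None:
    """Raise if the declared table format clearly contradicts the file header."""
    actual = detect_format(header)
    declared = declared.upper()
    if declared == "TEXTFILE" and actual == "SEQUENCEFILE":
        raise ValueError("table is declared as text but the file is a SequenceFile")
    if declared == "SEQUENCEFILE" and actual != "SEQUENCEFILE":
        raise ValueError("table is declared as SequenceFile but the file has no SEQ header")


# A text-looking header passes for a text table; a SEQ header does not.
check_declared_format("TEXTFILE", b"48\x01val_48\n")
try:
    check_declared_format("TEXTFILE", b"SEQ\x06org.apache.hadoop.io.Text")
except ValueError as err:
    print(err)
```

Reading only the first few bytes of one file per table or partition keeps such a check cheap, at the cost of not catching every mismatch.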
[jira] [Created] (HIVE-3683) pdk and builtins, run ant test will failed ,since missing junit*.jar in trunk/testlibs/
caofangkun created HIVE-3683:

Summary: pdk and builtins, run ant test will failed ,since missing junit*.jar in trunk/testlibs/
Key: HIVE-3683
URL: https://issues.apache.org/jira/browse/HIVE-3683
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.9.0
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
  java version 1.6.0_25
  hadoop-0.20.2-cdh3u0
  hive-0.9.0
Reporter: caofangkun
Priority: Minor

Running
~ hive-0.9.0/builtins$ ant test
and
~ hive-0.9.0/pdk$ ant test
both fail with:

BUILD FAILED
/builtins/build.xml:45: The following error occurred while executing this line:
.../pdk/scripts/build-plugin.xml:122: The classpath for junit must include junit.jar if not in Ant's own classpath

Solution: add junit-4.10.jar to trunk/testlibs/
[jira] [Created] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
caofangkun created HIVE-3682:

Summary: when output hive table to file,users should could have a separator of their own choice
Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
  java version 1.6.0_25
  hadoop-0.20.2-cdh3u0
  hive-0.8.1
Reporter: caofangkun
Priority: Minor

By default, when a Hive table is written out to a file, its columns are separated by the ^A character (that is, \001).
Users should be able to set a separator of their own choice instead.
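Until such an option exists, the effect can be approximated by post-processing the exported file. The Python sketch below is a hypothetical helper, not a Hive feature; it assumes one record per line and refuses to convert a line whose field values already contain the new separator, since the result would be ambiguous.

```python
def convert_separator(line: str, new_sep: str, old_sep: str = "\x01") -> str:
    """Replace Hive's default ^A (\\001) column separator with new_sep."""
    fields = line.rstrip("\n").split(old_sep)
    for field in fields:
        if new_sep in field:
            raise ValueError(f"field {field!r} already contains the separator {new_sep!r}")
    return new_sep.join(fields)


print(convert_separator("48\x01val_48\n", ","))    # 48,val_48
print(convert_separator("100\x01val_100\n", "\t"))
```

The ambiguity check is the main reason a built-in option would be preferable: Hive knows the column boundaries, while a post-processing step can only guess at them.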
[jira] [Created] (HIVE-3417) mulit inserts when the from statement is a subquery,this is a bug
caofangkun created HIVE-3417:

Summary: mulit inserts when the from statement is a subquery,this is a bug
Key: HIVE-3417
URL: https://issues.apache.org/jira/browse/HIVE-3417
Project: Hive
Issue Type: Bug
Components: Query Processor, SQL
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
  java version 1.6.0_25
  hadoop-0.20.2-cdh3u0
  hive-0.8.1
Reporter: caofangkun

vi mulit-insert.sql

create table src (key string, value string);
load data local inpath './in1.txt' overwrite into table src;
drop table if exists test1;
drop table if exists test2;
create table test1 (key string, value string) partitioned by (dt string);
create table test2 (key string, value string) partitioned by (dt string);
select * from src;
from (select * from src where key is not null )  --there is a bug here
insert overwrite table test1 PARTITION (dt='1') select key ,value where key='48'
insert overwrite table test2 PARTITION (dt='2') select key, value where key='100';
select * from test1;
select * from test2;

test1 and test2 should each contain a single row, but they do not.
Workaround: after
set hive.ppd.remove.duplicatefilters=false;
the bug does not occur.
[jira] [Created] (HIVE-3298) select 1+1 from dual; if dual is an empty table this statement will return no result
caofangkun created HIVE-3298:

Summary: select 1+1 from dual; if dual is an empty table this statement will return no result
Key: HIVE-3298
URL: https://issues.apache.org/jira/browse/HIVE-3298
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
  java version 1.6.0_25
  hadoop-0.20.2-cdh3u0
  hive-0.8.1
Reporter: caofangkun
Priority: Minor

hive> drop table if exists dual;
hive> create table dual (dummy string);
hive> select 1+1 from dual;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201206081154_0458, Tracking URL = http://dwtest-93-61:50030/jobdetails.jsp?jobid=job_201206081154_0458
Kill Command = /home/hive/hadoop-0.20.2-cdh3u0/bin/hadoop job -Dmapred.job.tracker=dwtest-93-61:9001 -kill job_201206081154_0458
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2012-07-25 16:58:15,793 Stage-1 map = 0%, reduce = 0%
2012-07-25 16:58:17,817 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201206081154_0458
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 SUCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 6.607 seconds
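The log above shows a job with zero mappers, which fits the usual row-at-a-time reading of SELECT: the expression is evaluated once per input row, so an empty table produces no output rows. A minimal Python model of that reading (the helper `select_expr` is hypothetical and only illustrates the semantics, not Hive's execution):

```python
from typing import Callable, List, Sequence, Tuple


def select_expr(rows: Sequence[Tuple], expr: Callable[[Tuple], int]) -> List[int]:
    """Evaluate a projection expression once per input row."""
    return [expr(row) for row in rows]


empty_dual: List[Tuple] = []          # dual as created in the report: no rows
one_row_dual: List[Tuple] = [("X",)]  # dual holding a single dummy row

print(select_expr(empty_dual, lambda row: 1 + 1))    # []
print(select_expr(one_row_dual, lambda row: 1 + 1))  # [2]
```

Under this model, loading a single row into dual is enough for `select 1+1 from dual` to return one result.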
[jira] [Created] (HIVE-3064) in insert into tablename statement,if the tablename contains uppercase characters this statement will overwrite the table
caofangkun created HIVE-3064:

Summary: in insert into tablename statement,if the tablename contains uppercase characters this statement will overwrite the table
Key: HIVE-3064
URL: https://issues.apache.org/jira/browse/HIVE-3064
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.8.1, 0.8.0
Environment: Linux zongren-VirtualBox 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
  java version 1.6.0_25
  hadoop-0.20.2-cdh3u0
  hive-0.8.0
Reporter: caofangkun
Priority: Minor

In an "insert into tablename" statement, if the table name contains uppercase characters, the statement overwrites the table instead of appending to it.

For example:

hive> desc dual;
OK
dummy string
Time taken: 1.856 seconds
hive> select * from dual;
OK
dummy
Time taken: 3.133 seconds

drop table if exists tmp_test_1 ;
create EXTERNAL table tmp_test_1 (dummy string) partitioned by (dt string, hr string);
insert into table tmp_test_1 partition (dt='1', hr='1') select * from dual;
insert into table tmp_TEST_1 partition (dt='1', hr='1') select count(*) from dual;
select * from tmp_test_1;

Result:
OK
1 1 1
Time taken: 0.121 seconds
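The report does not say where the bug lives. One plausible reading, stated here purely as an assumption, is that the set of "insert into" targets is recorded under the name as typed while the later lookup uses a normalized name, so `tmp_TEST_1` is not recognized as an append target and falls back to overwrite. The Python sketch below illustrates that failure mode and the obvious remedy of normalizing on both sides; the class `InsertIntoTracker` is hypothetical and is not Hive's implementation.

```python
class InsertIntoTracker:
    """Remembers which tables are targets of INSERT INTO (append) statements."""

    def __init__(self, normalize: bool = True) -> None:
        self._normalize = normalize
        self._targets = set()

    def add(self, table_name: str) -> None:
        # With normalize=False the name is stored exactly as the user typed it.
        self._targets.add(table_name.lower() if self._normalize else table_name)

    def is_insert_into(self, resolved_name: str) -> bool:
        # Lookups use the lowercase name, since Hive table names are case-insensitive.
        return resolved_name.lower() in self._targets


buggy = InsertIntoTracker(normalize=False)
buggy.add("tmp_TEST_1")
print(buggy.is_insert_into("tmp_test_1"))  # False: treated as overwrite

fixed = InsertIntoTracker(normalize=True)
fixed.add("tmp_TEST_1")
print(fixed.is_insert_into("tmp_test_1"))  # True: treated as append
```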
[jira] [Created] (HIVE-2323) create table target_table as select * from source_table the source_table's partitions are handled as columns in the target_table
create table target_table as select * from source_table the source_table's partitions are handled as columns in the target_table

Key: HIVE-2323
URL: https://issues.apache.org/jira/browse/HIVE-2323
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.7.0
Reporter: caofangkun
Priority: Minor

CREATE TABLE user (userid INT, username STRING, age INT, country STRING) PARTITIONED BY (day STRING, hour STRING);
CREATE TABLE user_bak AS SELECT * FROM user;

day and hour are partition columns in table user, but in the new table user_bak they are handled as two regular columns.