[jira] [Created] (HIVE-26291) Ranger client file descriptor leak
Adrian Wang created HIVE-26291: -- Summary: Ranger client file descriptor leak Key: HIVE-26291 URL: https://issues.apache.org/jira/browse/HIVE-26291 Project: Hive Issue Type: Improvement Reporter: Adrian Wang Assignee: Adrian Wang Ranger Client has an fd leak -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26174) ALTER TABLE RENAME TO should check new db location
Adrian Wang created HIVE-26174: -- Summary: ALTER TABLE RENAME TO should check new db location Key: HIVE-26174 URL: https://issues.apache.org/jira/browse/HIVE-26174 Project: Hive Issue Type: Improvement Reporter: Adrian Wang Assignee: Adrian Wang Currently, if we run ALTER TABLE db1.table1 RENAME TO db2.table2; with `db1` and `db2` on different filesystems, for example `db1` at `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` ends up under the location `hdfs:/s3warehouse/db2.db/table2`, which combines the old filesystem scheme with the new path. The idea is to ban this kind of operation. The code already appears intended to ban it, but the check runs after the filesystem scheme has been changed, so it always passes.
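The ordering problem described above can be sketched as follows. This is only an illustration in Python, not Hive's actual code: `same_filesystem` and `renamed_table_location` are hypothetical helper names, and the point is simply that the scheme and authority of the two database locations are compared before any path is rewritten.

```python
from urllib.parse import urlparse


def same_filesystem(src_location: str, dst_location: str) -> bool:
    """Return True when both locations share a scheme and an authority."""
    src, dst = urlparse(src_location), urlparse(dst_location)
    return (src.scheme, src.netloc) == (dst.scheme, dst.netloc)


def renamed_table_location(src_db: str, dst_db: str, table: str) -> str:
    """Hypothetical helper: validate first, then build the new table path."""
    # The check uses the ORIGINAL locations, before any scheme is swapped.
    if not same_filesystem(src_db, dst_db):
        raise ValueError(f"cannot rename across filesystems: {src_db} -> {dst_db}")
    return dst_db.rstrip("/") + "/" + table
```

With the locations from the report, `same_filesystem("hdfs:/user/hive/warehouse/db1.db", "s3://bucket/s3warehouse/db2.db")` is false, so the rename would be rejected instead of producing a mixed location.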
[jira] [Created] (HIVE-26032) Upgrade cron-utils to 9.1.6
Yuming Wang created HIVE-26032: -- Summary: Upgrade cron-utils to 9.1.6 Key: HIVE-26032 URL: https://issues.apache.org/jira/browse/HIVE-26032 Project: Hive Issue Type: Task Components: Hive Affects Versions: 4.0.0 Reporter: Yuming Wang To fix [CVE-2021-41269|https://nvd.nist.gov/vuln/detail/CVE-2021-41269] issue. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26030) Backport HIVE-21498 to branch-2.3
Yuming Wang created HIVE-26030: -- Summary: Backport HIVE-21498 to branch-2.3 Key: HIVE-26030 URL: https://issues.apache.org/jira/browse/HIVE-26030 Project: Hive Issue Type: Task Components: Thrift API Affects Versions: 2.3.9 Reporter: Yuming Wang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25996) Backport HIVE-21498 and HIVE-25098 to fix CVE-2020-13949
Yuming Wang created HIVE-25996: -- Summary: Backport HIVE-21498 and HIVE-25098 to fix CVE-2020-13949 Key: HIVE-25996 URL: https://issues.apache.org/jira/browse/HIVE-25996 Project: Hive Issue Type: Improvement Affects Versions: 2.3.9 Reporter: Yuming Wang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25869) Add GitHub Action job to publish snapshot
Yuming Wang created HIVE-25869: -- Summary: Add GitHub Action job to publish snapshot Key: HIVE-25869 URL: https://issues.apache.org/jira/browse/HIVE-25869 Project: Hive Issue Type: Improvement Reporter: Yuming Wang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25635) Upgrade Thrift to 0.15.0
Yuming Wang created HIVE-25635: -- Summary: Upgrade Thrift to 0.15.0 Key: HIVE-25635 URL: https://issues.apache.org/jira/browse/HIVE-25635 Project: Hive Issue Type: Improvement Reporter: Yuming Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25295) "File already exist exception" during mapper/reducer retry with old hive(0.13)
yuquan wang created HIVE-25295: -- Summary: "File already exist exception" during mapper/reducer retry with old hive(0.13) Key: HIVE-25295 URL: https://issues.apache.org/jira/browse/HIVE-25295 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 0.13.0 Reporter: yuquan wang We are still using a very old Hive version (0.13) for historical reasons, and we often hit the following issue: {code:java} Caused by: java.io.IOException: File already exists:s3://smart-dmp/warehouse/uploaded/ad_dmp_pixel/dt=2021-06-21/key=259f3XXX {code} We have investigated this issue for quite a long time without finding a good fix, so I would like to ask the Hive community whether there are any solutions. The error occurs during the map/reduce stage: once a task instance fails for some unexpected reason (for example, an unstable spot instance gets killed), the later retry throws the above exception instead of overwriting the existing file. We have several guesses: 1. Is it caused by the ORC file format? I found a similar issue, https://issues.apache.org/jira/browse/HIVE-6341, but saw no comments there, and our table is stored as ORC. 2. Is the problem solved in higher Hive versions? We also run Hive 2.3.6 and have not met this issue there, so would a version upgrade solve it? 3. Is there a config that always cleans up existing folders when a mapper/reducer stage is retried? I have searched all MapReduce configs but cannot find one.
[jira] [Created] (HIVE-24893) Download data from Thriftserver through JDBC
Yuming Wang created HIVE-24893: -- Summary: Download data from Thriftserver through JDBC Key: HIVE-24893 URL: https://issues.apache.org/jira/browse/HIVE-24893 Project: Hive Issue Type: New Feature Components: HiveServer2, JDBC Affects Versions: 4.0.0 Reporter: Yuming Wang Snowflake supports downloading data files directly from an internal stage to a stream: https://docs.snowflake.com/en/user-guide/jdbc-using.html#label-jdbc-download-from-stage-to-stream https://github.com/snowflakedb/snowflake-jdbc/blob/95a7d8a03316093430dc3960df6635643208b6fd/src/main/java/net/snowflake/client/jdbc/SnowflakeConnectionV1.java#L886
[jira] [Created] (HIVE-24797) Disable validate default values when parsing Avro schemas.
Yuming Wang created HIVE-24797: -- Summary: Disable validate default values when parsing Avro schemas. Key: HIVE-24797 URL: https://issues.apache.org/jira/browse/HIVE-24797 Project: Hive Issue Type: Bug Reporter: Yuming Wang It will throw exceptions when upgrading Avro to 1.10.1 for this schema: {code:json} { "type": "record", "name": "EventData", "doc": "event data", "fields": [ {"name": "ARRAY_WITH_DEFAULT", "type": {"type": "array", "items": "string"}, "default": null } ] } {code} {noformat} org.apache.avro.AvroTypeException: Invalid default for field USERACTIONS: null not a {"type":"array","items":"string"} at org.apache.avro.Schema.validateDefault(Schema.java:1571) at org.apache.avro.Schema.access$500(Schema.java:87) at org.apache.avro.Schema$Field.(Schema.java:544) at org.apache.avro.Schema.parse(Schema.java:1678) at org.apache.avro.Schema$Parser.parse(Schema.java:1425) at org.apache.avro.Schema$Parser.parse(Schema.java:1396) at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.getSchemaFor(AvroSerdeUtils.java:287) at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.getSchemaFromFS(AvroSerdeUtils.java:170) at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:139) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:187) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:107) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:493) at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:225) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
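As a rough illustration of why the schema above is rejected, the sketch below checks whether a JSON `null` default is acceptable for a field type. It follows the long-standing Avro specification wording that a union default corresponds to the first branch of the union; newer Avro releases may be more permissive, and this is not Avro's real validator. `null_default_is_valid` and `make_nullable` are hypothetical names.

```python
import json


def null_default_is_valid(field_type) -> bool:
    """Simplified rule: null is a valid default only for "null" or a null-first union."""
    if field_type == "null":
        return True
    if isinstance(field_type, list) and field_type:
        return field_type[0] == "null"
    return False


def make_nullable(field: dict) -> dict:
    """Return a copy of the field whose type is a union with "null" first."""
    fixed = dict(field)
    if null_default_is_valid(fixed["type"]):
        return fixed
    if isinstance(fixed["type"], list):
        # Already a union: move "null" to the front instead of nesting unions.
        fixed["type"] = ["null"] + [b for b in fixed["type"] if b != "null"]
    else:
        fixed["type"] = ["null", fixed["type"]]
    return fixed


field = json.loads(
    '{"name": "ARRAY_WITH_DEFAULT",'
    ' "type": {"type": "array", "items": "string"}, "default": null}'
)
```

Here `null_default_is_valid(field["type"])` is false for the reported schema, while the type produced by `make_nullable(field)` passes the same check. Rewriting existing schemas is not always possible, which is presumably why the issue proposes relaxing validation instead.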
[jira] [Created] (HIVE-24760) Backport HIVE-19228 to branch-3.0, branch-2 and branch-2.3
Yuming Wang created HIVE-24760: -- Summary: Backport HIVE-19228 to branch-3.0, branch-2 and branch-2.3 Key: HIVE-24760 URL: https://issues.apache.org/jira/browse/HIVE-24760 Project: Hive Issue Type: Improvement Reporter: Yuming Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24568) Fix guice compatibility issues
Yuming Wang created HIVE-24568: -- Summary: Fix guice compatibility issues Key: HIVE-24568 URL: https://issues.apache.org/jira/browse/HIVE-24568 Project: Hive Issue Type: Improvement Reporter: Yuming Wang {noformat} Exception in thread "main" java.lang.NoSuchMethodError: com.google.inject.util.Types.collectionOf(Ljava/lang/reflect/Type;)Ljava/lang/reflect/ParameterizedType; » at com.google.inject.multibindings.Multibinder.collectionOfProvidersOf(Multibinder.java:202) » at com.google.inject.multibindings.Multibinder$RealMultibinder.(Multibinder.java:283) » at com.google.inject.multibindings.Multibinder$RealMultibinder.(Multibinder.java:258) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
Yuming Wang created HIVE-24436: -- Summary: Fix Avro NULL_DEFAULT_VALUE compatibility issue Key: HIVE-24436 URL: https://issues.apache.org/jira/browse/HIVE-24436 Project: Hive Issue Type: Improvement Components: Avro Affects Versions: 2.3.8 Reporter: Yuming Wang Exception1: {noformat} - create hive serde table with Catalog *** RUN ABORTED *** java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)' at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76) at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114) at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263) {noformat} Exception2: {noformat} - alter hive serde table add columns -- partitioned - AVRO *** FAILED *** org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode; at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112) at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346) at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228) at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24153) distinct is not quite effective in table expression
Xinyu Wang created HIVE-24153: - Summary: distinct is not quite effective in table expression Key: HIVE-24153 URL: https://issues.apache.org/jira/browse/HIVE-24153 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 3.1.1 Reporter: Xinyu Wang Below is an example. Table: t(id int, name string, comment string). Query: with cte as (select distinct id, name, comment from t) select count(*) from cte; The result of the above query is larger than the result of select count(distinct id, name, comment). In the EXPLAIN output for the CTE query, PARTITION_ONLY_SHUFFLE is used, whereas for select count(distinct id, name, comment), SHUFFLE is used instead. Thanks.
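One thing worth ruling out when comparing the two counts (it is not stated in the report and may not be the cause here) is NULL handling: `SELECT DISTINCT` keeps rows that contain NULLs, while a multi-column `count(distinct ...)` is documented to count only combinations whose values are all non-NULL. A minimal Python sketch with made-up sample rows, using `None` for NULL:

```python
# Hypothetical sample data for t(id, name, comment); None stands for SQL NULL.
rows = [(1, "a", "x"), (1, "a", "x"), (2, "b", None), (3, None, None)]


def count_of_select_distinct(table_rows):
    """count(*) over SELECT DISTINCT id, name, comment: NULL-containing rows are kept."""
    return len(set(table_rows))


def count_distinct_columns(table_rows):
    """count(distinct id, name, comment): combinations containing a NULL are skipped."""
    return len({row for row in table_rows if None not in row})
```

On this sample the first function returns 3 and the second returns 1, so the two queries can legitimately differ when the columns contain NULLs. If the reporter's data has no NULLs, the difference would point at the planner instead.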
[jira] [Created] (HIVE-24053) Pluggable HttpRequestInterceptor for Hive JDBC
Ying Wang created HIVE-24053: Summary: Pluggable HttpRequestInterceptor for Hive JDBC Key: HIVE-24053 URL: https://issues.apache.org/jira/browse/HIVE-24053 Project: Hive Issue Type: New Feature Components: JDBC Affects Versions: 3.1.2 Reporter: Ying Wang Assignee: Ying Wang Allow the client to pass in the class name of a custom HttpRequestInterceptor, then instantiate that class and add it to the HttpClient. Example usage: we would like to pass in an HttpRequestInterceptor for OAuth 2.0 authentication. The HttpRequestInterceptor will acquire and/or refresh the access token and add it as an authentication header each time HiveConnection sends an HttpRequest.
[jira] [Created] (HIVE-24049) Forbid binary type as partition column
Yuming Wang created HIVE-24049: -- Summary: Forbid binary type as partition column Key: HIVE-24049 URL: https://issues.apache.org/jira/browse/HIVE-24049 Project: Hive Issue Type: Bug Reporter: Yuming Wang Using a binary type as a partition column may cause data issues.
{noformat}
hive> create table t1(id int) partitioned by (part binary);
OK
Time taken: 3.307 seconds
hive> insert into t1 PARTITION(part) select 1 as id, cast('a' as binary) as part;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = yumwang_20200819144033_5eb6d723-edeb-4e17-8509-c658ad89c2a3
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2020-08-19 14:40:36,083 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local247252310_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory file:/Users/yumwang/Downloads/apache-hive-2.3.7-bin/tmp/t1/.hive-staging_hive_2020-08-19_14-40-33_789_7653530788805518878-1/-ext-1
Loading data to table default.t1 partition (part=null)
Loaded : 1/1 partitions.
Time taken to load dynamic partitions: 4.029 seconds
Time taken for adding to write entity : 0.001 seconds
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 6.591 seconds
hive> insert into t1 PARTITION(part) select 1 as id, cast('b' as binary) as part;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = yumwang_20200819144045_1f112d6d-effa-4d81-87e8-9326015289f1
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2020-08-19 14:40:47,537 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local698238180_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory file:/Users/yumwang/Downloads/apache-hive-2.3.7-bin/tmp/t1/.hive-staging_hive_2020-08-19_14-40-45_908_8062651574733580526-1/-ext-1
Loading data to table default.t1 partition (part=null)
Loaded : 1/1 partitions.
Time taken to load dynamic partitions: 0.15 seconds
Time taken for adding to write entity : 0.0 seconds
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 1.988 seconds
hive> select * from t1;
OK
1 61
1 62
Time taken: 0.471 seconds, Fetched: 2 row(s)
hive> select * from t1 where part= cast('b' as binary);;
OK
Time taken: 0.381 seconds
hive> select * from t1 where part= cast('b' as binary);
OK
Time taken: 0.141 seconds
hive> select * from t1 where part= cast('a' as binary);
OK
Time taken: 0.198 seconds
hive> select * from t1 where part= 61;
FAILED: RuntimeException Cannot convert to Binary from: int
{noformat}
[jira] [Created] (HIVE-23771) After loading data into Hive, LIMIT displays Chinese user names correctly, but WHERE on the user name shows garbled text and cannot match by user name
wang created HIVE-23771: --- Summary: After loading data into Hive, LIMIT displays Chinese user names correctly, but WHERE on the user name shows garbled text and cannot match by user name Key: HIVE-23771 URL: https://issues.apache.org/jira/browse/HIVE-23771 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 2.1.1 Reporter: wang Fix For: 2.1.1 Attachments: image-2020-06-29-15-04-23-999.png, image-2020-06-29-15-08-25-923.png, image-2020-06-29-15-10-10-310.png Table creation statement: create table smg_t_usr_inf_23( Usr_ID string, RlgnSvcPltfrmUsr_TpCd string, Rlgn_InsID string, Usr_Nm string , ) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|@|") stored as textfile Loading the data: LOAD DATA LOCAL INPATH '/home/ap/USR_INF 20200622_0001.dat' INTO TABLE usr_inf select * from usr_inf limit 10; returns data: !image-2020-06-29-15-04-23-999.png! select * from usr_inf where usr_nm = '胡学玲' ; returns no data: !image-2020-06-29-15-08-25-923.png! Other queries, such as select * from usr_inf where usr_id='***';, return data: !image-2020-06-29-15-10-10-310.png! Could someone explain why the loaded data is displayed as Chinese but filtering on it with WHERE fails? After running insert into table aa select * from usr_inf; directly, the usr_nm field of the new table behaves the same way.
[jira] [Created] (HIVE-23359) "show tables like" support for SQL wildcard characters (% and _)
Yuming Wang created HIVE-23359: -- Summary: "show tables like" support for SQL wildcard characters (% and _) Key: HIVE-23359 URL: https://issues.apache.org/jira/browse/HIVE-23359 Project: Hive Issue Type: Improvement Components: SQL Affects Versions: 2.3.7 Reporter: Yuming Wang https://docs.snowflake.com/en/sql-reference/sql/show-tables.html https://clickhouse.tech/docs/en/sql-reference/statements/show/ https://www.mysqltutorial.org/mysql-show-tables/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
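For readers unfamiliar with the SQL wildcard semantics requested in this issue, here is a minimal sketch of how `%` (any run of characters) and `_` (exactly one character) could be mapped onto a regular expression. It is an illustration only, not Hive's implementation; `like_to_regex` and `show_tables_like` are hypothetical names, and escape characters are deliberately not handled.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into a compiled regex (no escape support)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence, including the empty one
        elif ch == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts), re.DOTALL)


def show_tables_like(tables, pattern):
    """Return the table names that match the whole LIKE pattern."""
    regex = like_to_regex(pattern)
    return [name for name in tables if regex.fullmatch(name)]
```

For example, `show_tables_like(["t1", "t2", "t10"], "t_")` keeps `t1` and `t2` but not `t10`, because `_` consumes exactly one character.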
[jira] [Created] (HIVE-23129) Cast invalid string to date returns incorrect result
Yuming Wang created HIVE-23129: -- Summary: Cast invalid string to date returns incorrect result Key: HIVE-23129 URL: https://issues.apache.org/jira/browse/HIVE-23129 Project: Hive Issue Type: Bug Affects Versions: 3.1.2 Reporter: Yuming Wang
{noformat}
hive> select cast('2020-20-20' as date);
OK
2021-08-20
Time taken: 4.436 seconds, Fetched: 1 row(s)
{noformat}
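The reported result is what a lenient parser produces when it rolls an out-of-range month over into the following year. The sketch below reproduces that arithmetic and contrasts it with a strict parse. It is an assumption about the mechanism, not a reading of Hive's code; both function names are hypothetical, and returning `None` for invalid input is just one possible strict behavior.

```python
from datetime import date


def lenient_date(year: int, month: int, day: int) -> date:
    """Roll an out-of-range month into later years, as a lenient parser might."""
    year += (month - 1) // 12
    month = (month - 1) % 12 + 1
    return date(year, month, day)


def strict_cast_to_date(text: str):
    """Return a date for valid 'YYYY-MM-DD' input, or None when a field is invalid."""
    try:
        year, month, day = (int(part) for part in text.split("-"))
        return date(year, month, day)
    except ValueError:
        return None
```

`lenient_date(2020, 20, 20)` gives 2021-08-20, matching the output in the report, while `strict_cast_to_date("2020-20-20")` rejects the input.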
[jira] [Created] (HIVE-22838) like any is incorrect if contains null
Yuming Wang created HIVE-22838: -- Summary: like any is incorrect if contains null Key: HIVE-22838 URL: https://issues.apache.org/jira/browse/HIVE-22838 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 3.1.2 Reporter: Yuming Wang How to reproduce:
{code:sql}
CREATE TABLE like_any_table STORED AS TEXTFILE AS
SELECT "google" as company,"%oo%" as pat
UNION ALL
SELECT "facebook" as company,"%oo%" as pat
UNION ALL
SELECT "linkedin" as company,"%in" as pat
;
{code}
{noformat}
hive> select company from like_any_table where company like any ('%oo%',null);
OK
Time taken: 0.064 seconds
hive> select company from like_any_table where company like '%oo%' or company like null;
OK
google
facebook
{noformat}
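The expected behavior follows from SQL three-valued logic: `TRUE OR NULL` is `TRUE`, so a NULL pattern in the list should not suppress rows that match another pattern. A minimal Python sketch, with `None` standing in for NULL and hypothetical helper names (`sql_like`, `sql_or`, `like_any`); it models the semantics of the OR form in the reproduction above, not Hive's implementation:

```python
import re
from typing import Optional, Sequence


def sql_like(value: Optional[str], pattern: Optional[str]) -> Optional[bool]:
    """LIKE with NULL propagation; % and _ are the only wildcards handled."""
    if value is None or pattern is None:
        return None
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, value, re.DOTALL) is not None


def sql_or(left: Optional[bool], right: Optional[bool]) -> Optional[bool]:
    """Three-valued OR: TRUE wins, otherwise NULL if either side is NULL."""
    if left is True or right is True:
        return True
    if left is None or right is None:
        return None
    return False


def like_any(value: Optional[str], patterns: Sequence[Optional[str]]) -> Optional[bool]:
    """value LIKE ANY (p1, p2, ...) expressed as a chain of ORs."""
    result: Optional[bool] = False
    for pattern in patterns:
        result = sql_or(result, sql_like(value, pattern))
    return result
```

With this model, `like_any("google", ["%oo%", None])` is `True`, so `google` and `facebook` would be returned by both queries, while `linkedin` evaluates to NULL and is filtered out.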
[jira] [Created] (HIVE-22642) Fix the TCLIService.thrift warning
Yuming Wang created HIVE-22642: -- Summary: Fix the TCLIService.thrift warning Key: HIVE-22642 URL: https://issues.apache.org/jira/browse/HIVE-22642 Project: Hive Issue Type: Improvement Reporter: Yuming Wang {noformat} TCLIService.thrift:361] Consider using the more efficient "binary" type instead of "list" {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22229) Backport HIVE-8472 to branch-2.3
Yuming Wang created HIVE-22229: -- Summary: Backport HIVE-8472 to branch-2.3 Key: HIVE-22229 URL: https://issues.apache.org/jira/browse/HIVE-22229 Project: Hive Issue Type: Improvement Components: Database/Schema Affects Versions: 2.3.6 Reporter: Yuming Wang Assignee: Yuming Wang
[jira] [Created] (HIVE-22206) Failed to "add jar" with hiveserver2 on JDK 11
Yuming Wang created HIVE-22206: -- Summary: Failed to "add jar" with hiveserver2 on JDK 11 Key: HIVE-22206 URL: https://issues.apache.org/jira/browse/HIVE-22206 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 4.0.0 Reporter: Yuming Wang How to reproduce: {code:sh} export JAVA_HOME=/usr/lib/jdk-11.0.3 export PATH=${JAVA_HOME}/bin:${PATH} rm -rf lib/hive-hcatalog-core-4.0.0-SNAPSHOT.jar bin/hiveserver2 {code} {code:sql} bin/beeline -u jdbc:hive2://localhost:1 add jar /root/opensource/apache-hive/hive-hcatalog-core-4.0.0-SNAPSHOT.jar; CREATE TABLE addJar(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; {code} {noformat} 0: jdbc:hive2://localhost:1> add jar /root/opensource/apache-hive/packaging/target/apache-hive-4.0.0-SNAPSHOT-bin/apache-hive-4.0.0-SNAPSHOT-bin/hive-hcatalog-core-4.0.0-SNAPSHOT.jar; INFO : Added [/root/opensource/apache-hive/packaging/target/apache-hive-4.0.0-SNAPSHOT-bin/apache-hive-4.0.0-SNAPSHOT-bin/hive-hcatalog-core-4.0.0-SNAPSHOT.jar] to class path INFO : Added resources: [/root/opensource/apache-hive/packaging/target/apache-hive-4.0.0-SNAPSHOT-bin/apache-hive-4.0.0-SNAPSHOT-bin/hive-hcatalog-core-4.0.0-SNAPSHOT.jar] No rows affected (0.018 seconds) 0: jdbc:hive2://localhost:1> CREATE TABLE addJar(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; INFO : Compiling command(queryId=root_20190914215356_211fe827-960f-4556-ad5b-feb4ad474a8c): CREATE TABLE addJar(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' INFO : Concurrency mode is disabled, not creating a lock manager INFO : Semantic Analysis Completed (retrial = false) INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null) INFO : Completed compiling command(queryId=root_20190914215356_211fe827-960f-4556-ad5b-feb4ad474a8c); Time taken: 0.006 seconds INFO : Concurrency mode is disabled, not creating a lock manager INFO : Executing 
command(queryId=root_20190914215356_211fe827-960f-4556-ad5b-feb4ad474a8c): CREATE TABLE addJar(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' INFO : Starting task [Stage-0:DDL] in serial mode ERROR : Failed org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hadoop.hive.ql.ddl.DDLUtils.validateSerDe(DDLUtils.java:118) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.creation.CreateTableDesc.toTable(CreateTableDesc.java:772) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.creation.CreateTableOperation.execute(CreateTableOperation.java:57) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:90) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2188) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1840) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1508) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1268) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1262) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:160) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:233) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:88) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:332) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.security.AccessController.doPrivileged(Native Method) ~[?:?] at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.2.0.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:350) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?] at
[jira] [Created] (HIVE-22139) Will not pad Decimal numbers with trailing zeros if select from value
Yuming Wang created HIVE-22139: -- Summary: Will not pad Decimal numbers with trailing zeros if select from value Key: HIVE-22139 URL: https://issues.apache.org/jira/browse/HIVE-22139 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 3.1.1 Reporter: Yuming Wang How to reproduce: {code:sql} // code placeholder {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (HIVE-22097) Incompatible java.util.ArrayList for java 11
Yuming Wang created HIVE-22097: -- Summary: Incompatible java.util.ArrayList for java 11 Key: HIVE-22097 URL: https://issues.apache.org/jira/browse/HIVE-22097 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Yuming Wang {noformat} export JAVA_HOME=/usr/lib/jdk-11.0.3 export PATH=${JAVA_HOME}/bin:${PATH} hive> create table t(id int); Time taken: 0.035 seconds hive> insert into t values(1); Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794 Total jobs = 3 Launching Job 1 out of 3 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer= In order to limit the maximum number of reducers: set hive.exec.reducers.max= In order to set a constant number of reducers: set mapreduce.job.reduces= java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.(SerializationUtilities.java:390) at org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235) at org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48) at org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280) at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595) at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587) at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) Caused by: java.lang.NoSuchFieldException: parentOffset at java.base/java.lang.Class.getDeclaredField(Class.java:2412) at org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.(SerializationUtilities.java:384) ... 29 more Job Submission failed with exception 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: parentOffset {noformat} The reason is Java remove {{parentOffset}}. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (HIVE-22096) Backport HIVE-21584 to branch-2.3
Yuming Wang created HIVE-22096: -- Summary: Backport HIVE-21584 to branch-2.3 Key: HIVE-22096 URL: https://issues.apache.org/jira/browse/HIVE-22096 Project: Hive Issue Type: Improvement Components: Hive Reporter: Yuming Wang Assignee: Yuming Wang -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (HIVE-22012) Support timestamp type + string type
Yuming Wang created HIVE-22012: -- Summary: Support timestamp type + string type Key: HIVE-22012 URL: https://issues.apache.org/jira/browse/HIVE-22012 Project: Hive Issue Type: Improvement Components: Parser Affects Versions: 4.0.0 Reporter: Yuming Wang
{code:sql}
hive> select current_timestamp() + '100 days';
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''100 days'': No matching method for class org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPDTIPlus with (timestamp, string)
hive>
{code}
{code:sql}
postgres=# explain verbose select now() + '100 days', '100 days' + now();
QUERY PLAN
--
Result (cost=0.00..0.02 rows=1 width=16)
  Output: (now() + '100 days'::interval), (now() + '100 days'::interval)
(2 rows)
{code}
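The PostgreSQL plan above shows the string being coerced to an interval before the addition. A rough Python sketch of that idea follows; it is not a proposal for Hive's parser, `parse_interval` and `timestamp_plus_string` are hypothetical names, and only a few single-unit forms such as `'100 days'` are recognized.

```python
import re
from datetime import datetime, timedelta

# Unit spellings accepted by this sketch, mapped to timedelta keyword names.
_UNITS = {
    "day": "days", "days": "days",
    "hour": "hours", "hours": "hours",
    "minute": "minutes", "minutes": "minutes",
    "second": "seconds", "seconds": "seconds",
}


def parse_interval(text: str) -> timedelta:
    """Parse '<integer> <unit>' into a timedelta, or raise ValueError."""
    match = re.fullmatch(r"\s*(-?\d+)\s+([a-z]+)\s*", text.lower())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported interval literal: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def timestamp_plus_string(ts: datetime, text: str) -> datetime:
    """Coerce the string operand to an interval, then add it to the timestamp."""
    return ts + parse_interval(text)
```

For instance, `timestamp_plus_string(datetime(2019, 7, 18), "100 days")` yields the same value as adding `timedelta(days=100)` directly, and an unrecognized string raises `ValueError` rather than silently producing a result.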
[jira] [Created] (HIVE-21680) Backport HIVE-17644 to branch-2 and branch-2.3
Yuming Wang created HIVE-21680: -- Summary: Backport HIVE-17644 to branch-2 and branch-2.3 Key: HIVE-21680 URL: https://issues.apache.org/jira/browse/HIVE-21680 Project: Hive Issue Type: Bug Reporter: Yuming Wang Assignee: Yuming Wang {code:scala} test("get statistics when not analyzed in Hive or Spark") { val tabName = "tab1" withTable(tabName) { createNonPartitionedTable(tabName, analyzedByHive = false, analyzedBySpark = false) checkTableStats(tabName, hasSizeInBytes = true, expectedRowCounts = None) // ALTER TABLE SET TBLPROPERTIES invalidates some contents of Hive specific statistics // This is triggered by the Hive alterTable API val describeResult = hiveClient.runSqlHive(s"DESCRIBE FORMATTED $tabName") val rawDataSize = extractStatsPropValues(describeResult, "rawDataSize") val numRows = extractStatsPropValues(describeResult, "numRows") val totalSize = extractStatsPropValues(describeResult, "totalSize") assert(rawDataSize.isEmpty, "rawDataSize should not be shown without table analysis") assert(numRows.isEmpty, "numRows should not be shown without table analysis") assert(totalSize.isDefined && totalSize.get > 0, "totalSize is lost") } } // https://github.com/apache/spark/blob/43dcb91a4cb25aa7e1cc5967194f098029a0361e/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala#L789-L806 {code} {noformat} 06:23:46.103 WARN org.apache.hadoop.hive.metastore.MetaStoreDirectSql: Failed to execute [SELECT "DBS"."NAME", "TBLS"."TBL_NAME", "COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", "KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY" FROM "TBLS" INNER JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = "KEY_CONSTRAINTS"."PARENT_TBL_ID" INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID" INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = "KEY_CONSTRAINTS"."PARENT_CD_ID" AND "COLUMNS_V2"."INTEGER_IDX" = "KEY_CONSTRAINTS"."PARENT_INTEGER_IDX" WHERE "KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? 
AND "TBLS"."TBL_NAME" = ?] with parameters [default, tab1] javax.jdo.JDODataStoreException: Error executing SQL query "SELECT "DBS"."NAME", "TBLS"."TBL_NAME", "COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", "KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY" FROM "TBLS" INNER JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = "KEY_CONSTRAINTS"."PARENT_TBL_ID" INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID" INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = "KEY_CONSTRAINTS"."PARENT_CD_ID" AND "COLUMNS_V2"."INTEGER_IDX" = "KEY_CONSTRAINTS"."PARENT_INTEGER_IDX" WHERE "KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND "TBLS"."TBL_NAME" = ?". at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1750) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPrimaryKeys(MetaStoreDirectSql.java:1939) at org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8213) at org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8209) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2719) at org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeysInternal(ObjectStore.java:8221) at org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeys(ObjectStore.java:8199) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) at com.sun.proxy.$Proxy24.getPrimaryKeys(Unknown 
Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:6830) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at
[jira] [Created] (HIVE-21639) Spark test failed since HIVE-10632
Yuming Wang created HIVE-21639: -- Summary: Spark test failed since HIVE-10632 Key: HIVE-21639 URL: https://issues.apache.org/jira/browse/HIVE-21639 Project: Hive Issue Type: Bug Reporter: Yuming Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21589) Remove org.eclipse.jetty.orbit:javax.servlet from hive-common
Yuming Wang created HIVE-21589: -- Summary: Remove org.eclipse.jetty.orbit:javax.servlet from hive-common Key: HIVE-21589 URL: https://issues.apache.org/jira/browse/HIVE-21589 Project: Hive Issue Type: Task Components: Spark Affects Versions: 2.3.4 Reporter: Yuming Wang Assignee: Yuming Wang HIVE-12783 includes org.eclipse.jetty.orbit:javax.servlet to fix the Hive on Spark test failure. Since Spark 2.0 we no longer need it; see SPARK-14897. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21588) Remove HBase dependency from hive-metastore
Yuming Wang created HIVE-21588: -- Summary: Remove HBase dependency from hive-metastore Key: HIVE-21588 URL: https://issues.apache.org/jira/browse/HIVE-21588 Project: Hive Issue Type: Task Components: HBase Metastore Affects Versions: 4.0.0 Reporter: Yuming Wang Assignee: Yuming Wang HIVE-17234 removed the HBase metastore from master, but the Maven dependency has not been removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21563) Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
Yuming Wang created HIVE-21563: -- Summary: Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce Key: HIVE-21563 URL: https://issues.apache.org/jira/browse/HIVE-21563 Project: Hive Issue Type: Improvement Reporter: Yuming Wang Assignee: Yuming Wang We do not need registerAllFunctionsOnce when calling {{Table#getEmptyTable}}. The stack trace: {noformat} at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.(FunctionRegistry.java:209) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:247) at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231) at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:388) at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332) at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288) at org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:913) at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:877) at org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1479) at org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1150) at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:180) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21552) Remove tomcat:jasper-* from hive-service-rpc
Yuming Wang created HIVE-21552: -- Summary: Remove tomcat:jasper-* from hive-service-rpc Key: HIVE-21552 URL: https://issues.apache.org/jira/browse/HIVE-21552 Project: Hive Issue Type: Improvement Reporter: Yuming Wang Assignee: Yuming Wang {{hive-service}} added these dependencies. {{hive-service-rpc}} does not need them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21551) Remove tomcat:jasper-* from hive-service-rpc
Yuming Wang created HIVE-21551: -- Summary: Remove tomcat:jasper-* from hive-service-rpc Key: HIVE-21551 URL: https://issues.apache.org/jira/browse/HIVE-21551 Project: Hive Issue Type: Improvement Reporter: Yuming Wang Assignee: Yuming Wang {{hive-service}} added these dependencies. {{hive-service-rpc}} does not need them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21536) Backport HIVE-17764 to branch-2.3
Yuming Wang created HIVE-21536: -- Summary: Backport HIVE-17764 to branch-2.3 Key: HIVE-21536 URL: https://issues.apache.org/jira/browse/HIVE-21536 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.3.4 Reporter: Yuming Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21521) Upgrade Hive to use ORC 1.5.5
Yuming Wang created HIVE-21521: -- Summary: Upgrade Hive to use ORC 1.5.5 Key: HIVE-21521 URL: https://issues.apache.org/jira/browse/HIVE-21521 Project: Hive Issue Type: Improvement Affects Versions: 2.3.4 Reporter: Yuming Wang Assignee: Yuming Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20650) trunc string type throw FAILED: ArrayIndexOutOfBoundsException 1
Yuming Wang created HIVE-20650: -- Summary: trunc string type throw FAILED: ArrayIndexOutOfBoundsException 1 Key: HIVE-20650 URL: https://issues.apache.org/jira/browse/HIVE-20650 Project: Hive Issue Type: Bug Affects Versions: 2.3.3 Reporter: Yuming Wang {code:sql} hive> select trunc('2.5'); FAILED: ArrayIndexOutOfBoundsException 1 hive> SELECT trunc('2009-02-12'); FAILED: ArrayIndexOutOfBoundsException 1 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20216) Support range partition
Yuming Wang created HIVE-20216: -- Summary: Support range partition Key: HIVE-20216 URL: https://issues.apache.org/jira/browse/HIVE-20216 Project: Hive Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Yuming Wang Support RANGE PARTITION to improve performance: {code:sql} CREATE TABLE employees ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, fname VARCHAR(25) NOT NULL, lname VARCHAR(25) NOT NULL, store_id INT NOT NULL, department_id INT NOT NULL ) PARTITION BY RANGE(id) ( PARTITION p0 VALUES LESS THAN (5), PARTITION p1 VALUES LESS THAN (10), PARTITION p2 VALUES LESS THAN (15), PARTITION p3 VALUES LESS THAN MAXVALUE ); {code} https://dev.mysql.com/doc/refman/5.6/en/partitioning-selection.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
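To make the partition layout in the example above concrete, here is a minimal sketch of how an `id` could be mapped to one of the four partitions. The class and method names are hypothetical and are not part of Hive or MySQL; the bounds are taken from the `VALUES LESS THAN` clauses, with `p3` acting as the `MAXVALUE` catch-all.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Hive or MySQL API): maps a key to a RANGE partition. */
final class RangePartitionSketch {
    // Exclusive upper bounds of p0, p1 and p2 from the example; p3 takes everything else.
    private static final int[] UPPER_BOUNDS = {5, 10, 15};

    /** Returns the index n of the partition pN whose range contains id. */
    static int partitionFor(int id) {
        int pos = Arrays.binarySearch(UPPER_BOUNDS, id);
        // Found at index i: id equals an exclusive bound, so it falls into partition i + 1.
        // Not found: binarySearch returns -(insertionPoint) - 1; the insertion point is the partition.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        for (int id : new int[] {0, 4, 5, 14, 15, 1000}) {
            System.out.println("id=" + id + " -> p" + partitionFor(id));
        }
    }
}
```

With such a mapping, a predicate like `id < 5` only needs to read `p0`, which is presumably the performance benefit the request has in mind.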
[jira] [Created] (HIVE-19740) Hiveserver2 can't connect to metastore when using Hive 3.0
heyang wang created HIVE-19740: -- Summary: Hiveserver2 can't connect to metastore when using Hive 3.0 Key: HIVE-19740 URL: https://issues.apache.org/jira/browse/HIVE-19740 Project: Hive Issue Type: Bug Affects Versions: 3.0.0 Reporter: heyang wang Attachments: hive-site.xml I am using Docker to deploy Hadoop 2.7, Hive 3.0 and Spark 2.3. After starting all the Docker images, HiveServer2 fails to start and outputs the following error log: 2018-05-30T14:13:53,832 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(1041)) - Error starting HiveServer2 on attempt 1, will retry in 6ms java.lang.RuntimeException: Error initializing notification event poll at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:269) ~[hive-service-3.0.0.jar:3.0.0] at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1013) [hive-service-3.0.0.jar:3.0.0] at org.apache.hive.service.server.HiveServer2.access$1800(HiveServer2.java:134) [hive-service-3.0.0.jar:3.0.0] at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1282) [hive-service-3.0.0.jar:3.0.0] at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1126) [hive-service-3.0.0.jar:3.0.0] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] at org.apache.hadoop.util.RunJar.run(RunJar.java:221) [hadoop-common-2.7.4.jar:?] at org.apache.hadoop.util.RunJar.main(RunJar.java:136) [hadoop-common-2.7.4.jar:?] 
Caused by: java.io.IOException: org.apache.thrift.TApplicationException: Internal error processing get_current_notificationEventId at org.apache.hadoop.hive.metastore.messaging.EventUtils$MSClientNotificationFetcher.getCurrentNotificationEventId(EventUtils.java:75) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.ql.metadata.events.NotificationEventPoll.(NotificationEventPoll.java:103) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.ql.metadata.events.NotificationEventPoll.initialize(NotificationEventPoll.java:59) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:267) ~[hive-service-3.0.0.jar:3.0.0] ... 10 more Caused by: org.apache.thrift.TApplicationException: Internal error processing get_current_notificationEventId at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_current_notificationEventId(ThriftHiveMetastore.java:5541) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_current_notificationEventId(ThriftHiveMetastore.java:5529) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getCurrentNotificationEventId(HiveMetaStoreClient.java:2713) ~[hive-exec-3.0.0.jar:3.0.0] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) ~[hive-exec-3.0.0.jar:3.0.0] at 
com.sun.proxy.$Proxy34.getCurrentNotificationEventId(Unknown Source) ~[?:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2763) ~[hive-exec-3.0.0.jar:3.0.0] at com.sun.proxy.$Proxy34.getCurrentNotificationEventId(Unknown Source) ~[?:?] at org.apache.hadoop.hive.metastore.messaging.EventUtils$MSClientNotificationFetcher.getCurrentNotificationEventId(EventUtils.java:73) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.ql.metadata.events.NotificationEventPoll.(NotificationEventPoll.java:103) ~[hive-exec-3.0.0.jar:3.0.0] at org.apache.hadoop.hive.ql.metadata.events.NotificationEventPoll.initialize(NotificationEventPoll.java:59) ~[hive-exec-3.0.0.jar:3.0.0] at
[jira] [Created] (HIVE-18856) param note error
Yu Wang created HIVE-18856: -- Summary: param note error Key: HIVE-18856 URL: https://issues.apache.org/jira/browse/HIVE-18856 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Yu Wang Assignee: Yu Wang Fix For: 1.1.0 The comment on the PerfLogBegin method in the PerfLogger file contains an error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18767) Some alterPartitions throw NumberFormatException: null
Yuming Wang created HIVE-18767: -- Summary: Some alterPartitions throw NumberFormatException: null Key: HIVE-18767 URL: https://issues.apache.org/jira/browse/HIVE-18767 Project: Hive Issue Type: Bug Affects Versions: 2.3.2 Reporter: Yuming Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-17240) function acos(2) should be null
Yuming Wang created HIVE-17240: -- Summary: function acos(2) should be null Key: HIVE-17240 URL: https://issues.apache.org/jira/browse/HIVE-17240 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 2.2.0, 1.2.2, 1.1.1 Reporter: Yuming Wang {{acos(2)}} should be null, same as MySQL: {code:sql} hive> desc function extended acos; OK acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise Example: > SELECT acos(1) FROM src LIMIT 1; 0 > SELECT acos(2) FROM src LIMIT 1; NULL Time taken: 0.009 seconds, Fetched: 6 row(s) hive> select acos(2); OK NaN Time taken: 0.437 seconds, Fetched: 1 row(s) {code} {code:sql} mysql> select acos(2); +---------+ | acos(2) | +---------+ | NULL | +---------+ 1 row in set (0.00 sec) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
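The NaN in the report is consistent with plain Java arithmetic: `Math.acos` returns NaN for arguments outside [-1, 1]. The sketch below shows one way the documented "NULL otherwise" behaviour could be expressed. `AcosNullSketch` and `acosOrNull` are hypothetical names used for illustration only; this is not the code of Hive's UDF, and Java `null` merely stands in for SQL NULL.

```java
/** Hypothetical wrapper (not Hive's UDF): acos that yields null outside [-1, 1]. */
final class AcosNullSketch {
    /** Returns acos(x) for -1 <= x <= 1, and null (standing in for SQL NULL) otherwise. */
    static Double acosOrNull(Double x) {
        if (x == null) {
            return null; // NULL in, NULL out
        }
        double result = Math.acos(x); // Math.acos yields NaN when |x| > 1 or x is NaN
        return Double.isNaN(result) ? null : result;
    }

    public static void main(String[] args) {
        System.out.println("Math.acos(2) = " + Math.acos(2.0));
        System.out.println("acosOrNull(1) = " + acosOrNull(1.0));
        System.out.println("acosOrNull(2) = " + acosOrNull(2.0));
    }
}
```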
[jira] [Created] (HIVE-16641) Use DistCpOptions.Builder in Hadoop shims
Andrew Wang created HIVE-16641: -- Summary: Use DistCpOptions.Builder in Hadoop shims Key: HIVE-16641 URL: https://issues.apache.org/jira/browse/HIVE-16641 Project: Hive Issue Type: Bug Components: Shims Reporter: Andrew Wang Doing some testing against Hadoop trunk. HADOOP-14267 changed how DistCp is invoked. Options are now specified via a builder. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16490) Hive should not use private HDFS APIs for encryption
Andrew Wang created HIVE-16490: -- Summary: Hive should not use private HDFS APIs for encryption Key: HIVE-16490 URL: https://issues.apache.org/jira/browse/HIVE-16490 Project: Hive Issue Type: Improvement Components: Encryption Affects Versions: 2.2.0 Reporter: Andrew Wang Priority: Critical When compiling against bleeding edge versions of Hive and Hadoop, we discovered that HIVE-16047 references a private HDFS API, DFSClient, to get at various encryption related information. The private API was recently changed by HADOOP-14104, which broke Hive compilation. It'd be better to instead use publicly supported APIs. HDFS-11687 has been filed to add whatever encryption APIs are needed by Hive. This JIRA is to move Hive over to these new APIs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15794) Support get hdfsEncryptionShim if FileSystem is ViewFileSystem
Yuming Wang created HIVE-15794: -- Summary: Support get hdfsEncryptionShim if FileSystem is ViewFileSystem Key: HIVE-15794 URL: https://issues.apache.org/jira/browse/HIVE-15794 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 1.1.0, 1.2.0, 2.2.0 Reporter: Yuming Wang Assignee: Yuming Wang *SQL*: {code:sql} hive> create table table2 as select * from table1; hive> show create table table2; OK CREATE TABLE `table2`( `id` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'viewfs://cluster4/user/hive/warehouse/table2' TBLPROPERTIES ( 'transient_lastDdlTime'='1486050317') {code} *LOG*: {noformat} 2017-02-02T20:12:49,738 INFO [99374b82-e9ca-4654-b803-93b194b9331b main] session.SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem. 2017-02-02T20:12:49,738 INFO [99374b82-e9ca-4654-b803-93b194b9331b main] session.SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem. {noformat} hdfsEncryptionShim cannot be obtained when the {{FileSystem}} is a [ViewFileSystem|http://hadoop.apache.org/docs/r2.6.5/hadoop-project-dist/hadoop-hdfs/ViewFs.html]; we should support it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15379) Get the key of hive.metastore.* values should be consistent with Hive Metastore Server.
Yuming Wang created HIVE-15379: -- Summary: Get the key of hive.metastore.* values should be consistent with Hive Metastore Server. Key: HIVE-15379 URL: https://issues.apache.org/jira/browse/HIVE-15379 Project: Hive Issue Type: Bug Components: Beeline, CLI Affects Versions: 1.1.0 Reporter: Yuming Wang Priority: Minor When using Cloudera Manager, the Hive Metastore Server is configured with {{hive.metastore.try.direct.sql=false}}, but the CLI and Beeline read the client configuration and return true, so the value they report is meaningless. {code} hive> set hive.metastore.try.direct.sql; hive.metastore.try.direct.sql=true hive> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14918) Function concat_ws get a wrong value
Xiaowei Wang created HIVE-14918: --- Summary: Function concat_ws get a wrong value Key: HIVE-14918 URL: https://issues.apache.org/jira/browse/HIVE-14918 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 2.0.1, 2.1.0, 2.0.0, 1.1.1 Reporter: Xiaowei Wang Assignee: Xiaowei Wang Priority: Critical Fix For: 2.1.0 FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86; SELECT concat_ws('.',NULL) FROM dest1 ; The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
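The behaviour the reporter asks for can be sketched as follows. This is only an illustration of the proposal, not Hive's `concat_ws` implementation: the class `ConcatWsSketch` is hypothetical, Java `null` stands in for SQL NULL, and returning `null` for a `null` separator is an assumption made here for completeness.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical sketch of the proposed semantics; not Hive's concat_ws UDF. */
final class ConcatWsSketch {
    /**
     * Joins the non-null values with sep. Returns null when sep is null (assumption)
     * or when no non-null value remains, which is the change the reporter proposes.
     */
    static String concatWs(String sep, String... values) {
        if (sep == null || values == null) {
            return null;
        }
        List<String> kept = Arrays.stream(values)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
        return kept.isEmpty() ? null : String.join(sep, kept);
    }

    public static void main(String[] args) {
        System.out.println(concatWs(".", (String) null)); // proposed: NULL instead of ""
        System.out.println(concatWs(".", "a", null, "b"));
    }
}
```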
[jira] [Created] (HIVE-14112) Join a HBase mapped big table shouldn't convert to MapJoin
Yuming Wang created HIVE-14112: -- Summary: Join a HBase mapped big table shouldn't convert to MapJoin Key: HIVE-14112 URL: https://issues.apache.org/jira/browse/HIVE-14112 Project: Hive Issue Type: Bug Components: StorageHandler Affects Versions: 1.1.0, 1.2.0 Reporter: Yuming Wang Assignee: Yuming Wang Priority: Minor There are two tables; _hbasetable_risk_control_defense_idx_uid_ is an HBase-mapped table: {noformat} [root@dev01 ~]# hadoop fs -du -s -h /hbase/data/tandem/hbase-table-risk-control-defense-idx-uid 3.0 G 9.0 G /hbase/data/tandem/hbase-table-risk-control-defense-idx-uid [root@dev01 ~]# hadoop fs -du -s -h /user/hive/warehouse/openapi_invoke_base 6.6 G 19.7 G /user/hive/warehouse/openapi_invoke_base {noformat} The smaller table is 3.0 G, which is greater than both _hive.mapjoin.smalltable.filesize_ and _hive.auto.convert.join.noconditionaltask.size_. When joining these tables, Hive still automatically converts the join to a map join: {noformat} hive> select count(*) from hbasetable_risk_control_defense_idx_uid t1 join openapi_invoke_base t2 on (t1.key=t2.merchantid); Query ID = root_2016062809_9f9d3f25-857b-412c-8a75-3d9228bd5ee5 Total jobs = 1 Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0 Execution log at: /tmp/root/root_2016062809_9f9d3f25-857b-412c-8a75-3d9228bd5ee5.log 2016-06-28 09:22:10 Starting to launch local task to process map join; maximum memory = 1908932608 {noformat} The root cause is that Hive uses _/user/hive/warehouse/hbasetable_risk_control_defense_idx_uid_ as the table's location, but that directory is empty, so Hive automatically converts the join to a map join. My suggestion is to set the correct location when mapping an HBase table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
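The root cause described above can be illustrated with a small sketch: if the size estimate is taken from an empty directory, any size-based threshold check passes. The class, the method and the 25,000,000-byte threshold are assumptions chosen for illustration; this is not Hive's planner code.

```java
/** Hypothetical sketch of a size-based map-join decision; not Hive's planner code. */
final class MapJoinDecisionSketch {
    /** A table qualifies as the small side when its estimated size is within the threshold. */
    static boolean convertsToMapJoin(long estimatedSmallTableBytes, long thresholdBytes) {
        return estimatedSmallTableBytes <= thresholdBytes;
    }

    public static void main(String[] args) {
        long threshold = 25_000_000L;                      // assumed threshold, for illustration only
        long sizeFromEmptyWarehouseDir = 0L;               // what an empty location reports
        long sizeFromHBaseDir = 3L * 1024 * 1024 * 1024;   // the 3.0 G reported by "hadoop fs -du"

        System.out.println("estimate from empty warehouse dir: "
                + convertsToMapJoin(sizeFromEmptyWarehouseDir, threshold));
        System.out.println("estimate from HBase data dir: "
                + convertsToMapJoin(sizeFromHBaseDir, threshold));
    }
}
```

Under these assumptions, only the estimate taken from the real HBase data directory keeps the 3.0 G table out of a map join, which matches the suggestion to set the correct location.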
[jira] [Created] (HIVE-13293) Query occurs performance degradation after enabling parallel order by for Hive on sprak
Lifeng Wang created HIVE-13293: -- Summary: Query occurs performance degradation after enabling parallel order by for Hive on sprak Key: HIVE-13293 URL: https://issues.apache.org/jira/browse/HIVE-13293 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Lifeng Wang Assignee: Xuefu Zhang I used TPCx-BB to run some performance tests on the Hive on Spark engine and found that query 10 shows performance degradation when parallel order by is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12874) dynamic partition insert project wrong column
bin wang created HIVE-12874: --- Summary: dynamic partition insert project wrong column Key: HIVE-12874 URL: https://issues.apache.org/jira/browse/HIVE-12874 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.1 Environment: hive 1.1.0-cdh5.4.8 Reporter: bin wang Assignee: Alan Gates We have two tables as below: create table test ( id bigint comment ' id', ) PARTITIONED BY(etl_dt string) STORED AS ORC; create table test1 ( id bigint start_time int, ) PARTITIONED BY(etl_dt string) STORED AS ORC; We use SQL like the following to import rows from test1 into test: insert overwrite table test PARTITION(etl_dt) select id ,from_unixtime(start_time,'-MM-dd') as etl_dt from test1 where test1.etl_dt='2016-01-12'; But it behaves incorrectly: it uses test1.etl_dt as the partition value of test, not the 'etl_dt' in the select list. We think it's a bug; can anyone fix it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12652) SymbolicTextInputFormat should supports the path with regex ,especially using CombineHiveInputFormat .Add test sql .
Xiaowei Wang created HIVE-12652: --- Summary: SymbolicTextInputFormat should supports the path with regex ,especially using CombineHiveInputFormat .Add test sql . Key: HIVE-12652 URL: https://issues.apache.org/jira/browse/HIVE-12652 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaowei Wang Assignee: Xiaowei Wang Fix For: 1.2.1 1. In fact, SymbolicTextInputFormat supports paths containing a regex. I added some test SQL. 2. However, when CombineHiveInputFormat is used to merge small files, it cannot resolve a path containing a regex, so it returns a wrong result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12541) Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it will get a wrong result
Xiaowei Wang created HIVE-12541: --- Summary: Using CombineHiveInputFormat with the origin inputformat SymbolicTextInputFormat ,it will get a wrong result Key: HIVE-12541 URL: https://issues.apache.org/jira/browse/HIVE-12541 Project: Hive Issue Type: Bug Affects Versions: 1.2.1, 1.2.0, 0.14.0 Reporter: Xiaowei Wang Assignee: Xiaowei Wang Table description: {noformat} CREATE External TABLE `symlink_text_input_format`( `key` string, `value` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'viewfs://nsX/user/hive/warehouse/symlink_text_input_format' {noformat} There is a link file in the directory '/user/hive/warehouse/symlink_text_input_format'. The content of the link file is: {noformat} "viewfs://nsx/tmp/symlink* " {noformat} It contains one path, and that path contains a regex. Execute the SQL: {noformat} set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; set mapred.min.split.size.per.rack= 0 ; set mapred.min.split.size.per.node= 0 ; set mapred.max.split.size= 0 ; select count(*) from symlink_text_input_format ; {noformat} It returns a wrong result: 0. I have also added a test case in the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
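The following sketch shows what resolving a wildcard entry such as `symlink*` means: the pattern has to be expanded into the concrete files it matches before any of them can be read. It uses `java.nio` globbing on a local temporary directory purely as an illustration; the class and file names are hypothetical, and Hive itself works against Hadoop file systems rather than `java.nio`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of expanding a wildcard path; not Hive's input format code. */
final class SymlinkGlobSketch {
    /** Expands a glob pattern such as "symlink*" into the files of dir that match it. */
    static List<Path> expand(Path dir, String pattern) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, pattern)) {
            for (Path p : stream) {
                matches.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(matches);
        return matches;
    }

    /** Creates three sample files in a temp directory and counts those matching "symlink*". */
    static int demoMatchCount() {
        try {
            Path dir = Files.createTempDirectory("symlink-glob");
            Files.createFile(dir.resolve("symlink1"));
            Files.createFile(dir.resolve("symlink2"));
            Files.createFile(dir.resolve("other"));
            return expand(dir, "symlink*").size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoMatchCount() + " file(s) match symlink*");
    }
}
```

A reader that skips this expansion and treats the pattern as a literal file name finds no input, which would be consistent with the count of 0 reported above.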
[jira] [Created] (HIVE-12482) When execution.engine=tez,set mapreduce.job.name does not work.
Xiaowei Wang created HIVE-12482: --- Summary: When execution.engine=tez,set mapreduce.job.name does not work. Key: HIVE-12482 URL: https://issues.apache.org/jira/browse/HIVE-12482 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 1.2.1, 1.0.1, 1.0.0, 0.14.0 Reporter: Xiaowei Wang Fix For: 0.14.1 When execution.engine=tez, setting mapreduce.job.name does not work. In Tez mode, the default job name is "Hive_"+Sessionid, for example HIVE-ce5784d0-320c-4fb9-8b0b-2d92539dfd9e. It is difficult to tell jobs apart when there are many of them. A better way would be to set mapreduce.job.name, but setting mapreduce.job.name does not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12303) HCatRecordSerDe throw a IndexOutOfBoundsException
Xiaowei Wang created HIVE-12303: --- Summary: HCatRecordSerDe throw a IndexOutOfBoundsException Key: HIVE-12303 URL: https://issues.apache.org/jira/browse/HIVE-12303 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.1, 0.14.0 Reporter: Xiaowei Wang Assignee: Sushanth Sowmyan Fix For: 1.2.1 When accessing a Hive table through HCatalog in Pig, it sometimes throws an exception. Exception: {noformat} 2015-10-30 06:44:35,219 WARN [Thread-4] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:59) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.IndexOutOfBoundsException: Index: 24, Size: 24 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeStruct(HCatRecordSerDe.java:175) at 
org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeList(HCatRecordSerDe.java:244) at org.apache.hive.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:196) at org.apache.hive.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hive.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:204) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].
Lifeng Wang created HIVE-12229: -- Summary: Custom script in query cannot be executed in yarn-cluster mode [Spark Branch]. Key: HIVE-12229 URL: https://issues.apache.org/jira/browse/HIVE-12229 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 1.1.0 Reporter: Lifeng Wang Added one python script in the query and the python script cannot be found during execution in yarn-cluster mode. {noformat} 15/10/21 21:10:55 INFO exec.ScriptOperator: Executing [/usr/bin/python, q2-sessionize.py, 3600] 15/10/21 21:10:55 INFO exec.ScriptOperator: tablename=null 15/10/21 21:10:55 INFO exec.ScriptOperator: partname=null 15/10/21 21:10:55 INFO exec.ScriptOperator: alias=null 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 10 rows: used memory = 324896224 15/10/21 21:10:55 INFO exec.ScriptOperator: ErrorStreamProcessor calling reporter.progress() /usr/bin/python: can't open file 'q2-sessionize.py': [Errno 2] No such file or directory 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread OutputProcessor done 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread ErrorProcessor done 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 100 rows: used memory = 325619920 15/10/21 21:10:55 ERROR exec.ScriptOperator: Error in writing to script: Stream closed 15/10/21 21:10:55 INFO exec.ScriptOperator: The script did not consume all input data. This is considered as an error. 15/10/21 21:10:55 INFO exec.ScriptOperator: set hive.exec.script.allow.partial.consumption=true; to ignore it. 
15/10/21 21:10:55 ERROR spark.SparkReduceRecordHandler: Fatal error: org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row (tag=0) {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}} org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row (tag=0) {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}} at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:340) at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:88) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: An error occurred while reading or writing to your custom script. It may have crashed with an error. 
at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:453) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:331) ... 14 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11126) multiple insert fails when select with group by clause
Guodong Wang created HIVE-11126: --- Summary: multiple insert fails when select with group by clause Key: HIVE-11126 URL: https://issues.apache.org/jira/browse/HIVE-11126 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.12.0 Reporter: Guodong Wang When the select statement contains group by clause, multiple insert fails. Here is the sample sql. {code} from test_src_table insert overwrite table test_target_table partition(p) select src_id as id, lala as p group by src_id insert overwrite table test_target_table partition(p) select id, p from select src_id as id, papa as p group by src_id {code} The exception is like this {code} java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {_col0:1107625...@qq.com,_col1:lala} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapred.Child.main(Child.java:262) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {_col0:1107625...@qq.com,_col1:lala} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(Ex FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11095) SerDeUtils another bug ,when Text is reused
xiaowei wang created HIVE-11095: --- Summary: SerDeUtils another bug ,when Text is reused Key: HIVE-11095 URL: https://issues.apache.org/jira/browse/HIVE-11095 Project: Hive Issue Type: Bug Components: API, CLI Affects Versions: 1.2.0, 1.0.0, 0.14.0 Environment: Hadoop 2.3.0-cdh5.0.0 Hive 0.14 Reporter: xiaowei wang Assignee: xiaowei wang Priority: Critical Fix For: 1.2.0

The method transformTextFromUTF8 has a bug. When I query data from an LZO table, I find that in the results the current row is always longer than the previous row, and sometimes the current row contains the contents of the previous row.

For example, I execute the SQL select * from web_searchhub where logdate=2015061003, and the result is shown below. Notice that the second row contains content from the first row.

INFO [03:00:05.589] HttpFrontServer::FrontSH msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 session=901,thread=223ession=3151,thread=254 2015061003

The content of the original LZO file is shown below; it has just 2 rows.

INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285

I think this error is caused by Text reuse, and I have found a solution.

Additionally, the table creation SQL is:

CREATE EXTERNAL TABLE `web_searchhub`( `line` string)
PARTITIONED BY ( `logdate` string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' U'
WITH SERDEPROPERTIES ( 'serialization.encoding'='GBK')
STORED AS INPUTFORMAT com.hadoop.mapred.DeprecatedLzoTextInputFormat
OUTPUTFORMAT org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
LOCATION 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
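The symptom reported in HIVE-11095 (a shorter row carrying the tail of a longer previous row) is what one would expect when a reused byte container is decoded over its whole backing array instead of only its valid length. The following is a minimal sketch of that failure mode, not Hive's actual code: `ReusableBuffer` is a hypothetical stand-in that only mimics the reuse behaviour of a container such as Hadoop's `Text`, and the two `decode...` helpers are illustrative names.

```java
import java.nio.charset.Charset;

class TextReuseSketch {
    /** Hypothetical stand-in for a reusable byte container (an assumption, not a Hadoop class). */
    static final class ReusableBuffer {
        private byte[] bytes = new byte[0];
        private int length = 0;

        /** Copies src into the backing array; the array grows but is never shrunk or cleared. */
        void set(byte[] src) {
            if (src.length > bytes.length) {
                bytes = new byte[src.length];
            }
            System.arraycopy(src, 0, bytes, 0, src.length);
            length = src.length;
        }

        /** Returns the whole backing array, which may hold stale bytes past getLength(). */
        byte[] getBytes() { return bytes; }

        /** Number of valid bytes at the start of the backing array. */
        int getLength() { return length; }
    }

    /** Buggy decode: reads the entire backing array, including leftovers from earlier rows. */
    static String decodeIgnoringLength(ReusableBuffer b, Charset cs) {
        return new String(b.getBytes(), cs);
    }

    /** Safer decode: reads only the first getLength() bytes. */
    static String decodeWithLength(ReusableBuffer b, Charset cs) {
        return new String(b.getBytes(), 0, b.getLength(), cs);
    }
}
```

Setting a long row and then a short row on the same buffer makes the first helper return the short row followed by the tail of the long one, while the second helper returns only the short row.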
[jira] [Created] (HIVE-10983) Lazysimpleserde bug when Text is reused
xiaowei wang created HIVE-10983: --- Summary: Lazysimpleserde bug when Text is reused Key: HIVE-10983 URL: https://issues.apache.org/jira/browse/HIVE-10983 Project: Hive Issue Type: Bug Components: API Affects Versions: 0.14.0 Environment: Hadoop 2.3.0-cdh5.0.0 Hive 0.14 Reporter: xiaowei wang Assignee: xiaowei wang Priority: Critical

When I query data from an LZO table, I find that in the results the current row is always longer than the previous row, and sometimes the current row contains the contents of the previous row.

For example, I execute the SQL select * from web_searchhub where logdate=2015061003, and the result is shown below. Notice that the second row contains content from the first row.

INFO [03:00:05.589] HttpFrontServer::FrontSH msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 session=901,thread=223ession=3151,thread=254 2015061003

The content of the original LZO file is shown below; it has just 2 rows.

INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb session=3148,thread=285
INFO [03:00:05.635] HttpFrontServer::FrontSH msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285

I think this error is caused by Text reuse, and I have found a solution.

Additionally, the table creation SQL is:

CREATE EXTERNAL TABLE `web_searchhub`( `line` string)
PARTITIONED BY ( `logdate` string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\U'
WITH SERDEPROPERTIES ( 'serialization.encoding'='GBK')
STORED AS INPUTFORMAT com.hadoop.mapred.DeprecatedLzoTextInputFormat
OUTPUTFORMAT org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
LOCATION 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' ;

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10790) orc file sql excute fail
xiaowei wang created HIVE-10790: --- Summary: orc file sql excute fail Key: HIVE-10790 URL: https://issues.apache.org/jira/browse/HIVE-10790 Project: Hive Issue Type: Bug Components: API Affects Versions: 0.14.0, 0.13.0 Environment: Hadoop 2.5.0-cdh5.3.2 hive 0.14 Reporter: xiaowei wang Assignee: xiaowei wang

Inserting from a text table into an ORC table, for example

insert overwrite table custom.rank_less_orc_none partition(logdate='2015051500') select ur,rf,it,dt from custom.rank_text where logdate='2015051500';

throws the following error:

Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.fs.viewfs.NotInMountpointException: getDefaultReplication on empty path is invalid
at org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:593)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.getStream(WriterImpl.java:1750)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1767)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:105)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:164)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:842)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
... 8 more

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10237) create external table, location path contains space ,like '/user/hive/warehouse/custom.db/uigs_kmap '
xiaowei wang created HIVE-10237: --- Summary: create external table, location path contains space ,like '/user/hive/warehouse/custom.db/uigs_kmap ' Key: HIVE-10237 URL: https://issues.apache.org/jira/browse/HIVE-10237 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.1 Environment: Hadoop 2.3.0-cdh5.0.0 hive 0.13.1 Reporter: xiaowei wang

When creating an external table with an explicit location, I mistakenly wrote a location path that ends with a space: '/user/hive/warehouse/custom.db/uigs_kmap '. I expected Hive to trim the trailing space from the location, but it does not.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10176) skip.header.line.count causes values to be skipped when performing insert values
Wenbo Wang created HIVE-10176: - Summary: skip.header.line.count causes values to be skipped when performing insert values Key: HIVE-10176 URL: https://issues.apache.org/jira/browse/HIVE-10176 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Wenbo Wang

When inserting values into a table with TBLPROPERTIES ("skip.header.line.count"="1"), the first value listed is also skipped.

create table test (row int, name string) TBLPROPERTIES ("skip.header.line.count"="1");
load data local inpath '/root/data' into table test;
insert into table test values (1, 'a'), (2, 'b'), (3, 'c');

(1, 'a') isn't inserted into the table.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
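The report does not state the cause. One plausible explanation, offered here only as an assumption, is that the header skip is applied to every file under the table rather than to the table as a whole, so a file written by INSERT ... VALUES (which has no header line) loses its first data row. The sketch below models that behaviour with plain lists; `readTable` and its parameters are hypothetical names, not Hive APIs.

```java
import java.util.ArrayList;
import java.util.List;

class SkipHeaderSketch {
    /**
     * Reads all rows of a table stored as several files, skipping the first
     * skipHeaderLines lines of EACH file (the assumed behaviour, not confirmed in the report).
     */
    static List<String> readTable(List<List<String>> files, int skipHeaderLines) {
        List<String> rows = new ArrayList<>();
        for (List<String> file : files) {
            // Every file is treated as if it started with a header.
            for (int i = skipHeaderLines; i < file.size(); i++) {
                rows.add(file.get(i));
            }
        }
        return rows;
    }
}
```

With one loaded file that really has a header and one file holding the three inserted rows, reading with a skip count of 1 drops the header of the first file and the row `1,a` of the second, which matches the reported symptom.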
[jira] [Created] (HIVE-9940) The standard output of Python reduce script can not be interpreted correctly by Hive
Eric Wang created HIVE-9940: --- Summary: The standard output of Python reduce script can not be interpreted correctly by Hive Key: HIVE-9940 URL: https://issues.apache.org/jira/browse/HIVE-9940 Project: Hive Issue Type: Bug Components: Hive Reporter: Eric Wang

Use an HQL statement like:

FROM ( select_statement ) map_output INSERT OVERWRITE TABLE table REDUCE map_output.a, map_output.b USING 'py_script' AS col1, col2;

(1) original type

stdout of Python has Records where the 2nd column = 'Meerjungfrau'
527500 Meerjungfrau25 AO DE 20140704 ...
Hive interprets these as:
527500 Meernull AO DE 20140704 ...
stderr_log interprets these as:
527500 Meerjungfrau25 AO DE 20140704

(2) change all 'Meerjungfrau' to 'bug' in Python script

stdout of Python has Records where the 2nd column = 'bug'
527500 bug 25 AO DE 20140704 ...
Hive interprets these as:
527500 b null AO DE 20140704 ...
stderr_log interprets these as:
527500 bug 25 AO DE 20140704

(3) put 2nd column to the last column

stdout of Python has Records where the 2nd column = 'Meerjungfrau'
527500 25 AO DE 20140704Meerjungfrau ...
Hive interprets these as:
527500 25 null 20140704Meerjungfrau ...
stderr_log interprets these as:
527500 25 AO DE 20140704Meerjungfrau

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8119) Implement Date in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294832#comment-14294832 ] Adrian Wang commented on HIVE-8119: --- Is there any update? Implement Date in ParquetSerde -- Key: HIVE-8119 URL: https://issues.apache.org/jira/browse/HIVE-8119 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Mohit Sabharwal Date type in Parquet is discussed here: http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8439) query processor fails to handle multiple insert clauses for the same table
[ https://issues.apache.org/jira/browse/HIVE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gordon Wang updated HIVE-8439: -- Summary: query processor fails to handle multiple insert clauses for the same table (was: multiple insert into the same table)

query processor fails to handle multiple insert clauses for the same table -- Key: HIVE-8439 URL: https://issues.apache.org/jira/browse/HIVE-8439 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0 Reporter: Gordon Wang

When putting multiple inserts for the same table in one SQL statement, the Hive query plan analyzer fails to synthesize the right plan. Here are the steps to reproduce.

{noformat}
create table T1(i int, j int);
create table T2(m int) partitioned by (n int);
explain from T1
insert into table T2 partition (n = 1) select T1.i where T1.j = 1
insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ;
{noformat}

When there is an insert into clause in the multiple insert part, the insert overwrite is treated as insert into. I dug into the source code, and it looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here are my findings.

1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer puts the table name into a set that contains all the insert into table names.
2. When generating the file sink plan, the analyzer checks whether the table name is in the set; if it is, the replace flag is set to false.

Here is the code snippet.

{noformat}
// Create the work for moving the table
// NOTE: specify Dynamic partitions in dest_tab for WriteEntity
if (!isNonNativeTable) {
  ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()), table_desc, dpCtx);
  ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(), dest_tab.getTableName()));
  ltd.setLbCtx(lbCtx);
  if (holdDDLTime) {
    LOG.info("this query will not update transient_lastDdlTime!");
    ltd.setHoldDDLTime(true);
  }
  loadTableWork.add(ltd);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8439) multiple insert into the same table
[ https://issues.apache.org/jira/browse/HIVE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gordon Wang updated HIVE-8439: -- Description: when putting multiple inserts for the same table in one SQL, hive query plan analyzer fails to synthesis the right plan. Here is the reproduce steps. {noformat} create table T1(i int, j int); create table T2(m int) partitioned by (n int); explain from T1 insert into table T2 partition (n = 1) select T1.i where T1.j = 1 insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ; {noformat} When there is a insert into clause in the multiple insert part, the insert overwrite is considered as insert into. I dig into the source code, looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here is my finding. 1. in semantic analyzer, when processing TOK_INSERT_INTO, the analyzer will put the table name into a set which contains all the insert into table names. 2. when generating file sink plan, the analyzer will check if the table name is in the set, if in the set, the replace flag is set to false. Here is the code snippet. {noformat} // Create the work for moving the table // NOTE: specify Dynamic partitions in dest_tab for WriteEntity if (!isNonNativeTable) { ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()), table_desc, dpCtx); ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(), dest_tab.getTableName())); ltd.setLbCtx(lbCtx); if (holdDDLTime) { LOG.info(this query will not update transient_lastDdlTime!); ltd.setHoldDDLTime(true); } loadTableWork.add(ltd); } {noformat} was: when putting multiple inserts for the same table in one SQL, hive query plan analyzer fails to synthesis the right plan. Here is the reproduce steps. 
{noformat} create table T1(i int, j int); create table T2(m int) partitioned by (n int); explain from T1 insert into table T2 partition (n = 1) select T1.i where T1.j = 1 insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ; {noformat} When there is a insert into clause in the multiple insert part, the insert overwrite is considered as insert into. I dig into the source code, looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here is my finding. 1. in semantic analyzer, when processing TOK_INSERT_INTO, the analyzer will put the table name into a set which contains all the insert into table names. 2. when generate file sink plan, the analyzer will check if the table name is in the set, if in the set, the replace flag is set to false. Here is the code snippet. {noformat} // Create the work for moving the table // NOTE: specify Dynamic partitions in dest_tab for WriteEntity if (!isNonNativeTable) { ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()), table_desc, dpCtx); ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(), dest_tab.getTableName())); ltd.setLbCtx(lbCtx); if (holdDDLTime) { LOG.info(this query will not update transient_lastDdlTime!); ltd.setHoldDDLTime(true); } loadTableWork.add(ltd); } {noformat} multiple insert into the same table --- Key: HIVE-8439 URL: https://issues.apache.org/jira/browse/HIVE-8439 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0 Reporter: Gordon Wang when putting multiple inserts for the same table in one SQL, hive query plan analyzer fails to synthesis the right plan. Here is the reproduce steps. 
{noformat} create table T1(i int, j int); create table T2(m int) partitioned by (n int); explain from T1 insert into table T2 partition (n = 1) select T1.i where T1.j = 1 insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ; {noformat} When there is a insert into clause in the multiple insert part, the insert overwrite is considered as insert into. I dig into the source code, looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here is my finding. 1. in semantic analyzer, when processing TOK_INSERT_INTO, the analyzer will put the table name into a set which contains all the insert into table names. 2. when generating file sink plan, the analyzer will check if the table name is in the set, if in the set, the replace flag is set to false. Here is the code snippet. {noformat} // Create the work for moving the table // NOTE: specify Dynamic
[jira] [Commented] (HIVE-8439) multiple insert into the same table
[ https://issues.apache.org/jira/browse/HIVE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184938#comment-14184938 ]
Gordon Wang commented on HIVE-8439: ---

Currently, the Hive semantic analyzer cannot handle multiple insert clauses correctly. When INSERT INTO and INSERT OVERWRITE are mixed for the same table, the semantic analyzer is not aware of which clause is OVERWRITE. More information about the overwrite clause should be recorded in QueryBlock.

multiple insert into the same table --- Key: HIVE-8439 URL: https://issues.apache.org/jira/browse/HIVE-8439 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0 Reporter: Gordon Wang

When putting multiple inserts for the same table in one SQL statement, the Hive query plan analyzer fails to synthesize the right plan. Here are the steps to reproduce.

{noformat}
create table T1(i int, j int);
create table T2(m int) partitioned by (n int);
explain from T1
insert into table T2 partition (n = 1) select T1.i where T1.j = 1
insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ;
{noformat}

When there is an insert into clause in the multiple insert part, the insert overwrite is treated as insert into. I dug into the source code, and it looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here are my findings.

1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer puts the table name into a set that contains all the insert into table names.
2. When generating the file sink plan, the analyzer checks whether the table name is in the set; if it is, the replace flag is set to false.

Here is the code snippet.

{noformat}
// Create the work for moving the table
// NOTE: specify Dynamic partitions in dest_tab for WriteEntity
if (!isNonNativeTable) {
  ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()), table_desc, dpCtx);
  ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(), dest_tab.getTableName()));
  ltd.setLbCtx(lbCtx);
  if (holdDDLTime) {
    LOG.info("this query will not update transient_lastDdlTime!");
    ltd.setHoldDDLTime(true);
  }
  loadTableWork.add(ltd);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
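The two findings in the HIVE-8439 description can be illustrated with a small model. This is a simplified sketch, not Hive's SemanticAnalyzer: `Clause`, `Mode`, and both `replaceFlags...` methods are hypothetical names. The first method mimics the described behaviour (one set of insert-into table names, consulted per table); the second shows the alternative hinted at in the comment, where the overwrite information is kept per clause.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InsertModeSketch {
    enum Mode { INTO, OVERWRITE }

    /** One insert clause of a multi-insert statement (hypothetical model type). */
    static final class Clause {
        final String table;
        final String partition;
        final Mode mode;

        Clause(String table, String partition, Mode mode) {
            this.table = table;
            this.partition = partition;
            this.mode = mode;
        }
    }

    /** Described behaviour: replace = "table name is not in the insert-into set". */
    static List<Boolean> replaceFlagsPerTable(List<Clause> clauses) {
        Set<String> insertIntoTables = new HashSet<>();
        for (Clause c : clauses) {
            if (c.mode == Mode.INTO) {
                insertIntoTables.add(c.table);
            }
        }
        List<Boolean> flags = new ArrayList<>();
        for (Clause c : clauses) {
            // Any INSERT INTO on the table turns off replace for every clause on that table.
            flags.add(!insertIntoTables.contains(c.table));
        }
        return flags;
    }

    /** Alternative: decide replace from the mode recorded on each clause. */
    static List<Boolean> replaceFlagsPerClause(List<Clause> clauses) {
        List<Boolean> flags = new ArrayList<>();
        for (Clause c : clauses) {
            flags.add(c.mode == Mode.OVERWRITE);
        }
        return flags;
    }
}
```

For the reproduce case (INSERT INTO T2 partition n = 1, INSERT OVERWRITE T2 partition n = 2), the per-table variant yields replace = false for both clauses, while the per-clause variant yields false and true.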
[jira] [Commented] (HIVE-8532) return code of source xxx clause is missing
[ https://issues.apache.org/jira/browse/HIVE-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182375#comment-14182375 ]
Gordon Wang commented on HIVE-8532: ---

It looks like the UT failure is not caused by this patch. The failing UT is not in the changed code path.

return code of source xxx clause is missing - Key: HIVE-8532 URL: https://issues.apache.org/jira/browse/HIVE-8532 Project: Hive Issue Type: Bug Components: Clients Affects Versions: 0.12.0, 0.13.1 Reporter: Gordon Wang Attachments: HIVE-8532.patch

When executing a source hql-file clause, the Hive client driver does not catch the return code of the command. This behaviour causes an issue when running a Hive query in an Oozie workflow. When the source clause is put into an Oozie workflow, Oozie cannot get the return code of the command, so Oozie always considers the source clause successful. As a result, when the source clause fails, the Hive query does not abort and the Oozie workflow does not abort either.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8532) return code of source xxx clause is missing
Gordon Wang created HIVE-8532: - Summary: return code of source xxx clause is missing Key: HIVE-8532 URL: https://issues.apache.org/jira/browse/HIVE-8532 Project: Hive Issue Type: Bug Components: Clients Affects Versions: 0.13.1, 0.12.0 Reporter: Gordon Wang

When executing a source hql-file clause, the Hive client driver does not catch the return code of the command. This behaviour causes an issue when running a Hive query in an Oozie workflow. When the source clause is put into an Oozie workflow, Oozie cannot get the return code of the command, so Oozie always considers the source clause successful. As a result, when the source clause fails, the Hive query does not abort and the Oozie workflow does not abort either.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
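The difference between dropping and propagating the return code of a sourced file can be sketched as follows. This is an illustrative model only, not the Hive CLI driver or the attached patch: `SourceStatusSketch`, both method names, and the `runner` callback (which returns 0 on success and a non-zero code on failure) are assumptions made for the example.

```java
import java.util.List;
import java.util.function.ToIntFunction;

class SourceStatusSketch {
    /** Models the reported behaviour: statements run, but their status is discarded. */
    static int sourceIgnoringStatus(List<String> statements, ToIntFunction<String> runner) {
        for (String statement : statements) {
            runner.applyAsInt(statement); // return code silently dropped
        }
        return 0; // caller always sees success
    }

    /** Models the expected behaviour: stop at the first failure and report its code. */
    static int sourcePropagatingStatus(List<String> statements, ToIntFunction<String> runner) {
        for (String statement : statements) {
            int rc = runner.applyAsInt(statement);
            if (rc != 0) {
                return rc; // a scheduler such as Oozie can now observe the failure
            }
        }
        return 0;
    }
}
```

A caller that maps the returned value to the process exit status would let a workflow engine detect the failed source clause and abort.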
[jira] [Commented] (HIVE-8532) return code of source xxx clause is missing
[ https://issues.apache.org/jira/browse/HIVE-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177832#comment-14177832 ]
Gordon Wang commented on HIVE-8532: ---

The fix is easy; I think a patch will come soon.

return code of source xxx clause is missing - Key: HIVE-8532 URL: https://issues.apache.org/jira/browse/HIVE-8532 Project: Hive Issue Type: Bug Components: Clients Affects Versions: 0.12.0, 0.13.1 Reporter: Gordon Wang

When executing a source hql-file clause, the Hive client driver does not catch the return code of the command. This behaviour causes an issue when running a Hive query in an Oozie workflow. When the source clause is put into an Oozie workflow, Oozie cannot get the return code of the command, so Oozie always considers the source clause successful. As a result, when the source clause fails, the Hive query does not abort and the Oozie workflow does not abort either.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8439) multiple insert into the same table
Gordon Wang created HIVE-8439: - Summary: multiple insert into the same table Key: HIVE-8439 URL: https://issues.apache.org/jira/browse/HIVE-8439 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0, 0.12.0 Reporter: Gordon Wang

When putting multiple inserts for the same table in one SQL statement, the Hive query plan analyzer fails to synthesize the right plan. Here are the steps to reproduce.

{noformat}
create table T1(i int, j int);
create table T2(m int) partitioned by (n int);
explain from T1
insert into table T2 partition (n = 1) select T1.i where T1.j = 1
insert overwrite table T2 partition (n = 2) select T1.i where T1.j = 2 ;
{noformat}

When there is an insert into clause in the multiple insert part, the insert overwrite is treated as insert into. I dug into the source code, and it looks like Hive does not support mixing insert into and insert overwrite for the same table in multiple insert clauses. Here are my findings.

1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer puts the table name into a set that contains all the insert into table names.
2. When generating the file sink plan, the analyzer checks whether the table name is in the set; if it is, the replace flag is set to false.

Here is the code snippet.

{noformat}
// Create the work for moving the table
// NOTE: specify Dynamic partitions in dest_tab for WriteEntity
if (!isNonNativeTable) {
  ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()), table_desc, dpCtx);
  ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(), dest_tab.getTableName()));
  ltd.setLbCtx(lbCtx);
  if (holdDDLTime) {
    LOG.info("this query will not update transient_lastDdlTime!");
    ltd.setHoldDDLTime(true);
  }
  loadTableWork.add(ltd);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xinyu Wang updated HIVE-2777: - Attachment: (was: hive-2777.patch)

ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch

Hive should have the ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking running jobs. It also allows an admin to merge several partitions into one. Essentially, we would like to have an API:

add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData);

This jira covers the changes required for metastore and thrift.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xinyu Wang updated HIVE-2777: - Attachment: hive-2777-updated.patch

Updated the hive-2777 patch, which fixes all the testPartition() tests. The other failed tests also fail in branch-0.13, so please help rerun the tests. Thank you!

ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch, hive-2777-updated.patch

Hive should have the ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking running jobs. It also allows an admin to merge several partitions into one. Essentially, we would like to have an API:

add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData);

This jira covers the changes required for metastore and thrift.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
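The all-or-nothing semantics requested in HIVE-2777 can be illustrated with an in-memory model. This sketch is an assumption-laden simplification and not the metastore patch: it drops the `db`, `tbl_name`, and `deleteData` parameters, represents a partition by its list of values, and uses a single lock where a real metastore would use a database transaction. `AtomicPartitionSwapSketch` and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AtomicPartitionSwapSketch {
    // Each partition is identified by its list of partition values.
    private final Set<List<String>> partitions = new HashSet<>();

    /** Adds and drops partitions as one step: either every change is applied or none is. */
    synchronized void addDropPartitions(List<List<String>> addParts, List<List<String>> dropParts) {
        // Validate everything before mutating, so a failure leaves the state untouched.
        for (List<String> p : dropParts) {
            if (!partitions.contains(p)) {
                throw new IllegalArgumentException("cannot drop missing partition " + p);
            }
        }
        for (List<String> p : addParts) {
            if (partitions.contains(p) && !dropParts.contains(p)) {
                throw new IllegalArgumentException("partition already exists " + p);
            }
        }
        partitions.removeAll(dropParts);
        partitions.addAll(addParts);
    }

    /** Returns a copy of the current partition set. */
    synchronized Set<List<String>> snapshot() {
        return new HashSet<>(partitions);
    }
}
```

Merging two daily partitions into one monthly partition then becomes a single call that adds the merged partition and drops the two originals; a reader taking a snapshot sees either the old pair or the new merged partition, never a mix.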
[jira] [Updated] (HIVE-6179) OOM occurs when query spans to a large number of partitions
[ https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] perry wang updated HIVE-6179: - Description: When executing a query against a large number of partitions, such as select count(*) from table, OOM error may occur because Hive fetches the metadata for all partitions involved and tries to store it in memory. {code} 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2367) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130) at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415) at java.lang.StringBuffer.append(StringBuffer.java:237) at org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown Source) at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) at org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191) at org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410) at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122) at com.sun.proxy.$Proxy7.getPartitions(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) {code} The above error happened when executing select count(*) on a table with 40K partitions. was: When executing a query against a large number of partitions, such as select count(*) from table, OOM error may occur because Hive fetches the metadata for all partitions involved and tries to store it in memory. 
{code} 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2367) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130) at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415) at java.lang.StringBuffer.append(StringBuffer.java:237) at org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown Source) at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown Source) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source) at
[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105301#comment-14105301 ] Lianhui Wang commented on HIVE-7384: I think current Spark already supports hashing by join_col and sorting by {join_col, tag}, because Spark's map-side shuffleWriter hashes by Key.hashcode and sorts by Key, and Hive's HiveKey class already defines the hashcode. So Spark can hash by HiveKey.hashcode and sort by HiveKey's bytes. Research into reduce-side join [Spark Branch] - Key: HIVE-7384 URL: https://issues.apache.org/jira/browse/HIVE-7384 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Szehon Ho Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, sales_products.txt, sales_stores.txt Hive's join operator is very sophisticated, especially for reduce-side join. While we expect that other types of join, such as map-side join and SMB map-side join, will work out of the box with our design, there may be some complications in reduce-side join, which extensively utilizes key tags and shuffle behavior. Our design principle prefers making the Hive implementation work out of the box as well, which might require new functionality from Spark. The task is to research this area, identifying requirements for the Spark community and the work to be done on Hive to make reduce-side join work. A design doc might be needed for this. For more information, please refer to the overall design doc on the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
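The scheme described in this comment (hash on the join column only, then sort each partition by join column and tag) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Spark or Hive code: `TaggedKey`, `partitionFor`, `shuffle`, and `demo` are hypothetical names, and `String.hashCode` stands in for the HiveKey hashcode.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReduceSideJoinShuffleSketch {
    // Hypothetical record: a join key plus a tag naming the source table.
    static final class TaggedKey {
        final String joinCol;
        final int tag;

        TaggedKey(String joinCol, int tag) {
            this.joinCol = joinCol;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return joinCol + "#" + tag;
        }
    }

    // Hash only on the join column, so rows from both tables that share a key
    // land in the same partition regardless of their tag.
    static int partitionFor(TaggedKey k, int numPartitions) {
        return Math.floorMod(k.joinCol.hashCode(), numPartitions);
    }

    // Hash-partition the input, then sort each partition by (joinCol, tag).
    static List<List<TaggedKey>> shuffle(List<TaggedKey> input, int numPartitions) {
        List<List<TaggedKey>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (TaggedKey k : input) {
            parts.get(partitionFor(k, numPartitions)).add(k);
        }
        Comparator<TaggedKey> byKeyThenTag =
            Comparator.comparing((TaggedKey k) -> k.joinCol).thenComparingInt(k -> k.tag);
        for (List<TaggedKey> p : parts) {
            p.sort(byKeyThenTag);
        }
        return parts;
    }

    // Small fixed input: two keys, each present in table 0 and table 1.
    static List<List<TaggedKey>> demo() {
        List<TaggedKey> input = new ArrayList<>();
        input.add(new TaggedKey("b", 1));
        input.add(new TaggedKey("a", 1));
        input.add(new TaggedKey("a", 0));
        input.add(new TaggedKey("b", 0));
        return shuffle(input, 2);
    }

    public static void main(String[] args) {
        List<List<TaggedKey>> parts = demo();
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("partition " + i + ": " + parts.get(i));
        }
    }
}
```

Within each partition, all rows for a key are adjacent and ordered by tag, which is the property a reduce-side join relies on. No total order across partitions is produced.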
[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106343#comment-14106343 ] Lianhui Wang commented on HIVE-7384: @Szehon Ho Yes, I read the OrderedRDDFunctions code and discovered that sortByKey actually does a range partition. We need to replace the range partition with a hash partition, so Spark should perhaps add a new interface, for example partitionSortByKey. @Brock Noland 1) The code means that when data is sampled and there is more than one reducer, Hive does a total order sort. A join does not sample data, so it does not need a total order sort. 2) I think we really need auto-parallelism. I talked about it with Reynold Xin before: Spark needs to support re-partitioning the map output data the same way Tez does. Research into reduce-side join [Spark Branch] - Key: HIVE-7384 URL: https://issues.apache.org/jira/browse/HIVE-7384 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Szehon Ho Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, sales_products.txt, sales_stores.txt Hive's join operator is very sophisticated, especially for reduce-side join. While we expect that other types of join, such as map-side join and SMB map-side join, will work out of the box with our design, there may be some complications in reduce-side join, which extensively utilizes key tags and shuffle behavior. Our design principle prefers making the Hive implementation work out of the box as well, which might require new functionality from Spark. The task is to research this area, identifying requirements for the Spark community and the work to be done on Hive to make reduce-side join work. A design doc might be needed for this. For more information, please refer to the overall design doc on the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106407#comment-14106407 ] Lianhui Wang commented on HIVE-7384: I think the idea is the same as what you described before: like HIVE-7158, it will auto-calculate the number of reducers based on some input from Hive (upper/lower bound). Research into reduce-side join [Spark Branch] - Key: HIVE-7384 URL: https://issues.apache.org/jira/browse/HIVE-7384 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Szehon Ho Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, sales_products.txt, sales_stores.txt Hive's join operator is very sophisticated, especially for reduce-side join. While we expect that other types of join, such as map-side join and SMB map-side join, will work out of the box with our design, there may be some complications in reduce-side join, which extensively utilizes key tags and shuffle behavior. Our design principle prefers making the Hive implementation work out of the box as well, which might require new functionality from Spark. The task is to research this area, identifying requirements for the Spark community and the work to be done on Hive to make reduce-side join work. A design doc might be needed for this. For more information, please refer to the overall design doc on the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
[ https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090728#comment-14090728 ] Xiaoyu Wang commented on HIVE-7645: --- This error should not be caused by this patch! Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang Attachments: HIVE-7645.patch code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
Xiaoyu Wang created HIVE-7645: - Summary: Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
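To illustrate why the two getters in this report are not interchangeable, here is a minimal sketch. `FakeStorageDescriptor` is a hypothetical stand-in that models only the two getters named above; it is not the metastore class, and the example table (one bucketing column, ten buckets) is an assumption chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class BucketCountSketch {
    // Hypothetical stand-in for a storage descriptor; only the two getters
    // named in the report are modelled here.
    static final class FakeStorageDescriptor {
        private final List<String> bucketCols;
        private final int numBuckets;

        FakeStorageDescriptor(List<String> bucketCols, int numBuckets) {
            this.bucketCols = bucketCols;
            this.numBuckets = numBuckets;
        }

        // Number of columns the table is clustered by.
        int getBucketColsSize() {
            return bucketCols.size();
        }

        // Number of buckets declared with INTO n BUCKETS.
        int getNumBuckets() {
            return numBuckets;
        }
    }

    // Models a table declared as CLUSTERED BY (id) INTO 10 BUCKETS.
    static FakeStorageDescriptor exampleTable() {
        return new FakeStorageDescriptor(Arrays.asList("id"), 10);
    }

    public static void main(String[] args) {
        FakeStorageDescriptor sd = exampleTable();
        // The two values differ whenever the column count and bucket count differ.
        System.out.println("getBucketColsSize() = " + sd.getBucketColsSize());
        System.out.println("getNumBuckets()     = " + sd.getNumBuckets());
    }
}
```

For such a table the first getter counts bucketing columns and the second counts buckets, so passing the first as NUM_BUCKETS would tell the job about far fewer buckets than the table has.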
[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
[ https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Wang updated HIVE-7645: -- Attachment: HIVE-7645.patch Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang Attachments: HIVE-7645.patch code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
[ https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Wang updated HIVE-7645: -- Status: Patch Available (was: Open) Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang Attachments: HIVE-7645.patch code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7483) hive insert overwrite table select from self dead lock
[ https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072654#comment-14072654 ] Xiaoyu Wang commented on HIVE-7483: --- But it still deadlocks. hive insert overwrite table select from self dead lock -- Key: HIVE-7483 URL: https://issues.apache.org/jira/browse/HIVE-7483 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Xiaoyu Wang CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7483) hive insert overwrite table select from self dead lock
Xiaoyu Wang created HIVE-7483: - Summary: hive insert overwrite table select from self dead lock Key: HIVE-7483 URL: https://issues.apache.org/jira/browse/HIVE-7483 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Xiaoyu Wang CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; hive.zookeeper.quorum=zk1,zk2,zk3; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7483) hive insert overwrite table select from self dead lock
[ https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071356#comment-14071356 ] Xiaoyu Wang commented on HIVE-7483: --- Yes, you are right! hive insert overwrite table select from self dead lock -- Key: HIVE-7483 URL: https://issues.apache.org/jira/browse/HIVE-7483 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Xiaoyu Wang CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; hive.zookeeper.quorum=zk1,zk2,zk3; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7483) hive insert overwrite table select from self dead lock
[ https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Wang updated HIVE-7483: -- Description: CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. was: CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; hive.zookeeper.quorum=zk1,zk2,zk3; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. 
hive insert overwrite table select from self dead lock -- Key: HIVE-7483 URL: https://issues.apache.org/jira/browse/HIVE-7483 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Xiaoyu Wang CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); In hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; In the Hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table. Then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; The log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think a deadlock occurs when insert overwrite selects from the same table. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-707) add group_concat
[ https://issues.apache.org/jira/browse/HIVE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052301#comment-14052301 ] Jian Wang commented on HIVE-707: [~ph4t] I used the concat_ws(' ', map_keys(UNION_MAP(MAP(your_column, 'dummy')))) method instead of group_concat, but I got an error like this: {code} FAILED: SemanticException [Error 10011]: Line 172:30 Invalid function 'UNION_MAP' {code} Should I add some jars? add group_concat Key: HIVE-707 URL: https://issues.apache.org/jira/browse/HIVE-707 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Moving the discussion to a new jira: I've implemented group_cat() in a rush, and found some things difficult to solve: 1. The function group_cat() has an internal order by clause; currently, we can't implement such an aggregation in Hive. 2. When the strings to be group-concatenated are too large, in other words, if data skew appears, there is often not enough memory to store such a big result. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyu Wang updated HIVE-2777: - Attachment: (was: hive-2777.patch) ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch Hive should have ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking the running jobs. It allows admin to merge several partitions into one. Essentially, we would like to have an api- add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData); This jira covers changes required for metastore and thrift. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyu Wang updated HIVE-2777: - Attachment: hive-2777.patch Sorry for the previous patch, I rebased it, and it seems fine now. Can someone please review? ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch, hive-2777.patch Hive should have ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking the running jobs. It allows admin to merge several partitions into one. Essentially, we would like to have an api- add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData); This jira covers changes required for metastore and thrift. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989287#comment-13989287 ] Adrian Wang commented on HIVE-6765: --- [~cdrome] Good catch. I knew that serialization in Hive had been notorious for a long time, but I didn't know about the progress made there. Actually, I was really curious when I saw that my case worked on Tez with hive-0.13; I never tried Apache's hive-0.13 since there was no official release. ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Fix For: 0.13.0 Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988648#comment-13988648 ] Adrian Wang commented on HIVE-6765: --- [~selinazh] Thanks for your comment! I'm glad that someone else also noticed this. Actually, I found that the problem only comes up when there is something like an aggregation function in the view. The problem results from cloning the plan: when joining with a view as described, the plan contains an ASTNodeOrigin node, which does not have a default constructor, so an exception is thrown when the plan is duplicated. Could you please try applying my patch here to see whether your problem is resolved? Thanks again. ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Fix For: 0.13.0 Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyu Wang updated HIVE-2777: - Affects Version/s: 0.13.0 Status: Patch Available (was: Open) This is a rebased patch on top of hive branch-0.13. Please review. ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch Hive should have ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking the running jobs. It allows admin to merge several partitions into one. Essentially, we would like to have an api- add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData); This jira covers changes required for metastore and thrift. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2777) ability to add and drop partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinyu Wang updated HIVE-2777: - Attachment: hive-2777.patch ability to add and drop partitions atomically - Key: HIVE-2777 URL: https://issues.apache.org/jira/browse/HIVE-2777 Project: Hive Issue Type: New Feature Components: Metastore Affects Versions: 0.13.0 Reporter: Aniket Mokashi Assignee: Aniket Mokashi Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2777.D2271.1.patch, hive-2777.patch Hive should have ability to atomically add and drop partitions. This way admins can change partitions atomically without breaking the running jobs. It allows admin to merge several partitions into one. Essentially, we would like to have an api- add_drop_partitions(String db, String tbl_name, List<Partition> addParts, List<List<String>> dropParts, boolean deleteData); This jira covers changes required for metastore and thrift. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Wang updated HIVE-6765: -- Status: Patch Available (was: Open) ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Fix For: 0.13.0 Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Wang updated HIVE-6765: -- Fix Version/s: 0.13.0 ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Fix For: 0.13.0 Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Wang updated HIVE-6765: -- Component/s: (was: Query Processor) ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
Adrian Wang created HIVE-6765: - Summary: ASTNodeOrigin unserializable leads to fail when join with view Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Adrian Wang when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949052#comment-13949052 ] Adrian Wang commented on HIVE-6765: --- I added a PersistenceDelegate in serializeObject() in the Utilities class, which resolved the problem. I'll attach the patch later. ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Adrian Wang when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
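A PersistenceDelegate of the kind mentioned in this comment can be sketched with the standard java.beans API. This is not the attached patch: `Origin` is a hypothetical stand-in for a class without a default constructor, and its property names (`objectName`, `line`) are assumptions made for illustration.

```java
import java.beans.DefaultPersistenceDelegate;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class PersistenceDelegateSketch {
    // Hypothetical stand-in: like the class in the stack trace, it has no
    // default constructor, so XMLEncoder cannot recreate it without help.
    public static final class Origin {
        private final String objectName;
        private final int line;

        public Origin(String objectName, int line) {
            this.objectName = objectName;
            this.line = line;
        }

        public String getObjectName() {
            return objectName;
        }

        public int getLine() {
            return line;
        }
    }

    // Clone an Origin by encoding it to XML and decoding it again.
    static Origin roundTrip(Origin in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(out)) {
            // Tell the encoder to rebuild Origin by calling the constructor
            // with the values of the "objectName" and "line" properties.
            enc.setPersistenceDelegate(Origin.class,
                new DefaultPersistenceDelegate(new String[] {"objectName", "line"}));
            enc.writeObject(in);
        }
        // Pass the class loader explicitly so the nested class can be resolved.
        try (XMLDecoder dec = new XMLDecoder(
                new ByteArrayInputStream(out.toByteArray()),
                null, null, Origin.class.getClassLoader())) {
            return (Origin) dec.readObject();
        }
    }

    public static void main(String[] args) {
        Origin copy = roundTrip(new Origin("v1", 7));
        System.out.println(copy.getObjectName() + " at line " + copy.getLine());
    }
}
```

Without the delegate, the encoder falls back to the no-argument constructor, which is where an InstantiationException like the one in the stack trace comes from.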
[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Wang updated HIVE-6765: -- Attachment: HIVE-6765.patch.1 ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Adrian Wang Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949075#comment-13949075 ] Adrian Wang commented on HIVE-6765: --- Here's an example that reproduces the exception: CREATE TABLE t1 (a1 INT, b1 INT); CREATE VIEW v1 (x1) AS SELECT MAX(a1) FROM t1; SELECT s1.x1 FROM v1 s1 JOIN (SELECT MAX(a1) AS ma FROM t1) s2 ON s1.x1 = s2.ma; This is a bug on both Apache Hive and Tez, outputting return code 1 ... ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Adrian Wang Attachments: HIVE-6765.patch.1 when a view contains a UDF, and the view comes into a JOIN operation, Hive will encounter a bug with stack trace like Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949076#comment-13949076 ]

Adrian Wang commented on HIVE-6765:
---
I think this is just another drawback of using XMLEncoder to clone the plan.
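To illustrate the XMLEncoder drawback mentioned above, here is a minimal sketch, not Hive code. It assumes the failure comes from java.beans.XMLEncoder needing a public no-argument constructor to recreate a bean, which would be consistent with the Class.newInstance frames in the stack trace. The class names XmlEncoderCloneSketch and NoDefaultCtor and the helper encodeAndCollectErrors are hypothetical and exist only for this example.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class XmlEncoderCloneSketch {

    // Hypothetical stand-in for a plan object that has no no-arg constructor.
    public static class NoDefaultCtor {
        private final String label;

        public NoDefaultCtor(String label) {
            this.label = label;
        }

        public String getLabel() {
            return label;
        }
    }

    // Encodes the object with XMLEncoder and returns every exception the
    // encoder reported to its ExceptionListener instead of throwing.
    public static List<Exception> encodeAndCollectErrors(Object o) {
        List<Exception> errors = new ArrayList<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(out);
        encoder.setExceptionListener(errors::add);
        encoder.writeObject(o);
        encoder.close();
        return errors;
    }

    public static void main(String[] args) {
        // The encoder cannot instantiate NoDefaultCtor reflectively,
        // so it reports errors and skips the object.
        for (Exception e : encodeAndCollectErrors(new NoDefaultCtor("view origin"))) {
            System.out.println(e);
        }
    }
}
```

By default XMLEncoder only reports such errors to its listener and carries on, so the problem may surface later, when the incomplete XML is decoded. One common way around this limitation is to give the class a public no-argument constructor; whether the attached patch takes that route is not stated in this thread.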
[jira] [Commented] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949113#comment-13949113 ]

Adrian Wang commented on HIVE-6765:
---
Sorry, the previous example works on Tez with Hive 0.13, but it fails when I run the query with Hive 0.12 in Eclipse.