[jira] [Comment Edited] (HIVE-22472) Unable to create hive view
[ https://issues.apache.org/jira/browse/HIVE-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978035#comment-16978035 ]

Eric Lin edited comment on HIVE-22472 at 11/20/19 3:26 AM:
-----------------------------------------------------------

Seems to be related to HIVE-14719, or maybe a duplicate.


was (Author: ericlin):
Seems to relate to HIVE-14719

> Unable to create hive view
> --------------------------
>
>                 Key: HIVE-22472
>                 URL: https://issues.apache.org/jira/browse/HIVE-22472
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.3.5
>            Reporter: Ashok
>            Priority: Major
>
> Unable to create a hive view with an EXISTS clause.
> Error:
> FAILED: SemanticException Line 0:-1 Invalid table alias or column reference
> 'sq_1': (possible column names are: _table_or_col lf) file_date) sq_corr_, (.
> (tok_table_or_col sq_1) sq_corr_1))
>
> Reproduction steps below:
>
> -- Setup tables
> create table bug_part_1 (table_name string, partition_date date, file_date timestamp);
> create table bug_part_2 (id string, file_date timestamp) partitioned by (partition_date date);
>
> -- Example 1 - works as a plain query.
> select vlf.id
> from bug_part_2 vlf
> where 1=1
> and exists (
>   select null
>   from (
>     select max(file_date) file_date, max(partition_date) as partition_date
>     from bug_part_1
>   ) lf
>   where lf.partition_date = vlf.partition_date and lf.file_date = vlf.file_date
> );
>
> -- Example 2 - fails inside a view.
> create or replace view bug_view
> as
> select vlf.id
> from bug_part_2 vlf
> where 1=1
> and exists (
>   select null
>   from (
>     select max(file_date) file_date, max(partition_date) as partition_date
>     from bug_part_1
>   ) lf
>   where lf.partition_date = vlf.partition_date and lf.file_date = vlf.file_date
> );

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HIVE-22472) Unable to create hive view
[ https://issues.apache.org/jira/browse/HIVE-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978035#comment-16978035 ]

Eric Lin commented on HIVE-22472:
---------------------------------

Seems to relate to HIVE-14719
[jira] [Commented] (HIVE-22472) Unable to create hive view
[ https://issues.apache.org/jira/browse/HIVE-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978034#comment-16978034 ]

Eric Lin commented on HIVE-22472:
---------------------------------

Setting hive.cbo.enable=true can help avoid the issue, but I am not sure why yet.
[jira] [Assigned] (HIVE-22091) NPE on HS2 start up due to bad data in FUNCS table
[ https://issues.apache.org/jira/browse/HIVE-22091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin reassigned HIVE-22091:
-------------------------------
    Assignee: Eric Lin

> NPE on HS2 start up due to bad data in FUNCS table
> --------------------------------------------------
>
>                 Key: HIVE-22091
>                 URL: https://issues.apache.org/jira/browse/HIVE-22091
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Eric Lin
>            Assignee: Eric Lin
>            Priority: Major
>
> If the FUNCS table contains a stale DB_ID that has no matching row in the
> DBS table, HS2 will fail to start up with an NPE:
>
> {code:bash}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.NullPointerException)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:220)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:338)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:299)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:274)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:256)
> 	at org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider.init(DefaultHiveAuthorizationProvider.java:29)
> 	at org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProviderBase.setConf(HiveAuthorizationProviderBase.java:112)
> 	... 21 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.NullPointerException)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3646)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:231)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:215)
> 	... 27 more
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
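For anyone hitting this, the stale rows can be located with an anti-join between FUNCS and DBS on DB_ID. A minimal sketch against an in-memory SQLite database (the stand-in tables are simplified, and the real metastore lives in MySQL/PostgreSQL/etc., but the query shape is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Simplified stand-ins for the Hive metastore tables.
cur.execute("CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT)")
cur.execute("CREATE TABLE FUNCS (FUNC_ID INTEGER PRIMARY KEY, FUNC_NAME TEXT, DB_ID INTEGER)")
cur.execute("INSERT INTO DBS VALUES (1, 'default')")
cur.executemany("INSERT INTO FUNCS VALUES (?, ?, ?)",
                [(10, 'good_udf', 1), (11, 'stale_udf', 99)])  # DB_ID 99 has no DBS row

# Anti-join: functions whose DB_ID has no matching database row.
cur.execute("""
    SELECT f.FUNC_ID, f.FUNC_NAME
    FROM FUNCS f LEFT JOIN DBS d ON f.DB_ID = d.DB_ID
    WHERE d.DB_ID IS NULL
""")
orphans = cur.fetchall()
print(orphans)  # -> [(11, 'stale_udf')]
```

Rows returned by the SELECT are the candidates that trigger the NPE during function registration.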
[jira] [Resolved] (HIVE-14903) from_utc_time function issue for CET daylight savings
[ https://issues.apache.org/jira/browse/HIVE-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin resolved HIVE-14903.
-----------------------------
    Resolution: Fixed

Tested in CDH 5.11 and confirmed fixed, but I am not sure which upstream JIRA has the fix; will just resolve it for now.

> from_utc_time function issue for CET daylight savings
> -----------------------------------------------------
>
>                 Key: HIVE-14903
>                 URL: https://issues.apache.org/jira/browse/HIVE-14903
>             Project: Hive
>          Issue Type: Bug
>          Components: Beeline
>    Affects Versions: 2.0.1
>            Reporter: Eric Lin
>            Priority: Minor
>
> Based on https://en.wikipedia.org/wiki/Central_European_Summer_Time, summer
> time is between 1:00 UTC on the last Sunday of March and 1:00 UTC on the
> last Sunday of October. See the test case below:
>
> Impala:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> Query: select from_utc_timestamp('2016-10-30 00:30:00','CET')
> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 00:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 01:30:00                              |
> +--------------------------------------------------+
> {code}
>
> Hive:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> INFO  : OK
> +------------------------+
> |          _c0           |
> +------------------------+
> | 2016-10-30 01:30:00.0  |
> +------------------------+
> {code}
>
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' );
> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+
> {code}
>
> At 00:30 UTC, daylight saving has not yet finished, so the time difference
> should still be 2 hours rather than 1. MySQL returned the correct result.
>
> At 1:30, the results are correct:
>
> Impala:
> {code}
> Query: select from_utc_timestamp('2016-10-30 01:30:00','CET')
> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 01:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 02:30:00                              |
> +--------------------------------------------------+
> Fetched 1 row(s) in 0.01s
> {code}
>
> Hive:
> {code}
> +------------------------+
> |          _c0           |
> +------------------------+
> | 2016-10-30 02:30:00.0  |
> +------------------------+
> 1 row selected (0.252 seconds)
> {code}
>
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' );
> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+
> 1 row in set (0.00 sec)
> {code}
>
> Seems like a bug.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
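The expected behavior can be double-checked against Python's zoneinfo database (Python 3.9+, assuming the tzdata "CET" zone is installed): CEST ends at 01:00 UTC on 2016-10-30, so 00:30 UTC is still at offset +02:00, and both 00:30 and 01:30 UTC map to 02:30 wall-clock time, just with different offsets. A quick sketch:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

cet = ZoneInfo("CET")  # tzdata zone that follows the EU CET/CEST transition rules

before = datetime(2016, 10, 30, 0, 30, tzinfo=timezone.utc).astimezone(cet)
after = datetime(2016, 10, 30, 1, 30, tzinfo=timezone.utc).astimezone(cet)

# DST is still in effect at 00:30 UTC, so the offset is +02:00...
print(before.strftime("%Y-%m-%d %H:%M %z"))  # 2016-10-30 02:30 +0200
# ...and has ended by 01:30 UTC, so the offset drops to +01:00.
print(after.strftime("%Y-%m-%d %H:%M %z"))   # 2016-10-30 02:30 +0100
```

This matches MySQL's output above (02:30 in both cases) and confirms that Hive's and Impala's 01:30 result for the first query was wrong.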
[jira] [Commented] (HIVE-14903) from_utc_time function issue for CET daylight savings
[ https://issues.apache.org/jira/browse/HIVE-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201479#comment-16201479 ]

Eric Lin commented on HIVE-14903:
---------------------------------

Thanks for letting me know [~zsombor.klara], and apologies for the delay.
[jira] [Updated] (HIVE-16794) Default value for hive.spark.client.connect.timeout of 1000ms is too low
[ https://issues.apache.org/jira/browse/HIVE-16794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-16794:
----------------------------
    Attachment: HIVE-16794.patch

Increasing the timeout to 5 seconds.

> Default value for hive.spark.client.connect.timeout of 1000ms is too low
> ------------------------------------------------------------------------
>
>                 Key: HIVE-16794
>                 URL: https://issues.apache.org/jira/browse/HIVE-16794
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>    Affects Versions: 2.1.1
>            Reporter: Eric Lin
>            Assignee: Eric Lin
>         Attachments: HIVE-16794.patch
>
> Currently the default timeout value for hive.spark.client.connect.timeout
> is set at 1000ms, which is only 1 second. This is not enough when the
> cluster is busy, and users will constantly get the following timeout errors:
>
> {code}
> 17/05/03 03:20:08 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: io.netty.channel.ConnectTimeoutException: connection timed out: /172.19.22.11:35915
> java.util.concurrent.ExecutionException: io.netty.channel.ConnectTimeoutException: connection timed out: /172.19.22.11:35915
> 	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> 	at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> 	at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: /172.19.22.11:35915
> 	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:220)
> 	at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
> 	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at java.lang.Thread.run(Thread.java:745)
> 17/05/03 03:20:08 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: io.netty.channel.ConnectTimeoutException: connection timed out: /172.19.22.11:35915)
> 17/05/03 03:20:16 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 10 ms. Please check earlier log output for errors. Failing the application.
> 17/05/03 03:20:16 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.util.concurrent.ExecutionException: io.netty.channel.ConnectTimeoutException: connection timed out: /172.19.22.11:35915)
> 17/05/03 03:20:16 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1492040605432_11445
> 17/05/03 03:20:16 INFO util.ShutdownHookManager: Shutdown hook called
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (HIVE-16794) Default value for hive.spark.client.connect.timeout of 1000ms is too low
[ https://issues.apache.org/jira/browse/HIVE-16794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-16794:
----------------------------
    Status: Patch Available  (was: Open)
[jira] [Assigned] (HIVE-16794) Default value for hive.spark.client.connect.timeout of 1000ms is too low
[ https://issues.apache.org/jira/browse/HIVE-16794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin reassigned HIVE-16794:
-------------------------------
    Assignee: Eric Lin
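A 1 s connect timeout gives a busy YARN cluster a single narrow window to accept the RemoteDriver's connection. Besides raising the timeout (as the attached patch does), the general mitigation for this class of failure is retrying with backoff. A small illustrative sketch of that pattern, not Hive's actual client code:

```python
import socket
import time

def connect_with_retry(host, port, timeout_s=5.0, attempts=3, backoff_s=1.0):
    """Try to connect, doubling the wait between attempts (illustrative only)."""
    last_err = None
    for attempt in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=timeout_s)
        except OSError as err:
            last_err = err
            time.sleep(backoff_s * (2 ** attempt))
    raise last_err
```

With a 1 s timeout and no retries, any scheduling hiccup longer than a second fails the whole job; a 5 s timeout simply widens that window.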
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020747#comment-16020747 ]

Eric Lin commented on HIVE-16029:
---------------------------------

Hi [~appodictic], I have modified the test so that it passes. Please help review the code at https://reviews.apache.org/r/57009/. Thanks

> COLLECT_SET and COLLECT_LIST does not return NULL in the result
> ---------------------------------------------------------------
>
>                 Key: HIVE-16029
>                 URL: https://issues.apache.org/jira/browse/HIVE-16029
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.1
>            Reporter: Eric Lin
>            Assignee: Eric Lin
>            Priority: Minor
>         Attachments: HIVE-16029.2.patch, HIVE-16029.3.patch, HIVE-16029.patch
>
> See the test case below:
>
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from collect_set_test;
> +----------------------+
> |  collect_set_test.a  |
> +----------------------+
> | 1                    |
> | 2                    |
> | NULL                 |
> | 4                    |
> | NULL                 |
> +----------------------+
>
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +----------+
> |   _c0    |
> +----------+
> | [1,2,4]  |
> +----------+
> {code}
>
> The correct result should be:
>
> {code}
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +---------------+
> |      _c0      |
> +---------------+
> | [1,2,null,4]  |
> +---------------+
> {code}
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-16029:
----------------------------
    Attachment: HIVE-16029.3.patch

Providing the latest patch to update the *.q.out files.
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974206#comment-15974206 ]

Eric Lin commented on HIVE-16029:
---------------------------------

Hi [~appodictic],

Thanks for the suggestion. I am trying to run TestCliDriver with the command below, under the directory itests/qtest, following the documentation at https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-Testing:

{code}
mvn test -Dtest=TestCliDriver
{code}

However, it kept failing with the error below:

{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project hive-it-qfile: ExecutionException: java.lang.RuntimeException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /hadoop/code/hive/itests/qtest && /hadoop/jdk1.8.0_91/jre/bin/java -Xmx1024m -XX:MaxPermSize=256M -jar /hadoop/code/hive/itests/qtest/target/surefire/surefirebooter7738443094919274008.jar /hadoop/code/hive/itests/qtest/target/surefire/surefire4160478088421683107tmp /hadoop/code/hive/itests/qtest/target/surefire/surefire_05453129517537389906tmp
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project hive-it-qfile: ExecutionException
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:414)
	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:357)
Caused by: org.apache.maven.plugin.MojoFailureException: ExecutionException
	at org.apache.maven.plugin.surefire.SurefirePlugin.assertNoException(SurefirePlugin.java:262)
	at org.apache.maven.plugin.surefire.SurefirePlugin.handleSummary(SurefirePlugin.java:252)
	at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:854)
	at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:722)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
	... 19 more
Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: ExecutionException
	at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:343)
	at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:178)
	at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:990)
	at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:824)
	... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called? Command was /bin/sh -c cd /hadoop/code/hive/itests/qtest && /hadoop/jdk1.8.0_91/jre/bin/java -Xmx1024m -XX:MaxPermSize=256M -jar /hadoop/code/hive/itests/qtest/target/surefire/surefirebooter7738443094919274008.jar /hadoop/code/hive/itests/qtest/target/surefire/surefire4160478088421683107tmp
{code}
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969931#comment-15969931 ]

Eric Lin commented on HIVE-16029:
---------------------------------

The review is also updated: https://reviews.apache.org/r/57009/. Please help review and see if any other changes are required.
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-16029:
----------------------------
    Attachment: HIVE-16029.2.patch

Attaching a new patch so that COLLECT_SET takes two arguments: the first is the same as before, and the second is a boolean value (true or false), as suggested by Edward.
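The semantics of the proposed two-argument form can be sketched in a few lines of Python (hypothetical names; the real implementation is a Hive GenericUDAF in Java): with the flag set, NULLs (None here) are kept in the result, matching the "correct result" shown in the issue description.

```python
def collect_set(values, include_nulls=False):
    """Illustrative sketch of COLLECT_SET's proposed two-argument form."""
    seen, out = set(), []
    for v in values:
        if v is None and not include_nulls:
            continue  # current behavior: NULLs are silently dropped
        if v not in seen:  # set semantics: keep the first occurrence only
            seen.add(v)
            out.append(v)
    return out

rows = [1, 2, None, 4, None]
print(collect_set(rows))        # [1, 2, 4]       -- today's behavior
print(collect_set(rows, True))  # [1, 2, None, 4] -- NULL preserved once
```

Defaulting the flag to false keeps the existing behavior for queries that do not pass a second argument.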
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-15166: Attachment: HIVE-15166.3.patch New patch based on latest Hive master code base. > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.2.patch, HIVE-15166.3.patch, HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
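The behaviour the option would enable — a history capped at a maximum number of entries — can be sketched in Python (illustrative only; the real patch wires a size limit into Beeline's jline history handling in Java):

```python
from collections import deque

class BoundedHistory:
    """Toy model of a capped command history: once max_size is
    reached, the oldest entry is evicted, so the file written at
    shutdown can never exceed max_size lines."""

    def __init__(self, max_size=500):
        self.entries = deque(maxlen=max_size)

    def add(self, command):
        self.entries.append(command)  # evicts the oldest when full

    def save(self, path):
        with open(path, "w") as f:
            f.write("\n".join(self.entries))

h = BoundedHistory(max_size=3)
for q in ["select 1", "select 2", "select 3", "select 4"]:
    h.add(q)
print(list(h.entries))  # -> ['select 2', 'select 3', 'select 4']
```

With a cap like this, very large queries can no longer grow the history file without bound, which addresses the slow startup/shutdown described in the issue.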
[jira] [Commented] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883444#comment-15883444 ]

Eric Lin commented on HIVE-15166:
---------------------------------

Looks like I need to rebase my code, as a lot has changed. I will update the patch again on the coming Monday.

> Provide beeline option to set the jline history max size
> --------------------------------------------------------
>
>                 Key: HIVE-15166
>                 URL: https://issues.apache.org/jira/browse/HIVE-15166
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 2.1.0
>            Reporter: Eric Lin
>            Assignee: Eric Lin
>            Priority: Minor
>         Attachments: HIVE-15166.2.patch, HIVE-15166.patch
>
> Currently Beeline does not provide an option to limit the max size of the beeline history file. When each query is very big, it will flood the history file and slow down Beeline on startup and shutdown.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881671#comment-15881671 ] Eric Lin commented on HIVE-16029: - Review request sent: https://reviews.apache.org/r/57009/ > COLLECT_SET and COLLECT_LIST does not return NULL in the result > --- > > Key: HIVE-16029 > URL: https://issues.apache.org/jira/browse/HIVE-16029 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-16029.patch > > > See the test case below: > {code} > 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; > +-+ > | collect_set_test.a | > +-+ > | 1 | > | 2 | > | NULL| > | 4 | > | NULL| > +-+ > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,4] | > +---+ > {code} > The correct result should be: > {code} > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,null,4] | > +---+ > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-16029: Attachment: HIVE-16029.patch Adding first patch. > COLLECT_SET and COLLECT_LIST does not return NULL in the result > --- > > Key: HIVE-16029 > URL: https://issues.apache.org/jira/browse/HIVE-16029 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-16029.patch > > > See the test case below: > {code} > 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; > +-+ > | collect_set_test.a | > +-+ > | 1 | > | 2 | > | NULL| > | 4 | > | NULL| > +-+ > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,4] | > +---+ > {code} > The correct result should be: > {code} > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,null,4] | > +---+ > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-16029: Status: Patch Available (was: Open) > COLLECT_SET and COLLECT_LIST does not return NULL in the result > --- > > Key: HIVE-16029 > URL: https://issues.apache.org/jira/browse/HIVE-16029 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-16029.patch > > > See the test case below: > {code} > 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; > +-+ > | collect_set_test.a | > +-+ > | 1 | > | 2 | > | NULL| > | 4 | > | NULL| > +-+ > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,4] | > +---+ > {code} > The correct result should be: > {code} > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,null,4] | > +---+ > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-16029: Description: See the test case below: {code} 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; +-+ | collect_set_test.a | +-+ | 1 | | 2 | | NULL| | 4 | | NULL| +-+ 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test; +---+ | _c0 | +---+ | [1,2,4] | +---+ {code} The correct result should be: {code} 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test; +---+ | _c0 | +---+ | [1,2,null,4] | +---+ {code} was: See the test case below: 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; +-+ | collect_set_test.a | +-+ | 1 | | 2 | | NULL| | 4 | | NULL| +-+ 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test; +---+ | _c0 | +---+ | [1,2,4] | +---+ The correct result should be: 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test; +---+ | _c0 | +---+ | [1,2,null,4] | +---+ > COLLECT_SET and COLLECT_LIST does not return NULL in the result > --- > > Key: HIVE-16029 > URL: https://issues.apache.org/jira/browse/HIVE-16029 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > > See the test case below: > {code} > 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; > +-+ > | collect_set_test.a | > +-+ > | 1 | > | 2 | > | NULL| > | 4 | > | NULL| > +-+ > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,4] | > +---+ > {code} > The correct result should be: > {code} > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,null,4] | > +---+ > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin reassigned HIVE-16029: --- > COLLECT_SET and COLLECT_LIST does not return NULL in the result > --- > > Key: HIVE-16029 > URL: https://issues.apache.org/jira/browse/HIVE-16029 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > > See the test case below: > 0: jdbc:hive2://localhost:1/default> select * from collect_set_test; > +-+ > | collect_set_test.a | > +-+ > | 1 | > | 2 | > | NULL| > | 4 | > | NULL| > +-+ > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,4] | > +---+ > The correct result should be: > 0: jdbc:hive2://localhost:1/default> select collect_set(a) from > collect_set_test; > +---+ > | _c0 | > +---+ > | [1,2,null,4] | > +---+ -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-15166: Attachment: HIVE-15166.2.patch Re-attach patch > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.2.patch, HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-15166: Attachment: (was: HIVE-15166.2.patch) > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825584#comment-15825584 ] Eric Lin commented on HIVE-15166: - Hi [~aihuaxu], I have created review: https://reviews.apache.org/r/55605/ Thanks > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.2.patch, HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-15166:
----------------------------
    Attachment: HIVE-15166.2.patch

Hi [~aihuaxu],

I have updated the patch using the latest master branch. However, I am not able to build Hive due to the following error:
{code}
[INFO] Scanning for projects...
[ERROR]
[ERROR] Some problems were encountered while processing the POMs:
[ERROR] 'dependencies.dependency.version' for org.powermock:powermock-module-junit4:jar must be a valid version but is '${powermock.version}'. @ line 782, column 16
[ERROR] 'dependencies.dependency.version' for org.powermock:powermock-api-mockito:jar must be a valid version but is '${powermock.version}'. @ line 788, column 16
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.apache.hive:hive:2.2.0-SNAPSHOT (/Users/ericlin/hadoop/hive/pom.xml) has 2 errors
[ERROR] 'dependencies.dependency.version' for org.powermock:powermock-module-junit4:jar must be a valid version but is '${powermock.version}'. @ line 782, column 16
[ERROR] 'dependencies.dependency.version' for org.powermock:powermock-api-mockito:jar must be a valid version but is '${powermock.version}'. @ line 788, column 16
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
{code}
This happens on the master code as well, so I am not able to test the change. I will see if I can fix it, but in the meantime, could you please review the change and provide further feedback?
Thanks > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.2.patch, HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
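For context on the Maven failure quoted above: an unresolved `${powermock.version}` placeholder in a dependency version usually means the property is not declared in (or inherited by) the POM being built. A sketch of the kind of fix, with an illustrative version number that is a guess, not taken from the actual Hive pom:

```xml
<!-- In the parent pom.xml: declaring the property lets the
     ${powermock.version} references in dependency versions resolve.
     The version value below is illustrative only. -->
<properties>
  <powermock.version>1.6.6</powermock.version>
</properties>
```

If the property is already declared on master, a stale local checkout or an incomplete rebase could also explain the error.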
[jira] [Commented] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823492#comment-15823492 ] Eric Lin commented on HIVE-15166: - Hi [~aihuaxu], Should I create a JIRA review for you? > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-15166.patch > > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823486#comment-15823486 ]

Eric Lin commented on HIVE-15166:
---------------------------------

[~aihuaxu],

Thanks for the comment. Please give me some time to review it; it has been a while since I submitted the patch. I will provide a new patch soon.

Thanks

> Provide beeline option to set the jline history max size
> --------------------------------------------------------
>
>                 Key: HIVE-15166
>                 URL: https://issues.apache.org/jira/browse/HIVE-15166
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 2.1.0
>            Reporter: Eric Lin
>            Assignee: Eric Lin
>            Priority: Minor
>         Attachments: HIVE-15166.patch
>
> Currently Beeline does not provide an option to limit the max size of the beeline history file. When each query is very big, it will flood the history file and slow down Beeline on startup and shutdown.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-15166: Priority: Minor (was: Major) > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size
[ https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-15166: Assignee: Eric Lin > Provide beeline option to set the jline history max size > > > Key: HIVE-15166 > URL: https://issues.apache.org/jira/browse/HIVE-15166 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin > > Currently Beeline does not provide an option to limit the max size for > beeline history file, in the case that each query is very big, it will flood > the history file and slow down beeline on start up and shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593695#comment-15593695 ] Eric Lin edited comment on HIVE-14482 at 10/21/16 2:12 AM: --- The error seems to be unrelated (please see the attached screenshot) and there is no existing test coverage for HMS audit logs, also it is pretty hard to test it. Please advise if the patch can be accepted. Thanks was (Author: ericlin): The error seems to be unrelated and there is no existing test coverage for HMS audit logs, also it is pretty hard to test it. Please advise if the patch can be accepted. Thanks > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-14482.2.patch, HIVE-14482.patch, Screen Shot > 2016-10-21 at 1.06.33 pm.png > > > When running: > {code} > ALTER TABLE test DROP PARTITION (b=140); > {code} > I only see the following in the HMS log: > {code} > 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,082 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154094 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 
2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : > db=default tbl=test > 2016-08-08 23:12:34,096 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr > : db=default tbl=test > 2016-08-08 23:12:34,112 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: start=1470723154095 end=1470723154112 duration=17 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,172 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,173 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154186 duration=14 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,187 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,187 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,199 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154199 duration=13 > 
from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,203 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,215 INFO org.apache.hadoop.hive.metastore.ObjectStore: > [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported > only on partition keys of type string > 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: > [pool-4-thread-2]: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > 2016-08-08 23:12:34,239 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: >
[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Attachment: Screen Shot 2016-10-21 at 1.06.33 pm.png The error seems to be unrelated and there is no existing test coverage for HMS audit logs, also it is pretty hard to test it. Please advise if the patch can be accepted. Thanks > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-14482.2.patch, HIVE-14482.patch, Screen Shot > 2016-10-21 at 1.06.33 pm.png > > > When running: > {code} > ALTER TABLE test DROP PARTITION (b=140); > {code} > I only see the following in the HMS log: > {code} > 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,082 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154094 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : > db=default tbl=test > 2016-08-08 23:12:34,096 INFO > 
org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr > : db=default tbl=test > 2016-08-08 23:12:34,112 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: start=1470723154095 end=1470723154112 duration=17 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,172 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,173 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154186 duration=14 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,187 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,187 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,199 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154199 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,203 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: 
from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,215 INFO org.apache.hadoop.hive.metastore.ObjectStore: > [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported > only on partition keys of type string > 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: > [pool-4-thread-2]: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > 2016-08-08 23:12:34,239 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: dropPartition() will move partition-directories to > trash-directory. > 2016-08-08 23:12:34,239 INFO hive.metastore.hivemetastoressimpl: > [pool-4-thread-2]: deleting > hdfs://:8020/user/hive/warehouse/default/test/b=140 > 2016-08-08 23:12:34,247 INFO
[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Attachment: HIVE-14482.2.patch > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-14482.2.patch, HIVE-14482.patch > > > When running: > {code} > ALTER TABLE test DROP PARTITION (b=140); > {code} > I only see the following in the HMS log: > {code} > 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,082 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154094 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : > db=default tbl=test > 2016-08-08 23:12:34,096 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr > : db=default tbl=test > 2016-08-08 23:12:34,112 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: start=1470723154095 
end=1470723154112 duration=17 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,172 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,173 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154186 duration=14 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,187 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,187 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,199 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154199 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,203 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,215 INFO org.apache.hadoop.hive.metastore.ObjectStore: > [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported > only on partition keys of type string > 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: > 
[pool-4-thread-2]: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > 2016-08-08 23:12:34,239 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: dropPartition() will move partition-directories to > trash-directory. > 2016-08-08 23:12:34,239 INFO hive.metastore.hivemetastoressimpl: > [pool-4-thread-2]: deleting > hdfs://:8020/user/hive/warehouse/default/test/b=140 > 2016-08-08 23:12:34,247 INFO org.apache.hadoop.fs.TrashPolicyDefault: > [pool-4-thread-2]: Moved: > 'hdfs://:8020/user/hive/warehouse/default/test/b=140' to trash at: > hdfs://:8020/user/hive/.Trash/Current/user/hive/warehouse/default/test/b=140 > 2016-08-08 23:12:34,247 INFO
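The gap described in HIVE-14482 — `get_table` and `get_partitions_by_expr` appear in the audit log, but the add/drop partition operation itself does not — comes down to which handler methods emit an audit line. The pattern can be sketched in Python (hypothetical names; in HMS this is done in Java inside the HiveMetaStore handler code):

```python
import functools

AUDIT_LOG = []

def audit(func):
    """Record every call to a metastore operation, mimicking an
    HMS-style audit line (ugi/ip fields omitted for brevity)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        AUDIT_LOG.append(f"cmd={func.__name__} args={args}")
        return func(*args, **kwargs)
    return wrapper

@audit
def get_table(db, tbl):
    return (db, tbl)

@audit
def drop_partition(db, tbl, spec):
    # With the audit wrapper applied here too, the drop itself now
    # leaves an audit entry -- the behaviour the patch adds for
    # add/drop partition.
    return True

get_table("default", "test")
drop_partition("default", "test", "b=140")
print(AUDIT_LOG)
```

The point of the sketch: auditing is per-operation opt-in, so an operation that never calls the audit helper simply never shows up, exactly as seen in the log excerpt above.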
[jira] [Commented] (HIVE-14903) from_utc_time function issue for CET daylight savings
[ https://issues.apache.org/jira/browse/HIVE-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15551569#comment-15551569 ] Eric Lin commented on HIVE-14903: - Both Hive and Impala seem to have the issue, so I also created https://issues.cloudera.org/browse/IMPALA-4250
> from_utc_time function issue for CET daylight savings
> -----------------------------------------------------
>
> Key: HIVE-14903
> URL: https://issues.apache.org/jira/browse/HIVE-14903
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.0.1
> Reporter: Eric Lin
> Priority: Minor
>
> Based on https://en.wikipedia.org/wiki/Central_European_Summer_Time, summer time runs from 1:00 UTC on the last Sunday of March to 1:00 UTC on the last Sunday of October; see the test case below:
> Impala:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 00:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 01:30:00                              |
> +--------------------------------------------------+
> {code}
> Hive:
> {code}
> select from_utc_timestamp('2016-10-30 00:30:00','CET');
> INFO : OK
> +------------------------+
> | _c0                    |
> +------------------------+
> | 2016-10-30 01:30:00.0  |
> +------------------------+
> {code}
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' );
> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+
> {code}
> At 00:30 UTC, daylight saving time has not yet ended, so the time difference should still be 2 hours rather than 1. Only MySQL returned the correct result.
> At 01:30 UTC, all three return the correct result:
> Impala:
> {code}
> Query: select from_utc_timestamp('2016-10-30 01:30:00','CET')
> +--------------------------------------------------+
> | from_utc_timestamp('2016-10-30 01:30:00', 'cet') |
> +--------------------------------------------------+
> | 2016-10-30 02:30:00                              |
> +--------------------------------------------------+
> Fetched 1 row(s) in 0.01s
> {code}
> Hive:
> {code}
> +------------------------+
> | _c0                    |
> +------------------------+
> | 2016-10-30 02:30:00.0  |
> +------------------------+
> 1 row selected (0.252 seconds)
> {code}
> MySQL:
> {code}
> mysql> SELECT CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' );
> +---------------------------------------------------+
> | CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ) |
> +---------------------------------------------------+
> | 2016-10-30 02:30:00                               |
> +---------------------------------------------------+
> 1 row in set (0.00 sec)
> {code}
> Seems like a bug.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
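For reference, the expected conversions around the DST cut-over can be checked with Python's zoneinfo; a quick sketch, assuming the system tzdata carries the EU daylight-saving rules for the 'CET' zone (the helper name mirrors Hive's UDF but is not Hive code):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def from_utc_timestamp(ts: str, tz: str) -> str:
    """Interpret ts as UTC and render it in the given time zone."""
    utc_dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(ZoneInfo(tz)).strftime("%Y-%m-%d %H:%M:%S")

# CEST (UTC+2) ends at 01:00 UTC on 2016-10-30, so both inputs land on 02:30 local:
print(from_utc_timestamp("2016-10-30 00:30:00", "CET"))  # 2016-10-30 02:30:00
print(from_utc_timestamp("2016-10-30 01:30:00", "CET"))  # 2016-10-30 02:30:00
```

Both inputs map to the same wall-clock time because the local clock is set back one hour at the boundary, which is exactly the behavior MySQL shows above and Hive/Impala get wrong before the cut-over.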
[jira] [Updated] (HIVE-14903) from_utc_time function issue for CET daylight savings
[ https://issues.apache.org/jira/browse/HIVE-14903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14903: Description: Based on https://en.wikipedia.org/wiki/Central_European_Summer_Time, the summer time is between 1:00 UTC on the last Sunday of March and 1:00 on the last Sunday of October, see test case below: Impala: {code} select from_utc_timestamp('2016-10-30 00:30:00','CET'); Query: select from_utc_timestamp('2016-10-30 00:30:00','CET') +--+ | from_utc_timestamp('2016-10-30 00:30:00', 'cet') | +--+ | 2016-10-30 01:30:00 | +--+ {code} Hive: {code} select from_utc_timestamp('2016-10-30 00:30:00','CET'); INFO : OK ++--+ | _c0 | ++--+ | 2016-10-30 01:30:00.0 | ++--+ {code} MySQL: {code} mysql> SELECT CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ); +---+ | CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ) | +---+ | 2016-10-30 02:30:00 | +---+ {code} At 00:30AM UTC, the daylight saving has not finished so the time different should still be 2 hours rather than 1. MySQL returned correct result At 1:30, results are correct: Impala: {code} Query: select from_utc_timestamp('2016-10-30 01:30:00','CET') +--+ | from_utc_timestamp('2016-10-30 01:30:00', 'cet') | +--+ | 2016-10-30 02:30:00 | +--+ Fetched 1 row(s) in 0.01s {code} Hive: {code} ++--+ | _c0 | ++--+ | 2016-10-30 02:30:00.0 | ++--+ 1 row selected (0.252 seconds) {code} MySQL: {code} mysql> SELECT CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ); +---+ | CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ) | +---+ | 2016-10-30 02:30:00 | +---+ 1 row in set (0.00 sec) {code} Seems like a bug. 
was: Based on https://en.wikipedia.org/wiki/Central_European_Summer_Time, the summer time is between 1:00 UTC on the last Sunday of March and 1:00 on the last Sunday of October, see test case below: Impala: {code} [host-10-17-101-195.coe.cloudera.com:25003] > select from_utc_timestamp('2016-10-30 00:30:00','CET'); Query: select from_utc_timestamp('2016-10-30 00:30:00','CET') +--+ | from_utc_timestamp('2016-10-30 00:30:00', 'cet') | +--+ | 2016-10-30 01:30:00 | +--+ {code} Hive: {code} 0: jdbc:hive2://host-10-17-101-195.coe.cloude> select from_utc_timestamp('2016-10-30 00:30:00','CET'); INFO : OK ++--+ | _c0 | ++--+ | 2016-10-30 01:30:00.0 | ++--+ {code} MySQL: {code} mysql> SELECT CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ); +---+ | CONVERT_TZ( '2016-10-30 00:30:00', 'UTC', 'CET' ) | +---+ | 2016-10-30 02:30:00 | +---+ {code} At 00:30AM UTC, the daylight saving has not finished so the time different should still be 2 hours rather than 1. MySQL returned correct result At 1:30, results are correct: Impala: {code} Query: select from_utc_timestamp('2016-10-30 01:30:00','CET') +--+ | from_utc_timestamp('2016-10-30 01:30:00', 'cet') | +--+ | 2016-10-30 02:30:00 | +--+ Fetched 1 row(s) in 0.01s {code} Hive: {code} ++--+ | _c0 | ++--+ | 2016-10-30 02:30:00.0 | ++--+ 1 row selected (0.252 seconds) {code} MySQL: {code} mysql> SELECT CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ); +---+ | CONVERT_TZ( '2016-10-30 01:30:00', 'UTC', 'CET' ) | +---+ | 2016-10-30 02:30:00 | +---+ 1 row in set (0.00 sec) {code} Seems like a bug.
[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Status: Patch Available (was: Open) Added audit log for add, drop and rename partitions > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-14482.patch > > > When running: > {code} > ALTER TABLE test DROP PARTITION (b=140); > {code} > I only see the following in the HMS log: > {code} > 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,082 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154094 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : > db=default tbl=test > 2016-08-08 23:12:34,096 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr > : db=default tbl=test > 2016-08-08 23:12:34,112 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > 
[pool-4-thread-2]: start=1470723154095 end=1470723154112 duration=17 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,172 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,173 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154186 duration=14 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,187 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test > 2016-08-08 23:12:34,187 INFO > org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: > ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default > tbl=test > 2016-08-08 23:12:34,199 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: end=1470723154199 duration=13 > from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 > retryCount=0 error=false> > 2016-08-08 23:12:34,203 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [pool-4-thread-2]: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler> > 2016-08-08 23:12:34,215 INFO org.apache.hadoop.hive.metastore.ObjectStore: > [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported > only on partition keys of type string > 2016-08-08 23:12:34,226 ERROR 
org.apache.hadoop.hdfs.KeyProviderCache: > [pool-4-thread-2]: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > 2016-08-08 23:12:34,239 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: > [pool-4-thread-2]: dropPartition() will move partition-directories to > trash-directory. > 2016-08-08 23:12:34,239 INFO hive.metastore.hivemetastoressimpl: > [pool-4-thread-2]: deleting > hdfs://:8020/user/hive/warehouse/default/test/b=140 > 2016-08-08 23:12:34,247 INFO org.apache.hadoop.fs.TrashPolicyDefault: > [pool-4-thread-2]: Moved: > 'hdfs://:8020/user/hive/warehouse/default/test/b=140' to trash at: > hdfs://:8020/user/hive/.Trash/Current/user/hive/warehouse/default/test/b=140 >
[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Attachment: HIVE-14482.patch > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor > Attachments: HIVE-14482.patch >
[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Summary: Add and drop table partition is not audit logged in HMS (was: Drop table partition is not audit logged in HMS) > Add and drop table partition is not audit logged in HMS > --- > > Key: HIVE-14482 > URL: https://issues.apache.org/jira/browse/HIVE-14482 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Minor >
[jira] [Commented] (HIVE-14537) Please add HMS audit logs for ADD and CHANGE COLUMN operations
[ https://issues.apache.org/jira/browse/HIVE-14537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420528#comment-15420528 ] Eric Lin commented on HIVE-14537: - This one seems a bit more complicated than HIVE-14482, so I created this separately. > Please add HMS audit logs for ADD and CHANGE COLUMN operations > -- > > Key: HIVE-14537 > URL: https://issues.apache.org/jira/browse/HIVE-14537 > Project: Hive > Issue Type: Improvement >Reporter: Eric Lin >Priority: Minor > > Currently if you ALTER TABLE test ADD COLUMNS (c int), the only audit log we > can see is: > {code} > 2016-08-09T13:29:56,411 INFO [pool-6-thread-2]: metastore.HiveMetaStore > (HiveMetaStore.java:logInfo(754)) - 2: source:127.0.0.1 alter_table: > db=default tbl=test newtbl=test > {code} > This is not enough to tell which columns are added or changed. It would be > useful to add such information. > Ideally we could see: > {code} > 2016-08-09T13:29:56,411 INFO [pool-6-thread-2]: metastore.HiveMetaStore > (HiveMetaStore.java:logInfo(754)) - 2: source:127.0.0.1 alter_table: > db=default tbl=test newtbl=test newCol=c[int] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
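The requested audit line is just the existing key=value format with the column details appended; a minimal sketch of such a formatter (the function name and call are illustrative, not the actual HiveMetaStore logging code):

```python
def format_audit_message(cmd: str, **fields: str) -> str:
    """Build an HMS-style audit line such as
    'alter_table: db=default tbl=test newtbl=test newCol=c[int]'.
    Illustrative only; the real audit logger lives in HiveMetaStore.java."""
    parts = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"{cmd}: {parts}"

# The richer audit line proposed above, including the added column and its type:
msg = format_audit_message("alter_table", db="default", tbl="test",
                           newtbl="test", newCol="c[int]")
print(msg)  # alter_table: db=default tbl=test newtbl=test newCol=c[int]
```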
[jira] [Updated] (HIVE-14482) Drop table partition is not audit logged in HMS
[ https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-14482: Description: When running: {code} ALTER TABLE test DROP PARTITION (b=140); {code} I only see the following in the HMS log: {code} 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : db=default tbl=test 2016-08-08 23:12:34,096 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr : db=default tbl=test 2016-08-08 23:12:34,112 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,172 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,173 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,186 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,187 INFO 
org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,187 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default tbl=test 2016-08-08 23:12:34,199 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,203 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,215 INFO org.apache.hadoop.hive.metastore.ObjectStore: [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported only on partition keys of type string 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: [pool-4-thread-2]: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !! 2016-08-08 23:12:34,239 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: dropPartition() will move partition-directories to trash-directory. 2016-08-08 23:12:34,239 INFO hive.metastore.hivemetastoressimpl: [pool-4-thread-2]: deleting hdfs://:8020/user/hive/warehouse/default/test/b=140 2016-08-08 23:12:34,247 INFO org.apache.hadoop.fs.TrashPolicyDefault: [pool-4-thread-2]: Moved: 'hdfs://:8020/user/hive/warehouse/default/test/b=140' to trash at: hdfs://:8020/user/hive/.Trash/Current/user/hive/warehouse/default/test/b=140 2016-08-08 23:12:34,247 INFO hive.metastore.hivemetastoressimpl: [pool-4-thread-2]: Moved to trash: hdfs://:8020/user/hive/warehouse/default/test/b=140 2016-08-08 23:12:34,247 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: {code} There is no entry in the "HiveMetaStore.audit" to show that partition b=140 was dropped. 
When we add a new partition, we can see the following: {code} 2016-08-08 23:04:48,534 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx append_partition : db=default tbl=test[130] {code} Ideally we should see a similar message when dropping partitions. was: When running: {code} ALTER TABLE test DROP PARTITION (b=140); {code} I only see the following in the HMS log: {code} 2016-08-08 23:12:34,081 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=case_104408 tbl=test 2016-08-08 23:12:34,082 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=case_104408 tbl=test 2016-08-08 23:12:34,094 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,095 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [pool-4-thread-2]: 2016-08-08 23:12:34,095 INFO
[jira] [Updated] (HIVE-13160) HS2 unable to load UDFs on startup when HMS is not ready
[ https://issues.apache.org/jira/browse/HIVE-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-13160: Description: The error looks like this: {code} 2016-02-18 14:43:54,251 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:48:54,692 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:48:54,692 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:48:55,692 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:53:55,800 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:53:55,800 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:53:56,801 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 14:58:56,967 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 14:58:56,967 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 14:58:57,994 WARN hive.ql.metadata.Hive: [main]: Failed to register all functions. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1492) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:64) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2915) ... 
016-02-18 14:58:57,997 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:03:58,094 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:03:58,095 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:03:59,095 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:08:59,203 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:08:59,203 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:09:00,203 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-18 15:14:00,304 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-18 15:14:00,304 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-18 15:14:01,306 INFO org.apache.hive.service.server.HiveServer2: [main]: Shutting down HiveServer2 2016-02-18 15:14:01,308 INFO org.apache.hive.service.server.HiveServer2: [main]: Exception caught when calling stop of HiveServer2 before retrying start java.lang.NullPointerException at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:283) at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:351) at org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:69) at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:545) {code} And then none of the functions will be available for use as HS2 does not re-register them after HMS is up and ready. This is not desired behaviour, we shouldn't allow HS2 to be in a servicing state if function list is not ready. 
Alternatively, instead of initializing the function list when HS2 starts, we could load it when each Hive session is created. Of course we could keep a cache of the function list somewhere for better performance, but it would be better to decouple it from the Hive class. was: The error looks like this: {code} 2016-02-24 21:16:09,901 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-24 21:16:09,971 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-24 21:16:09,971 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-24 21:16:10,971 INFO hive.metastore: [main]: Trying to connect to metastore with URI thrift://host-10-17-81-201.coe.cloudera.com:9083 2016-02-24 21:16:10,975 WARN hive.metastore: [main]: Failed to connect to the MetaStore Server... 2016-02-24 21:16:10,976 INFO hive.metastore: [main]: Waiting 1 seconds before next connection attempt. 2016-02-24 21:16:11,976 INFO hive.metastore: [main]: Trying to connect to metastore with URI
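The fail-fast behavior argued for above, retry while the metastore is down, but never start serving half-initialized, can be sketched as a small bounded retry loop. The function names, attempt count, and exception types here are hypothetical, not Hive's actual code:

```python
import time

def connect_with_retry(connect, attempts=3, wait_seconds=1.0):
    """Try to reach the metastore a bounded number of times; if it never
    comes up, fail loudly instead of continuing in a half-initialized state."""
    last_err = None
    for _ in range(attempts):
        try:
            return connect()  # e.g. open a Thrift client to HMS
        except ConnectionError as err:
            last_err = err
            time.sleep(wait_seconds)
    raise RuntimeError("metastore unavailable; refusing to serve requests") from last_err
```

The key design choice is the final raise: rather than swallowing the failure and leaving the function registry empty (the bug described above), startup aborts so the operator can restart HS2 once HMS is ready.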
[jira] [Updated] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-12788: Description: See the test case below: {code} 0: jdbc:hive2://localhost:1/default> create table test (a int); 0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1); 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true; No rows affected (0.01 seconds) 0: jdbc:hive2://localhost:1/default> set hive.mapred.supports.subdirectories=true; No rows affected (0.007 seconds) 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test; +--+--+ | _u1._c0 | +--+--+ +--+--+ {code} Run the same query without setting hive.mapred.supports.subdirectories and hive.optimize.union.remove to true will give correct result: {code} 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test; +--+--+ | _u1._c0 | +--+--+ | 1| | 1| +--+--+ {code} UNION ALL without COUNT function will work as expected: {code} 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * FROM test; ++--+ | _u1.a | ++--+ | 1 | | 1 | ++--+ {code} was: See the test case below: {code} 0: jdbc:hive2://localhost:1/default> create table test (a int); 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true; No rows affected (0.01 seconds) 0: jdbc:hive2://localhost:1/default> set hive.mapred.supports.subdirectories=true; No rows affected (0.007 seconds) 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test; +--+--+ | _u1._c0 | +--+--+ +--+--+ {code} Run the same query without setting hive.mapred.supports.subdirectories and hive.optimize.union.remove to true will give correct result: {code} 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test; +--+--+ | _u1._c0 | +--+--+ | 1| | 1| +--+--+ {code} UNION ALL without COUNT function will work as 
expected: {code} 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * FROM test; ++--+ | _u1.a | ++--+ | 1 | | 1 | ++--+ {code} > Setting hive.optimize.union.remove to TRUE will break UNION ALL with > aggregate functions > > > Key: HIVE-12788 > URL: https://issues.apache.org/jira/browse/HIVE-12788 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.1 >Reporter: Eric Lin > > See the test case below: > {code} > 0: jdbc:hive2://localhost:1/default> create table test (a int); > 0: jdbc:hive2://localhost:1/default> insert overwrite table test values > (1); > 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true; > No rows affected (0.01 seconds) > 0: jdbc:hive2://localhost:1/default> set > hive.mapred.supports.subdirectories=true; > No rows affected (0.007 seconds) > 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL > SELECT COUNT(1) FROM test; > +--+--+ > | _u1._c0 | > +--+--+ > +--+--+ > {code} > Run the same query without setting hive.mapred.supports.subdirectories and > hive.optimize.union.remove to true will give correct result: > {code} > 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL > SELECT COUNT(1) FROM test; > +--+--+ > | _u1._c0 | > +--+--+ > | 1| > | 1| > +--+--+ > {code} > UNION ALL without COUNT function will work as expected: > {code} > 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT > * FROM test; > ++--+ > | _u1.a | > ++--+ > | 1 | > | 1 | > ++--+ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
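The expected behavior, each branch of the UNION ALL aggregating independently and both rows surviving, can be checked against another SQL engine; a quick sketch using Python's sqlite3 purely as a stand-in reference engine, not as Hive:

```python
import sqlite3

# Reproduce the test table from the report in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INT)")
conn.execute("INSERT INTO test VALUES (1)")

# Each branch computes its own COUNT, so UNION ALL must return two rows.
rows = conn.execute(
    "SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test"
).fetchall()
print(rows)  # [(1,), (1,)]
```

This matches the correct Hive output shown above when hive.optimize.union.remove is left at false; with the optimization enabled, Hive 1.1.1 returns no rows at all.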
[jira] [Updated] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-12788:

Description:

See the test case below:
{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);
0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1);
0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)
0: jdbc:hive2://localhost:1/default> set hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0 |
+--+--+
+--+--+
{code}
UNION ALL without the COUNT function works as expected:
{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * FROM test;
++--+
| _u1.a |
++--+
| 1 |
| 1 |
++--+
{code}
Running the same query without setting hive.mapred.supports.subdirectories and hive.optimize.union.remove to true gives the correct result:
{code}
0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove;
+---+--+
| set |
+---+--+
| hive.optimize.union.remove=false |
+---+--+
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0 |
+--+--+
| 1 |
| 1 |
+--+--+
{code}

was:

See the test case below:
{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);
0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1);
0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)
0: jdbc:hive2://localhost:1/default> set hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0 |
+--+--+
+--+--+
{code}
Run the same query without setting hive.mapred.supports.subdirectories and hive.optimize.union.remove to true will give correct result:
{code}
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0 |
+--+--+
| 1 |
| 1 |
+--+--+
{code}
UNION ALL without COUNT function will work as expected:
{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * FROM test;
++--+
| _u1.a |
++--+
| 1 |
| 1 |
++--+
{code}

> Setting hive.optimize.union.remove to TRUE will break UNION ALL with
> aggregate functions
>
> Key: HIVE-12788
> URL: https://issues.apache.org/jira/browse/HIVE-12788
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1
> Reporter: Eric Lin
>

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12506) SHOW CREATE TABLE command creates a table that does not work for RCFile format
[ https://issues.apache.org/jira/browse/HIVE-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-12506:

Description:

See the following test case:

1) Create a table with RCFile format:
{code}
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE;
{code}
2) Run "DESC FORMATTED test":
{code}
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}
This shows that the SerDe used is "ColumnarSerDe".

3) Run "SHOW CREATE TABLE" and get the output:
{code}
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1448343875')
{code}
Note that there is no mention of "ColumnarSerDe".

4) Drop the table, then create it again using the output from 3).

5) Check the output of "DESC FORMATTED test":
{code}
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}
The SerDe falls back to "LazySimpleSerDe", which is not correct. Any further query that tries to INSERT into or SELECT from this table will fail with errors.

I suspect that ROW FORMAT DELIMITED cannot be specified together with ROW FORMAT SERDE at table creation time. This confuses end users, because copying a table structure using "SHOW CREATE TABLE" does not work.

was:

See the following test case:

1) Create a table with RCFile format:
{code}
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE;
{code}
2) run "DESC FORMATTED test"
{code}
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}
shows that SerDe used is "ColumnarSerDe"

3) run "SHOW CREATE TABLE" and get the output:
{code}
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1448343875')
{code}
Note that there is no mention of "ColumnarSerDe"

4) Drop the table and then create the table again using the output from 3)

5) Check the output of "DESC FORMATTED test"
{code}
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}
The SerDe falls back to "LazySimpleSerDe", which is not correct. Any further query tries to INSERT or SELECT this table will fail with errors

> SHOW CREATE TABLE command creates a table that does not work for RCFile format
>
> Key: HIVE-12506
> URL: https://issues.apache.org/jira/browse/HIVE-12506
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 1.1.1
> Reporter: Eric Lin
>
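A recreate DDL that preserves the original SerDe would name it explicitly instead of using ROW FORMAT DELIMITED. This is a hedged sketch, not the output Hive actually produces: the 'field.delim' SERDEPROPERTIES key is an assumption about how the '|' delimiter is recorded, so verify the result with DESC FORMATTED after creation:

```sql
-- Sketch: recreate the table with ColumnarSerDe named explicitly, so the
-- SerDe does not fall back to LazySimpleSerDe on recreation.
-- The 'field.delim' property key is an assumption; verify with DESC FORMATTED.
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
WITH SERDEPROPERTIES ('field.delim'='|')
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileOutputFormat';
```

If this form round-trips, it suggests the fix for SHOW CREATE TABLE is to emit the ROW FORMAT SERDE clause whenever the table's SerDe differs from the default for its ROW FORMAT DELIMITED rendering.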
[jira] [Updated] (HIVE-12368) Provide support for different versions of same JAR files for loading UDFs
[ https://issues.apache.org/jira/browse/HIVE-12368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-12368:

Priority: Minor (was: Major)

> Provide support for different versions of same JAR files for loading UDFs
>
> Key: HIVE-12368
> URL: https://issues.apache.org/jira/browse/HIVE-12368
> Project: Hive
> Issue Type: New Feature
> Components: HiveServer2
> Reporter: Eric Lin
> Assignee: Vaibhav Gumashta
> Priority: Minor
>
> If we want to set up one cluster to support multiple environments, namely DEV, QA, PRE-PROD, etc., this is done by generating data from different environments into different locations in HDFS and in different Hive databases.
> This works fine; however, when we need to deploy UDF classes for different environments, it becomes tricky, as each class has the same namespace, even though we have created udf-dev.jar, udf-qa.jar, etc.
> Running one HS2 per environment is another option; however, with a load-balanced setup, it becomes harder.
> The request is to have HS2 support loading UDFs in such an environment; the implementation is open to discussion.
> I know that this setup is not ideal, as the better approach is to have one cluster per environment. However, with a limited number of nodes, this might be the only option, and I believe many people can benefit from it.
> Thanks

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
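One partial workaround worth noting is Hive's database-qualified permanent functions (CREATE FUNCTION ... USING JAR, available since Hive 0.13). In this sketch the database names, class name, and jar paths are all hypothetical placeholders, and it does not provide true classloader isolation between jar versions, which is presumably why this feature request exists:

```sql
-- Hypothetical example: register the same UDF class once per environment
-- database, each pointing at its own jar version (paths are placeholders).
CREATE FUNCTION dev_db.my_udf AS 'com.example.MyUDF'
  USING JAR 'hdfs:///udfs/udf-dev.jar';
CREATE FUNCTION qa_db.my_udf AS 'com.example.MyUDF'
  USING JAR 'hdfs:///udfs/udf-qa.jar';

-- Callers pick the environment via the database-qualified function name:
SELECT dev_db.my_udf(id) FROM dev_db.events;
```

Because both jars can end up on the same session classpath, this only helps when the environment jars do not need to be loaded simultaneously with conflicting class definitions.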
[jira] [Updated] (HIVE-12368) Provide support for different versions of same JAR files for loading UDFs
[ https://issues.apache.org/jira/browse/HIVE-12368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Lin updated HIVE-12368:

Description:

If we want to set up one cluster to support multiple environments, namely DEV, QA, PRE-PROD, etc., this is done by generating data from different environments into different locations in HDFS and in different Hive databases.

This works fine; however, when we need to deploy UDF classes for different environments, it becomes tricky, as each class has the same namespace, even though we have created udf-dev.jar, udf-qa.jar, etc.

Running one HS2 per environment is another option; however, with a load-balanced setup, it becomes harder.

The request is to have HS2 support loading UDFs in such an environment; the implementation is open to discussion.

I know that this setup is not ideal, as the better approach is to have one cluster per environment. However, with a limited number of nodes, this might be the only option, and I believe many people can benefit from it.

Thanks

was:

If we want to setup one cluster to support multiple environments, namely DEV, QA, PRE-PROD etc, this is done in the way that data from different environment will be generated into different locations in the HDFS and Hive Databases.

This works fine, however, when need to deploy UDF classes for different environments, it becomes tricky, as each class has the same namespace, even though we have created udf-dev.jar, udf-qa.jar etc.

Creating each HS2 per environment is another option, however, with LB setup, it becomes harder.

The request is to have HS2 support loading UDFs in such environment, the implementation is open to discussion.

I know that this setup is no ideal, as the better approach is to have one cluster per environment, however, in the case that you have limited number of nodes in the setup, this might be the only option and I believe many people can benefit from it.

Thanks

> Provide support for different versions of same JAR files for loading UDFs
>
> Key: HIVE-12368
> URL: https://issues.apache.org/jira/browse/HIVE-12368
> Project: Hive
> Issue Type: New Feature
> Components: HiveServer2
> Reporter: Eric Lin
> Assignee: Vaibhav Gumashta
> Priority: Minor
>

-- This message was sent by Atlassian JIRA (v6.3.4#6332)