[jira] [Work logged] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?focusedWorklogId=622859=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622859 ] ASF GitHub Bot logged work on HIVE-25246: - Author: ASF GitHub Bot Created on: 15/Jul/21 05:54 Start Date: 15/Jul/21 05:54 Worklog Time Spent: 10m Work Description: pkumarsinha merged pull request #2396: URL: https://github.com/apache/hive/pull/2396 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622859) Time Spent: 5h 10m (was: 5h) > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha resolved HIVE-25246. - Resolution: Fixed Committed to master. Thanks for the patch, [~haymant] !! > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381053#comment-17381053 ] Pravin Sinha commented on HIVE-25246: - +1 > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25331) Create database query doesn't create MANAGEDLOCATION directory
[ https://issues.apache.org/jira/browse/HIVE-25331?focusedWorklogId=622858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622858 ] ASF GitHub Bot logged work on HIVE-25331: - Author: ASF GitHub Bot Created on: 15/Jul/21 05:44 Start Date: 15/Jul/21 05:44 Worklog Time Spent: 10m Work Description: ujc714 opened a new pull request #2478: URL: https://github.com/apache/hive/pull/2478 ### What changes were proposed in this pull request? Use a default directory for MANAGEDLOCATION if it's not assigned in CREATE DATABASE query. ### Why are the changes needed? HMS doesn't create MANAGEDLOCATION directory if it's NULL. If we run a CTAS query immediately after the CREATE DATABASE query and the staging directory is not under the MANAGEDLOCATION directory, the CTAS query will fail in MOVE task. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? mvn test -Dtest=TestMiniTezCliDriver -Dqfile=create_database.q -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622858) Remaining Estimate: 0h Time Spent: 10m > Create database query doesn't create MANAGEDLOCATION directory > -- > > Key: HIVE-25331 > URL: https://issues.apache.org/jira/browse/HIVE-25331 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > If we don't assign MANAGEDLOCATION in a "create database" query, the > MANAGEDLOCATION will be NULL so HMS doesn't create the directory. In this > case, a CTAS query immediately after the CREATE DATABASE query might fail in > MOVE task due to "destination's parent does not exist". 
I can use the > following script to reproduce this issue: > {code:java} > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > create database testdb location '/tmp/testdb.db'; > create table testdb.test as select 1; > {code} > If the staging directory is under the MANAGEDLOCATION directory, the CTAS > query succeeds because the MANAGEDLOCATION directory is created while creating the > staging directory. Since we set LOCATION to a default directory when LOCATION > is not assigned in the CREATE DATABASE query, I believe it's worth setting > MANAGEDLOCATION to a default directory, too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
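The defaulting proposed in this issue can be illustrated with a small sketch. It is not the code from pull request #2478: the class `ManagedLocationDefaults`, the method `resolveManagedLocation`, the managed warehouse root path, and the `<root>/<db>.db` layout are all assumptions made for the example.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hive: illustrates falling back to a default
// managed location when CREATE DATABASE does not assign MANAGEDLOCATION.
final class ManagedLocationDefaults {
    private ManagedLocationDefaults() {}

    /**
     * Returns the explicit managed location when one was given; otherwise derives
     * a default under the assumed managed warehouse root, so that the directory
     * is known (and can be created) before a later CTAS needs it.
     */
    static String resolveManagedLocation(String dbName, String managedLocation, String managedRoot) {
        if (managedLocation != null && !managedLocation.trim().isEmpty()) {
            return managedLocation;
        }
        // Drop a trailing slash so the joined path has exactly one separator.
        String root = managedRoot.endsWith("/")
                ? managedRoot.substring(0, managedRoot.length() - 1)
                : managedRoot;
        return root + "/" + dbName.toLowerCase(Locale.ROOT) + ".db";
    }

    public static void main(String[] args) {
        // No MANAGEDLOCATION assigned: the sketch falls back to the default path.
        System.out.println(resolveManagedLocation("testdb", null, "/warehouse/managed"));
    }
}
```

With a non-null default, the metastore would have a concrete path to create at CREATE DATABASE time, which is the condition the MOVE task in the reproduction script depends on.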
[jira] [Updated] (HIVE-25331) Create database query doesn't create MANAGEDLOCATION directory
[ https://issues.apache.org/jira/browse/HIVE-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25331: -- Labels: pull-request-available (was: ) > Create database query doesn't create MANAGEDLOCATION directory > -- > > Key: HIVE-25331 > URL: https://issues.apache.org/jira/browse/HIVE-25331 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If we don't assign MANAGEDLOCATION in a "create database" query, the > MANAGEDLOCATION will be NULL so HMS doesn't create the directory. In this > case, a CTAS query immediately after the CREATE DATABASE query might fail in > the MOVE task due to "destination's parent does not exist". I can use the > following script to reproduce this issue: > {code:java} > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > create database testdb location '/tmp/testdb.db'; > create table testdb.test as select 1; > {code} > If the staging directory is under the MANAGEDLOCATION directory, the CTAS > query succeeds because the MANAGEDLOCATION directory is created while creating the > staging directory. Since we set LOCATION to a default directory when LOCATION > is not assigned in the CREATE DATABASE query, I believe it's worth setting > MANAGEDLOCATION to a default directory, too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25330) Make FS calls in CopyUtils retryable
[ https://issues.apache.org/jira/browse/HIVE-25330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha reassigned HIVE-25330: --- > Make FS calls in CopyUtils retryable > > > Key: HIVE-25330 > URL: https://issues.apache.org/jira/browse/HIVE-25330 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-22626) Fix Replication related tests
[ https://issues.apache.org/jira/browse/HIVE-22626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-22626: Description: For TestStatsReplicationScenariosACIDNoAutogather: this test already runs "alone", but it still sometimes runs for more than 40m, which results in a timeout. A Jira search reveals that this was pretty common: [https://issues.apache.org/jira/issues/?jql=text%20~%20%22TestStatsReplicationScenariosACIDNoAutogather%22%20order%20by%20updated%20desc] From the Hive logs: * it seems that a few minutes after this test starts there is an exception: {code:java} 2019-12-10T22:43:19,594 DEBUG [Finalizer] metastore.HiveMetaStoreClient: Unable to shutdown metastore client. Will try closing transport directly. org.apache.thrift.transport.TTransportException: java.net.SocketException: Socket closed at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) ~[libthrift-0.9.3-1.jar:0.9.3-1] at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[libthrift-0.9.3-1.jar:0.9.3-1] at org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) ~[libthrift-0.9.3-1.jar:0.9.3-1] at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436) ~[libfb303-0.9.3.jar:?] at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) ~[libfb303-0.9.3.jar:?] 
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:776) [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_102] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_102] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_102] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102] at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at com.sun.proxy.$Proxy62.close(Unknown Source) [?:?] at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:542) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.finalize(Hive.java:514) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.lang.System$2.invokeFinalize(System.java:1270) [?:1.8.0_102] at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) [?:1.8.0_102] at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [?:1.8.0_102] at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) [?:1.8.0_102] Caused by: java.net.SocketException: Socket closed at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) ~[?:1.8.0_102] at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[?:1.8.0_102] at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_102] at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_102] at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159) ~[libthrift-0.9.3-1.jar:0.9.3-1] {code} * after that some NoSuchObjectExceptions follow * and then some replications seems to happen I don't fully understand this; I'll attach the logs... 
was: this test is running "alone" because but still; it sometimes runs more than 40m which results in a timeout a jira search reveals that was pretty common: https://issues.apache.org/jira/issues/?jql=text%20~%20%22TestStatsReplicationScenariosACIDNoAutogather%22%20order%20by%20updated%20desc from the hive logs: * it seems like after a few minutes this test starts there is an exception: {code} 2019-12-10T22:43:19,594 DEBUG [Finalizer] metastore.HiveMetaStoreClient: Unable to shutdown metastore client. Will try closing transport directly. org.apache.thrift.transport.TTransportException: java.net.SocketException: Socket closed at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) ~[libthrift-0.9.3-1.jar:0.9.3-1] at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[libthrift-0.9.3-1.jar:0.9.3-1] at org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) ~[libthrift-0.9.3-1.jar:0.9.3-1] at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436) ~[libfb303-0.9.3.jar:?] at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) ~[libfb303-0.9.3.jar:?] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:776) [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at
[jira] [Updated] (HIVE-22626) Fix Replication related tests
[ https://issues.apache.org/jira/browse/HIVE-22626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-22626: Summary: Fix Replication related tests (was: Fix TestStatsReplicationScenariosACIDNoAutogather) > Fix Replication related tests > - > > Key: HIVE-22626 > URL: https://issues.apache.org/jira/browse/HIVE-22626 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Zoltan Haindrich >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Attachments: qalogs.tgz > > Time Spent: 1h > Remaining Estimate: 0h > > this test already runs "alone", but it still sometimes runs for more than > 40m, which results in a timeout. > A Jira search reveals that this was pretty common: > https://issues.apache.org/jira/issues/?jql=text%20~%20%22TestStatsReplicationScenariosACIDNoAutogather%22%20order%20by%20updated%20desc > From the Hive logs: > * it seems that a few minutes after this test starts there is an exception: > {code} > 2019-12-10T22:43:19,594 DEBUG [Finalizer] metastore.HiveMetaStoreClient: > Unable to shutdown metastore client. Will try closing transport directly. > org.apache.thrift.transport.TTransportException: java.net.SocketException: > Socket closed > at > org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) > ~[libthrift-0.9.3-1.jar:0.9.3-1] > at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) > ~[libthrift-0.9.3-1.jar:0.9.3-1] > at > org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) > ~[libthrift-0.9.3-1.jar:0.9.3-1] > at > com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436) > ~[libfb303-0.9.3.jar:?] > at > com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) > ~[libfb303-0.9.3.jar:?] 
> at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:776) > [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_102] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_102] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_102] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) > [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at com.sun.proxy.$Proxy62.close(Unknown Source) [?:?] > at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:542) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.finalize(Hive.java:514) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at java.lang.System$2.invokeFinalize(System.java:1270) [?:1.8.0_102] > at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) > [?:1.8.0_102] > at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [?:1.8.0_102] > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) > [?:1.8.0_102] > Caused by: java.net.SocketException: Socket closed > at > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) > ~[?:1.8.0_102] > at java.net.SocketOutputStream.write(SocketOutputStream.java:153) > ~[?:1.8.0_102] > at > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > ~[?:1.8.0_102] > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > ~[?:1.8.0_102] > at > org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159) > ~[libthrift-0.9.3-1.jar:0.9.3-1] > {code} > * after that some NoSuchObjectExceptions follow > * and then some replications seems to happen > I don't fully understand this; I'll attach the logs... 
[jira] [Issue Comment Deleted] (HIVE-25329) CTAS creates a managed table as non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Hemanth Gantasala updated HIVE-25329: - Comment: was deleted (was: I think HIVE-24625 is causing this issue. [~robbiezhang] Can you please verify if that is the case?) > CTAS creates a managed table as non-ACID table > -- > > Key: HIVE-25329 > URL: https://issues.apache.org/jira/browse/HIVE-25329 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > According to HIVE-22158, MANAGED tables should be ACID tables only. When we > set hive.create.as.external.legacy to true, a query like 'create managed > table as select 1' creates a non-ACID table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25329) CTAS creates a managed table as non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380921#comment-17380921 ] Sai Hemanth Gantasala commented on HIVE-25329: -- I think HIVE-24625 is causing this issue. [~robbiezhang] Can you please verify if that is the case? > CTAS creates a managed table as non-ACID table > -- > > Key: HIVE-25329 > URL: https://issues.apache.org/jira/browse/HIVE-25329 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > According to HIVE-22158, MANAGED tables should be ACID tables only. When we > set hive.create.as.external.legacy to true, a query like 'create managed > table as select 1' creates a non-ACID table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=622810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622810 ] ASF GitHub Bot logged work on HIVE-25329: - Author: ASF GitHub Bot Created on: 15/Jul/21 00:58 Start Date: 15/Jul/21 00:58 Worklog Time Spent: 10m Work Description: ujc714 opened a new pull request #2477: URL: https://github.com/apache/hive/pull/2477 ### What changes were proposed in this pull request? This change makes a "create managed table" query create a managed table regardless of "hive.create.as.external.legacy=true". ### Why are the changes needed? According to HIVE-22158, MANAGED tables should be ACID tables only. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? mvn test -Dtest=TestMiniTezCliDriver -Dqfile=create_table.q -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622810) Remaining Estimate: 0h Time Spent: 10m > CTAS creates a managed table as non-ACID table > -- > > Key: HIVE-25329 > URL: https://issues.apache.org/jira/browse/HIVE-25329 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > According to HIVE-22158, MANAGED tables should be ACID tables only. When we > set hive.create.as.external.legacy to true, a query like 'create managed > table as select 1' creates a non-ACID table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25329) CTAS creates a managed table as non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25329: -- Labels: pull-request-available (was: ) > CTAS creates a managed table as non-ACID table > -- > > Key: HIVE-25329 > URL: https://issues.apache.org/jira/browse/HIVE-25329 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > According to HIVE-22158, MANAGED tables should be ACID tables only. When we > set hive.create.as.external.legacy to true, a query like 'create managed > table as select 1' creates a non-ACID table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25282) Drop/Alter table in REMOTE db should fail
[ https://issues.apache.org/jira/browse/HIVE-25282?focusedWorklogId=622737=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622737 ] ASF GitHub Bot logged work on HIVE-25282: - Author: ASF GitHub Bot Created on: 14/Jul/21 21:07 Start Date: 14/Jul/21 21:07 Worklog Time Spent: 10m Work Description: dantongdong closed pull request #2450: URL: https://github.com/apache/hive/pull/2450 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622737) Time Spent: 0.5h (was: 20m) > Drop/Alter table in REMOTE db should fail > - > > Key: HIVE-25282 > URL: https://issues.apache.org/jira/browse/HIVE-25282 > Project: Hive > Issue Type: Sub-task >Reporter: Dantong Dong >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Drop/Alter table statements should be explicitly rejected in a REMOTE database. > This is consistent with HIVE-24425: Create table in REMOTE db should fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25282) Drop/Alter table in REMOTE db should fail
[ https://issues.apache.org/jira/browse/HIVE-25282?focusedWorklogId=622738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622738 ] ASF GitHub Bot logged work on HIVE-25282: - Author: ASF GitHub Bot Created on: 14/Jul/21 21:07 Start Date: 14/Jul/21 21:07 Worklog Time Spent: 10m Work Description: dantongdong commented on pull request #2450: URL: https://github.com/apache/hive/pull/2450#issuecomment-880210387 Fix has been committed to master. Closing this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622738) Time Spent: 40m (was: 0.5h) > Drop/Alter table in REMOTE db should fail > - > > Key: HIVE-25282 > URL: https://issues.apache.org/jira/browse/HIVE-25282 > Project: Hive > Issue Type: Sub-task >Reporter: Dantong Dong >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Drop/Alter table statements should be explicitly rejected in a REMOTE database. > This is consistent with HIVE-24425: Create table in REMOTE db should fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25323) Fix TestVectorCastStatement
[ https://issues.apache.org/jira/browse/HIVE-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380779#comment-17380779 ] Adesh Kumar Rao commented on HIVE-25323: [~klcopp] I debugged it and figured out that the issue happens because vectorization uses older Java classes (`java.sql.Timestamp` instead of `java.time.Instant`/`ZonedDateTime` etc.) and does not consider time zones while converting to/from timestamp. This is no longer the case with non-vectorized execution. The failing test compares the result of non-vectorized execution with that of vectorized execution and hence fails (because the results differ). I discussed this with [~mmccline]. Fixing the tests will actually require fixing the vectorized timestamp conversion. I will go ahead and raise a PR to comment out the tests and create another jira for the proper fix for vectorized execution. > Fix TestVectorCastStatement > --- > > Key: HIVE-25323 > URL: https://issues.apache.org/jira/browse/HIVE-25323 > Project: Hive > Issue Type: Task >Reporter: Karen Coppage >Assignee: Adesh Kumar Rao >Priority: Major > > org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorCastStatement > tests were timing out after 5 hours. > [http://ci.hive.apache.org/job/hive-flaky-check/307/] > First failure: > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/749/pipeline/242] -- This message was sent by Atlassian Jira (v8.3.4#803005)
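The kind of mismatch described in this comment can be reproduced with plain JDK classes. The sketch below is not Hive's vectorized or non-vectorized conversion code; the class `TimestampZoneDemo` and its two methods are hypothetical names. It only shows that `java.sql.Timestamp.valueOf` interprets text in the JVM default time zone, while a `java.time` conversion with an explicit zone does not depend on that default.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.TimeZone;

// Hypothetical demo class: contrasts a default-zone conversion with an explicit-zone one.
final class TimestampZoneDemo {
    private TimestampZoneDemo() {}

    // Legacy path: Timestamp.valueOf reads "yyyy-MM-dd HH:mm:ss" in the JVM default time zone.
    static long legacyEpochMillis(String text) {
        return Timestamp.valueOf(text).getTime();
    }

    // java.time path: the zone is explicit (UTC here), so the JVM default is irrelevant.
    static long explicitUtcEpochMillis(String text) {
        return LocalDateTime.parse(text.replace(' ', 'T'))
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    public static void main(String[] args) {
        TimeZone saved = TimeZone.getDefault();
        try {
            // A fixed offset keeps the comparison free of daylight-saving effects.
            TimeZone.setDefault(TimeZone.getTimeZone("GMT+02:00"));
            String text = "2021-07-14 12:00:00";
            long diffMillis = explicitUtcEpochMillis(text) - legacyEpochMillis(text);
            System.out.println("difference in hours: " + diffMillis / 3_600_000L);
        } finally {
            TimeZone.setDefault(saved);
        }
    }
}
```

Two code paths that parse the same string this way agree only when the default zone happens to be UTC, which is consistent with a test that compares both execution modes failing on differing results.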
[jira] [Resolved] (HIVE-25282) Drop/Alter table in REMOTE db should fail
[ https://issues.apache.org/jira/browse/HIVE-25282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam resolved HIVE-25282. -- Resolution: Fixed Fix has been committed to master. Closing the jira. Thank you for the contribution [~dantongdong] > Drop/Alter table in REMOTE db should fail > - > > Key: HIVE-25282 > URL: https://issues.apache.org/jira/browse/HIVE-25282 > Project: Hive > Issue Type: Sub-task >Reporter: Dantong Dong >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Drop/Alter table statements should be explicitly rejected in a REMOTE database. > This is consistent with HIVE-24425: Create table in REMOTE db should fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-25307: Attachment: hive-thrift-fix2-02-3_1.patch > Hive Server 2 crashes when Thrift library encounters particular security > protocol issue > --- > > Key: HIVE-25307 > URL: https://issues.apache.org/jira/browse/HIVE-25307 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: pull-request-available > Attachments: hive-thrift-fix2-01-3_1.patch, > hive-thrift-fix2-02-3_1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A RuntimeException thrown by the Thrift library causes Hive Server 2 > to crash on our customer's machine. If you Google this exception, it has been > reported a couple of times over the years but not fixed. A blog (see > references below) says it is an occasional security protocol issue between > Hive Server 2 and a proxy like a Gateway. > One challenge in the older 0.9.3 Thrift version was that the Thrift > TTransportFactory getTransport method declaration throws no Exceptions. > Hence the likely choice of RuntimeException. But that Exception is fatal to > Hive Server 2. > The proposed fix is a workaround: we catch RuntimeException in the inner > class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in > Hive Server 2 and throw the RuntimeException's (inner) cause (e.g. > TSaslTransportException) as a TTransportException. > Once the Thrift library stops throwing RuntimeException or we catch fatal > Throwable exceptions in the Thrift library's TThreadPoolServer's inner class > WorkerProcess run method and display them, the RuntimeException try/catch > clause can be removed. 
> ExceptionClassName: > java.lang.RuntimeException > ExceptionStackTrace: > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
10 more > > References: > [Hive server 2 thrift error - Cloudera Community - > 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293] > Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 > LOG" > HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA > (apache.org) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
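The workaround described in this issue, catching the RuntimeException and rethrowing its cause as a transport-level exception, can be sketched without the Thrift library. Everything below is a stand-in: `TransportExceptionStandIn`, `TransportFactoryStandIn`, and `UnwrappingTransportFactory` are hypothetical names, a `String` takes the place of a transport, and this is not the attached patch. Whether the real override can declare a checked exception depends on the Thrift version in use; the stand-in wrapper simply declares one.

```java
// Stand-in for a checked, transport-level exception (plays the role of TTransportException).
class TransportExceptionStandIn extends Exception {
    TransportExceptionStandIn(String message, Throwable cause) {
        super(message, cause);
    }
}

// Stand-in for a factory whose method declares no checked exceptions,
// so failures can only surface as RuntimeException.
interface TransportFactoryStandIn {
    String getTransport(String underlying);
}

// Hypothetical wrapper: converts a fatal RuntimeException into a recoverable checked exception.
final class UnwrappingTransportFactory {
    private final TransportFactoryStandIn wrapped;

    UnwrappingTransportFactory(TransportFactoryStandIn wrapped) {
        this.wrapped = wrapped;
    }

    String getTransport(String underlying) throws TransportExceptionStandIn {
        try {
            return wrapped.getTransport(underlying);
        } catch (RuntimeException e) {
            // Prefer the inner cause (e.g. the SASL failure); fall back to the wrapper itself.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            throw new TransportExceptionStandIn(cause.getMessage(), cause);
        }
    }

    public static void main(String[] args) {
        UnwrappingTransportFactory factory = new UnwrappingTransportFactory(u -> {
            throw new RuntimeException(new java.io.IOException("No data or no sasl data in the stream"));
        });
        try {
            factory.getTransport("socket");
        } catch (TransportExceptionStandIn e) {
            System.out.println("connection rejected: " + e.getMessage());
        }
    }
}
```

The design point is that a worker thread which expects a transport-level exception can drop the one bad connection and continue, whereas an unchecked exception escaping the same call is what the issue reports as fatal.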
[jira] [Work logged] (HIVE-25137) getAllWriteEventInfo should go through the HMS client instead of using RawStore directly
[ https://issues.apache.org/jira/browse/HIVE-25137?focusedWorklogId=622622=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622622 ] ASF GitHub Bot logged work on HIVE-25137: - Author: ASF GitHub Bot Created on: 14/Jul/21 17:36 Start Date: 14/Jul/21 17:36 Worklog Time Spent: 10m Work Description: hsnusonic commented on a change in pull request #2457: URL: https://github.com/apache/hive/pull/2457#discussion_r669819264 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -10447,4 +10447,22 @@ public void drop_package(DropPackageRequest request) throws MetaException { endFunction("drop_package", ex == null, ex); } } + + @Override + public List get_all_write_event_info(GetAllWriteEventInfoRequest request) + throws MetaException { +startFunction("get_all_write_event_info"); +Exception ex = null; +try { + List writeEventInfoList = + getMS().getAllWriteEventInfo(request.getTxnId(), request.getDbName(), request.getTableName()); + return writeEventInfoList == null ? Collections.emptyList() : writeEventInfoList; +} catch (Exception e) { + LOG.error("Caught exception", e); + ex = e; Review comment: `ex` is passed into `endFunction`. I guess the most obvious advantage is to get the function audited so that we can get some metrics (time, counter). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622622) Time Spent: 0.5h (was: 20m) > getAllWriteEventInfo should go through the HMS client instead of using > RawStore directly > > > Key: HIVE-25137 > URL: https://issues.apache.org/jira/browse/HIVE-25137 > Project: Hive > Issue Type: Improvement >Reporter: Pratyush Madhukar >Assignee: Yu-Wen Lai >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {code:java} > private List getAllWriteEventInfo(Context withinContext) > throws Exception { > String contextDbName = > StringUtils.normalizeIdentifier(withinContext.replScope.getDbName()); > RawStore rawStore = > HiveMetaStore.HMSHandler.getMSForConf(withinContext.hiveConf); > List writeEventInfoList > = rawStore.getAllWriteEventInfo(eventMessage.getTxnId(), > contextDbName, null); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
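The review comment above explains that `ex` is recorded so that `endFunction` can audit the call. A reduced sketch of that pattern follows; it is not the HMSHandler code. `AuditedCallSketch`, its counters, the `Supplier` standing in for the store call, and the `String` element type are assumptions made so the example is self-contained.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical, reduced version of the start/end auditing pattern discussed in the review.
final class AuditedCallSketch {
    int started;
    int succeeded;
    int failed;

    void startFunction(String name) {
        started++;
    }

    // Receives the outcome so that success and failure can be counted (or timed) per function.
    void endFunction(String name, boolean success, Exception ex) {
        if (success) {
            succeeded++;
        } else {
            failed++;
        }
    }

    List<String> getAllWriteEventInfo(Supplier<List<String>> store) {
        startFunction("get_all_write_event_info");
        Exception ex = null;
        try {
            List<String> result = store.get();
            // Normalize a null result to an empty list for callers.
            return result == null ? Collections.emptyList() : result;
        } catch (RuntimeException e) {
            ex = e; // remembered only so the finally block can report the failure
            throw e;
        } finally {
            endFunction("get_all_write_event_info", ex == null, ex);
        }
    }

    public static void main(String[] args) {
        AuditedCallSketch handler = new AuditedCallSketch();
        System.out.println(handler.getAllWriteEventInfo(() -> null));
        System.out.println("succeeded=" + handler.succeeded + ", failed=" + handler.failed);
    }
}
```

Because `endFunction` runs in `finally`, every call is counted exactly once whether it returns or throws, which is what makes the captured `ex` useful for metrics.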
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-25307:
Attachment: (was: hive-thrift-fix-02-3_1.patch)

> Hive Server 2 crashes when Thrift library encounters particular security protocol issue
>
> Key: HIVE-25307
> URL: https://issues.apache.org/jira/browse/HIVE-25307
> Project: Hive
> Issue Type: Bug
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Labels: pull-request-available
> Attachments: hive-thrift-fix2-01-3_1.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A RuntimeException is thrown by the Thrift library that causes Hive Server 2
> to crash on our customer's machine. If you Google this, the exception has been
> reported a couple of times over the years but not fixed. A blog (see
> references below) says it is an occasional security protocol issue between
> Hive Server 2 and a proxy like a Gateway.
> One challenge in the older 0.9.3 Thrift version was that the Thrift
> TTransportFactory getTransport method declaration had no throws clause.
> Hence the likely choice of RuntimeException. But that Exception is fatal to
> Hive Server 2.
> The proposed fix is a workaround: we catch RuntimeException in the inner
> class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in
> Hive Server 2, and throw the RuntimeException's (inner) cause (e.g.
> TSaslTransportException) as a TTransportException.
> Once the Thrift library stops throwing RuntimeException or we catch fatal
> Throwable exceptions in the Thrift library's TThreadPoolServer's inner class
> WorkerProcess run method and display them, the RuntimeException try/catch
> clause can be removed.
>
> ExceptionClassName:
> java.lang.RuntimeException
> ExceptionStackTrace:
> java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326)
> at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> ... 10 more
>
> References:
> [Hive server 2 thrift error - Cloudera Community - 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293]
> Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 LOG"
> HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA (apache.org)
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
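Editorial sketch for the HIVE-25307 description above: the proposed workaround (catch the RuntimeException, rethrow its cause as a transport-level checked exception) can be shown without the Thrift library. All types below are hypothetical stand-ins: `TransportException` stands in for TTransportException, `TransportFactory` for a factory whose method declares no checked exceptions, and a `String` stands in for the transport object. This is not the attached patch.

```java
// Hypothetical stand-ins for the Thrift types named in the issue; the real
// classes live in org.apache.thrift.transport and are not used here.
class TransportException extends Exception {
  TransportException(String message, Throwable cause) {
    super(message, cause);
  }
}

// Like the factory method described in the issue, this declares no checked
// exceptions, so a failure can only escape as a RuntimeException.
interface TransportFactory {
  String getTransport(String underlying);
}

final class UnwrappingTransportFactory {
  private final TransportFactory delegate;

  UnwrappingTransportFactory(TransportFactory delegate) {
    this.delegate = delegate;
  }

  String getTransport(String underlying) throws TransportException {
    try {
      return delegate.getTransport(underlying);
    } catch (RuntimeException e) {
      // Rethrow the wrapped cause (or the exception itself when there is
      // no cause) as a checked exception the caller can handle per connection.
      Throwable cause = e.getCause() != null ? e.getCause() : e;
      throw new TransportException(cause.getMessage(), cause);
    }
  }
}
```

The design intent, as far as the description states it, is that one bad connection produces a handled exception rather than an uncaught RuntimeException in the worker thread.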
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-25307:
Attachment: hive-thrift-fix-02-3_1.patch

> Hive Server 2 crashes when Thrift library encounters particular security protocol issue
>
> Key: HIVE-25307
> URL: https://issues.apache.org/jira/browse/HIVE-25307
> Project: Hive
> Issue Type: Bug
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Labels: pull-request-available
> Attachments: hive-thrift-fix-02-3_1.patch, hive-thrift-fix2-01-3_1.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A RuntimeException is thrown by the Thrift library that causes Hive Server 2
> to crash on our customer's machine. If you Google this, the exception has been
> reported a couple of times over the years but not fixed. A blog (see
> references below) says it is an occasional security protocol issue between
> Hive Server 2 and a proxy like a Gateway.
> One challenge in the older 0.9.3 Thrift version was that the Thrift
> TTransportFactory getTransport method declaration had no throws clause.
> Hence the likely choice of RuntimeException. But that Exception is fatal to
> Hive Server 2.
> The proposed fix is a workaround: we catch RuntimeException in the inner
> class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in
> Hive Server 2, and throw the RuntimeException's (inner) cause (e.g.
> TSaslTransportException) as a TTransportException.
> Once the Thrift library stops throwing RuntimeException or we catch fatal
> Throwable exceptions in the Thrift library's TThreadPoolServer's inner class
> WorkerProcess run method and display them, the RuntimeException try/catch
> clause can be removed.
>
> ExceptionClassName:
> java.lang.RuntimeException
> ExceptionStackTrace:
> java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
> at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326)
> at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> ... 10 more
>
> References:
> [Hive server 2 thrift error - Cloudera Community - 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293]
> Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 LOG"
> HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA (apache.org)
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622543=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622543 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 15:12 Start Date: 14/Jul/21 15:12 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669708908 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,83 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + 
renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user didn't change anything about the column + // i.e. no changes to the name, type, comment or order + LOG.info("Found no difference between new and old schema for ALTER TABLE CHANGE COLUMN for" + + " table: {}. There will be no Iceberg commit.", hmsTable.getTableName()); + return; +} + +// case 1: column name has been renamed +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + FieldSchema updatedField = schemaDifference.getMissingFromSecond().get(0); + FieldSchema oldField = schemaDifference.getMissingFromFirst().get(0); + updateSchema.renameColumn(oldField.getName(), updatedField.getName()); + + // check if type/comment changed too + if (!Objects.equals(oldField.getType(), updatedField.getType())) { +updateSchema.updateColumn(oldField.getName(), getPrimitiveTypeOrThrow(updatedField), updatedField.getComment()); + } else if (!Objects.equals(oldField.getComment(), updatedField.getComment())) { +updateSchema.updateColumnDoc(oldField.getName(), updatedField.getComment()); + } + +// case 2: only column type and/or comment changed +} else if (!schemaDifference.getTypeChanged().isEmpty()) { + FieldSchema updatedField = schemaDifference.getTypeChanged().get(0); + updateSchema.updateColumn(updatedField.getName(), getPrimitiveTypeOrThrow(updatedField), + updatedField.getComment()); + +// case 3: only comment changed +} else if (!schemaDifference.getCommentChanged().isEmpty()) { + FieldSchema updatedField = schemaDifference.getCommentChanged().get(0); + updateSchema.updateColumnDoc(updatedField.getName(), updatedField.getComment()); +} Review comment: Yes, it should. 
In that case, we'd have an entry in both the `commentChanged` and the `typeChanged` lists in the `schemaDifference`. There's a unit test covering this called `testAlterTableChangeColumnTypeAndComment`
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622542=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622542 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 15:11 Start Date: 14/Jul/21 15:11 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669708042 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java ## @@ -86,6 +88,10 @@ private static final Splitter TABLE_NAME_SPLITTER = Splitter.on(".."); private static final String TABLE_NAME_SEPARATOR = ".."; + private static final List ALLOWED_ALTER_OPS = ImmutableList.of( + AlterTableType.ADDPROPS, AlterTableType.DROPPROPS, AlterTableType.ADDCOLS, + AlterTableType.REPLACE_COLUMNS, AlterTableType.RENAME_COLUMN, AlterTableType.SETPARTITIONSPEC); + Review comment: Yes, good idea! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622542) Time Spent: 1h 50m (was: 1h 40m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622541=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622541 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 15:08 Start Date: 14/Jul/21 15:08 Worklog Time Spent: 10m Work Description: szlta commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669696347 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java ## @@ -86,6 +88,10 @@ private static final Splitter TABLE_NAME_SPLITTER = Splitter.on(".."); private static final String TABLE_NAME_SEPARATOR = ".."; + private static final List ALLOWED_ALTER_OPS = ImmutableList.of( + AlterTableType.ADDPROPS, AlterTableType.DROPPROPS, AlterTableType.ADDCOLS, + AlterTableType.REPLACE_COLUMNS, AlterTableType.RENAME_COLUMN, AlterTableType.SETPARTITIONSPEC); + Review comment: Shouldn't we rely on EnumSet SUPPORTED_ALTER_OPS found in meta hook class here too? 
## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,83 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user didn't change anything about the column + 
// i.e. no changes to the name, type, comment or order + LOG.info("Found no difference between new and old schema for ALTER TABLE CHANGE COLUMN for" + + " table: {}. There will be no Iceberg commit.", hmsTable.getTableName()); + return; +} + +// case 1: column name has been renamed +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + FieldSchema updatedField = schemaDifference.getMissingFromSecond().get(0); + FieldSchema oldField = schemaDifference.getMissingFromFirst().get(0); + updateSchema.renameColumn(oldField.getName(), updatedField.getName()); + + // check if type/comment changed too + if (!Objects.equals(oldField.getType(), updatedField.getType())) { +updateSchema.updateColumn(oldField.getName(), getPrimitiveTypeOrThrow(updatedField), updatedField.getComment()); + } else if (!Objects.equals(oldField.getComment(), updatedField.getComment())) { +updateSchema.updateColumnDoc(oldField.getName(), updatedField.getComment()); + } + +// case 2: only column type and/or comment changed +} else if (!schemaDifference.getTypeChanged().isEmpty()) { + FieldSchema updatedField = schemaDifference.getTypeChanged().get(0); + updateSchema.updateColumn(updatedField.getName(), getPrimitiveTypeOrThrow(updatedField), + updatedField.getComment()); + +// case 3: only comment changed +} else if (!schemaDifference.getCommentChanged().isEmpty()) { + FieldSchema updatedField = schemaDifference.getCommentChanged().get(0); +
[jira] [Work logged] (HIVE-25288) Fix TestMmCompactorOnTez
[ https://issues.apache.org/jira/browse/HIVE-25288?focusedWorklogId=622539=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622539 ] ASF GitHub Bot logged work on HIVE-25288: - Author: ASF GitHub Bot Created on: 14/Jul/21 15:07 Start Date: 14/Jul/21 15:07 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request #2476: URL: https://github.com/apache/hive/pull/2476 ### What changes were proposed in this pull request? ### Why are the changes needed? Fixed a bug caused by open txn commit timeout ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? http://ci.hive.apache.org/job/hive-flaky-check/321 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622539) Remaining Estimate: 0h Time Spent: 10m > Fix TestMmCompactorOnTez > > > Key: HIVE-25288 > URL: https://issues.apache.org/jira/browse/HIVE-25288 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-flaky-check/240/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25288) Fix TestMmCompactorOnTez
[ https://issues.apache.org/jira/browse/HIVE-25288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25288: -- Labels: pull-request-available (was: ) > Fix TestMmCompactorOnTez > > > Key: HIVE-25288 > URL: https://issues.apache.org/jira/browse/HIVE-25288 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-flaky-check/240/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622502=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622502 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 13:28 Start Date: 14/Jul/21 13:28 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669615214 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,82 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + 
renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user restated the exactly the existing column in the CHANGE COLUMN command Review comment: Updated the comment, let me know if this clarifies it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622502) Time Spent: 1.5h (was: 1h 20m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?focusedWorklogId=622501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622501 ] ASF GitHub Bot logged work on HIVE-25249: - Author: ASF GitHub Bot Created on: 14/Jul/21 13:25 Start Date: 14/Jul/21 13:25 Worklog Time Spent: 10m Work Description: klcopp commented on pull request #2474: URL: https://github.com/apache/hive/pull/2474#issuecomment-879889772 I remember when I was refactoring query-based compaction, I left out the drop temp table operations. I only realized my mistake because the visibility ids were hard-coded in the tests. I think if we want these tests to run in parallel we should reset the DB before each test (instead of before each test class), no? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622501) Time Spent: 50m (was: 40m) > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25328) Limit scope of REPLACE COLUMNS for Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25328: -- Labels: pull-request-available (was: ) > Limit scope of REPLACE COLUMNS for Iceberg tables > - > > Key: HIVE-25328 > URL: https://issues.apache.org/jira/browse/HIVE-25328 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Replace columns is a rather wildcard operation which can do heavy-weight > schema changes. We would only want to allow this operation for dropping > columns for Iceberg tables. For other changes (adding cols, renaming, type > promotion etc.), we should use the CHANGE COLUMN command. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25328) Limit scope of REPLACE COLUMNS for Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25328?focusedWorklogId=622496=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622496 ]

ASF GitHub Bot logged work on HIVE-25328:
-

Author: ASF GitHub Bot
Created on: 14/Jul/21 13:12
Start Date: 14/Jul/21 13:12
Worklog Time Spent: 10m
Work Description: marton-bod opened a new pull request #2475:
URL: https://github.com/apache/hive/pull/2475

### What changes were proposed in this pull request?
Use `REPLACE COLUMNS` only for dropping one or more columns. If there's any column reorder, column rename, adding columns, changing column type or changing column comment, we throw an exception.

### Why are the changes needed?
Limit the damage the user can do with this command :)

### Does this PR introduce _any_ user-facing change?
Yes, limitation of the REPLACE command (but only for Iceberg tables; Hive tables are unaffected).

### How was this patch tested?
unit tests

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 622496)
Remaining Estimate: 0h
Time Spent: 10m

> Limit scope of REPLACE COLUMNS for Iceberg tables
>
> Key: HIVE-25328
> URL: https://issues.apache.org/jira/browse/HIVE-25328
> Project: Hive
> Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Replace columns is a rather wildcard operation which can do heavy-weight
> schema changes. We would only want to allow this operation for dropping
> columns for Iceberg tables. For other changes (adding cols, renaming, type
> promotion etc.), we should use the CHANGE COLUMN command.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
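Editorial sketch for the HIVE-25328 pull request above: one way to decide whether a REPLACE COLUMNS request only drops columns is to check that the new column list is an in-order subsequence of the old one, with name, type and comment unchanged. This is an illustration under assumptions, not the code in the pull request. `Col` and `ReplaceColumnsCheck` are hypothetical names (Hive's FieldSchema is not used), and treating an unchanged list as "not a drop" is an assumption.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical column model used only for this sketch.
final class Col {
  final String name;
  final String type;
  final String comment;

  Col(String name, String type, String comment) {
    this.name = name;
    this.type = type;
    this.comment = comment;
  }

  boolean sameAs(Col other) {
    return name.equals(other.name)
        && type.equals(other.type)
        && Objects.equals(comment, other.comment);
  }
}

final class ReplaceColumnsCheck {
  // True only if newCols can be obtained from oldCols purely by dropping
  // columns: no additions, renames, reorders, type or comment changes.
  static boolean isDropOnly(List<Col> oldCols, List<Col> newCols) {
    int i = 0;
    for (Col wanted : newCols) {
      // Advance through the old schema until the wanted column is found.
      while (i < oldCols.size() && !oldCols.get(i).sameAs(wanted)) {
        i++;
      }
      if (i == oldCols.size()) {
        return false;   // added, renamed, retyped, recommented or reordered
      }
      i++;
    }
    // Assumption: an unchanged column list does not count as a drop.
    return newCols.size() < oldCols.size();
  }

  private ReplaceColumnsCheck() {}
}
```

A rename shows up here as a column that no longer matches anything in the old schema, so it is rejected the same way an added column would be.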
[jira] [Assigned] (HIVE-25328) Limit scope of REPLACE COLUMNS for Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Bod reassigned HIVE-25328: - > Limit scope of REPLACE COLUMNS for Iceberg tables > - > > Key: HIVE-25328 > URL: https://issues.apache.org/jira/browse/HIVE-25328 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > > Replace columns is a rather wildcard operation which can do heavy-weight > schema changes. We would only want to allow this operation for dropping > columns for Iceberg tables. For other changes (adding cols, renaming, type > promotion etc.), we should use the CHANGE COLUMN command. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?focusedWorklogId=622492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622492 ] ASF GitHub Bot logged work on HIVE-25249: - Author: ASF GitHub Bot Created on: 14/Jul/21 13:04 Start Date: 14/Jul/21 13:04 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #2474: URL: https://github.com/apache/hive/pull/2474#issuecomment-879872969 > So we expect the visibility id to always be different? No, but it could be if tests are executed in parallel. Also, we shouldn't hardcode it anyways but rather extract from the compaction queue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 622492) Time Spent: 40m (was: 0.5h) > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622485 ]

ASF GitHub Bot logged work on HIVE-25256:
-

Author: ASF GitHub Bot
Created on: 14/Jul/21 12:48
Start Date: 14/Jul/21 12:48
Worklog Time Spent: 10m
Work Description: marton-bod commented on a change in pull request #2463:
URL: https://github.com/apache/hive/pull/2463#discussion_r669567104

## File path: hbase-handler/src/test/results/negative/hbase_ddl.q.out ##
@@ -26,4 +26,4 @@ key int It is a column key
 value string It is the column string value
 A masked pattern was here
-FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS, REPLACE_COLUMNS, SETPARTITIONSPEC] to a non-native table hbase_table_1
+FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1

Review comment:
HBase would get this SemanticException:
```
ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1
```
HBase (and all other storage handlers except for Iceberg at the moment) should get this exception for all alter commands other than SET/UNSET TBLPROPERTIES and ADD COLUMNS.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 622485)
Time Spent: 1h 20m (was: 1h 10m)

> Support ALTER TABLE CHANGE COLUMN for Iceberg
>
> Key: HIVE-25256
> URL: https://issues.apache.org/jira/browse/HIVE-25256
> Project: Hive
> Issue Type: New Feature
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> In order to provide support for renaming/changing the data type of a single
> column, we should add alter table change column support for Iceberg tables.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622482=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622482 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 12:30 Start Date: 14/Jul/21 12:30 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669569756 ## File path: hbase-handler/src/test/results/negative/hbase_ddl.q.out ## @@ -26,4 +26,4 @@ key int It is a column key value string It is the column string value A masked pattern was here -FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS, REPLACE_COLUMNS, SETPARTITIONSPEC] to a non-native table hbase_table_1 +FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1 Review comment: Previously, when we were working on adding new alter commands for Iceberg, we kept adding these new operation types (rename columns, etc.) to the allowed list. However, there was only one global allowed list for all storage handler types. Now, the allowed list has been moved into the storage handler, so I've reverted the global list to its original form (before all our Iceberg changes started flowing in)
Issue Time Tracking --- Worklog Id: (was: 622482) Time Spent: 1h 10m (was: 1h) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
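The review comment above describes moving the list of permitted ALTER TABLE operations from one global list into each storage handler. A minimal, self-contained sketch of that idea follows. All names here (`AlterOp`, `StorageHandler`, `isAllowed`, `HBaseLikeHandler`, `IcebergLikeHandler`, `validate`) are hypothetical stand-ins chosen for illustration and are not taken from the Hive code base; the sketch only assumes that a default list applies to every handler and that a single handler may widen it.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

public class AlterAllowListSketch {
  // Hypothetical operation types, named after the entries in the error message above.
  enum AlterOp { ADDPROPS, DROPPROPS, ADDCOLS, REPLACE_COLUMNS, RENAME_COLUMN, SETPARTITIONSPEC }

  // The "original" global list that every non-native table gets by default.
  static final Set<AlterOp> DEFAULT_ALLOWED =
      Collections.unmodifiableSet(EnumSet.of(AlterOp.ADDPROPS, AlterOp.DROPPROPS, AlterOp.ADDCOLS));

  // Hypothetical handler interface: the default method keeps the global behaviour.
  interface StorageHandler {
    default boolean isAllowed(AlterOp op) {
      return DEFAULT_ALLOWED.contains(op);
    }
  }

  // A handler that does not override anything only gets the default list.
  static class HBaseLikeHandler implements StorageHandler { }

  // A handler that widens the list for itself without touching the global one.
  static class IcebergLikeHandler implements StorageHandler {
    private static final Set<AlterOp> EXTRA =
        EnumSet.of(AlterOp.REPLACE_COLUMNS, AlterOp.RENAME_COLUMN, AlterOp.SETPARTITIONSPEC);

    @Override
    public boolean isAllowed(AlterOp op) {
      return StorageHandler.super.isAllowed(op) || EXTRA.contains(op);
    }
  }

  // Rejects an operation the handler does not permit (exception type is illustrative).
  static void validate(String table, StorageHandler handler, AlterOp op) {
    if (!handler.isAllowed(op)) {
      throw new IllegalStateException("ALTER TABLE can only be used for " + DEFAULT_ALLOWED
          + " to a non-native table " + table);
    }
  }

  public static void main(String[] args) {
    validate("iceberg_table_1", new IcebergLikeHandler(), AlterOp.REPLACE_COLUMNS); // passes
    try {
      validate("hbase_table_1", new HBaseLikeHandler(), AlterOp.REPLACE_COLUMNS);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With this shape, adding a new Iceberg-only operation changes one handler class, and the expected output of negative tests for other handlers (such as the `hbase_ddl.q.out` file in the diff) stays stable.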
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622479=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622479 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 12:26 Start Date: 14/Jul/21 12:26 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669567104 ## File path: hbase-handler/src/test/results/negative/hbase_ddl.q.out ## @@ -26,4 +26,4 @@ key int It is a column key value string It is the column string value A masked pattern was here -FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS, REPLACE_COLUMNS, SETPARTITIONSPEC] to a non-native table hbase_table_1 +FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1 Review comment: They get this SemanticException: ``` ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1 ``` HBase (and all other storage handlers except for Iceberg at the moment), should get this exception for alter commands other than SET/UNSET TBLPROPERTIES and ADD COLUMNS.
Issue Time Tracking --- Worklog Id: (was: 622479) Time Spent: 1h (was: 50m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-21237) [JDK 11] SessionState can't be initialized due to classloader problem
[ https://issues.apache.org/jira/browse/HIVE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380516#comment-17380516 ] Sumit Kumar edited comment on HIVE-21237 at 7/14/21, 11:44 AM: --- apache-hive-2.3.9 works with java 11 was (Author: sharan12): apache2.3.9 works with java 11 > [JDK 11] SessionState can't be initialized due to classloader problem > - > > Key: HIVE-21237 > URL: https://issues.apache.org/jira/browse/HIVE-21237 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.1.1 > Environment: JDK11, Hadoop-3, Hive 3.1.1 >Reporter: Uma Maheswara Rao G >Priority: Major > > When I start Hive with JDK11 > {{2019-02-08 22:29:51,500 INFO SessionState: Hive Session ID = > cecd9c34-d61a-44d0-9e52-a0a7d6413e49 > Exception in thread "main" java.lang.ClassCastException: class > jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class > java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and > java.net.URLClassLoader are in module java.base of loader 'bootstrap') > at > org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:410) > at > org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:386) > at > org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at org.apache.hadoop.util.RunJar.run(RunJar.java:323) > at org.apache.hadoop.util.RunJar.main(RunJar.java:236)}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
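The `ClassCastException` in the trace above is what an unconditional cast of the application class loader to `java.net.URLClassLoader` produces on JDK 9 and later, where that loader is `jdk.internal.loader.ClassLoaders$AppClassLoader` and no longer a `URLClassLoader`. The sketch below reproduces the check in isolation and shows one defensive alternative. The class and method names (`SystemClassLoaderCheck`, `safeUrls`, `classPathEntries`) are hypothetical, and this is not a description of how any Hive release actually resolved the issue.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class SystemClassLoaderCheck {
  // Returns the loader's URLs when it is a URLClassLoader, otherwise an empty array.
  // An unconditional cast here is the pattern that fails on JDK 9+.
  static URL[] safeUrls(ClassLoader loader) {
    if (loader instanceof URLClassLoader) {
      return ((URLClassLoader) loader).getURLs();
    }
    return new URL[0];
  }

  // Loader-independent way to list class path entries: read the system property.
  static String[] classPathEntries() {
    String classPath = System.getProperty("java.class.path", "");
    return classPath.isEmpty() ? new String[0] : classPath.split(File.pathSeparator);
  }

  public static void main(String[] args) {
    ClassLoader system = ClassLoader.getSystemClassLoader();
    System.out.println("System class loader: " + system.getClass().getName());
    System.out.println("Is a URLClassLoader: " + (system instanceof URLClassLoader));
    System.out.println("URLs via guarded cast: " + safeUrls(system).length);
    System.out.println("Entries via java.class.path: " + classPathEntries().length);
  }
}
```

Running `main` on JDK 8 and on JDK 11 should show the difference in the loader's class name, which is the behaviour change behind the reported failure.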
[jira] [Commented] (HIVE-21237) [JDK 11] SessionState can't be initialized due to classloader problem
[ https://issues.apache.org/jira/browse/HIVE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380516#comment-17380516 ] Sumit Kumar commented on HIVE-21237: apache2.3.9 works with java 11 > [JDK 11] SessionState can't be initialized due to classloader problem > - > > Key: HIVE-21237 > URL: https://issues.apache.org/jira/browse/HIVE-21237 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.1.1 > Environment: JDK11, Hadoop-3, Hive 3.1.1 >Reporter: Uma Maheswara Rao G >Priority: Major > > When I start Hive with JDK11 > {{2019-02-08 22:29:51,500 INFO SessionState: Hive Session ID = > cecd9c34-d61a-44d0-9e52-a0a7d6413e49 > Exception in thread "main" java.lang.ClassCastException: class > jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class > java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and > java.net.URLClassLoader are in module java.base of loader 'bootstrap') > at > org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:410) > at > org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:386) > at > org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at org.apache.hadoop.util.RunJar.run(RunJar.java:323) > at org.apache.hadoop.util.RunJar.main(RunJar.java:236)}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622453=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622453 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 11:11 Start Date: 14/Jul/21 11:11 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669514565 ## File path: hbase-handler/src/test/results/negative/hbase_ddl.q.out ## @@ -26,4 +26,4 @@ key int It is a column key value string It is the column string value A masked pattern was here -FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS, REPLACE_COLUMNS, SETPARTITIONSPEC] to a non-native table hbase_table_1 +FAILED: SemanticException [Error 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a non-native table hbase_table_1 Review comment: What happens with HBase tables if we try replacing columns and setting partition spec? Issue Time Tracking --- Worklog Id: (was: 622453) Time Spent: 50m (was: 40m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622434=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622434 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 09:58 Start Date: 14/Jul/21 09:58 Worklog Time Spent: 10m Work Description: marton-bod commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669468024 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,82 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + 
renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user restated the exactly the existing column in the CHANGE COLUMN command Review comment: If the comment is not clear to you, it needs to be fixed :) Will do it Issue Time Tracking --- Worklog Id: (was: 622434) Time Spent: 40m (was: 0.5h) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
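The quoted diff builds a one-entry rename mapping and then asks `HiveSchemaUtil.getFirstOutOfOrderColumn` whether any column changed position. The implementation of that method is not shown in this thread, so the following is only a rough sketch of how such a check could work on plain column names. `ColumnOrderSketch`, `firstOutOfOrder`, and the assumption that the mapping goes from the new (HMS) name to the old (Iceberg) name are all hypothetical, and the simplified return type does not match the real method.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ColumnOrderSketch {
  // Returns the first HMS column whose position does not match the Iceberg schema,
  // after translating renamed columns back to their old names via renameMapping.
  static Optional<String> firstOutOfOrder(List<String> hmsCols, List<String> icebergCols,
                                          Map<String, String> renameMapping) {
    int n = Math.min(hmsCols.size(), icebergCols.size());
    for (int i = 0; i < n; i++) {
      String hmsName = hmsCols.get(i);
      // A renamed column is compared under its old name, so a pure rename is not a reorder.
      String expected = renameMapping.getOrDefault(hmsName, hmsName);
      if (!expected.equals(icebergCols.get(i))) {
        return Optional.of(hmsName);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<String> iceberg = Arrays.asList("id", "name", "age");

    // Pure rename: no column moved once the mapping is applied.
    System.out.println(firstOutOfOrder(Arrays.asList("id", "full_name", "age"), iceberg,
        Collections.singletonMap("full_name", "name")));

    // Reorder without any rename: the first mismatching position is reported.
    System.out.println(firstOutOfOrder(Arrays.asList("id", "age", "name"), iceberg,
        Collections.emptyMap()));
  }
}
```

This also illustrates the code comment under discussion: when the schema difference is empty and no column is out of order, the CHANGE COLUMN command restated the existing column and there is nothing to update.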
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622432=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622432 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 09:47 Start Date: 14/Jul/21 09:47 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669460802 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,82 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + 
renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user restated the exactly the existing column in the CHANGE COLUMN command Review comment: Please fix the comment, I do not get it Issue Time Tracking --- Worklog Id: (was: 622432) Time Spent: 20m (was: 10m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg
[ https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=622433=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622433 ] ASF GitHub Bot logged work on HIVE-25256: - Author: ASF GitHub Bot Created on: 14/Jul/21 09:47 Start Date: 14/Jul/21 09:47 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2463: URL: https://github.com/apache/hive/pull/2463#discussion_r669460952 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java ## @@ -505,19 +512,82 @@ private void handleReplaceColumns(org.apache.hadoop.hive.metastore.api.Table hms } for (FieldSchema updatedCol : schemaDifference.getTypeChanged()) { - Type newType = HiveSchemaUtil.convert(TypeInfoUtils.getTypeInfoFromTypeString(updatedCol.getType())); - if (!(newType instanceof Type.PrimitiveType)) { -throw new MetaException(String.format("Cannot promote type of column: '%s' to a non-primitive type: %s.", -updatedCol.getName(), newType)); - } - updateSchema.updateColumn(updatedCol.getName(), (Type.PrimitiveType) newType, updatedCol.getComment()); + updateSchema.updateColumn(updatedCol.getName(), getPrimitiveTypeOrThrow(updatedCol), updatedCol.getComment()); } for (FieldSchema updatedCol : schemaDifference.getCommentChanged()) { updateSchema.updateColumnDoc(updatedCol.getName(), updatedCol.getComment()); } } + private void handleChangeColumn(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws MetaException { +List hmsCols = hmsTable.getSd().getCols(); +List icebergCols = HiveSchemaUtil.convert(icebergTable.schema()); +// compute schema difference for renames, type/comment changes +HiveSchemaUtil.SchemaDifference schemaDifference = HiveSchemaUtil.getSchemaDiff(hmsCols, icebergCols, true); +// check column reorder (which could happen even in the absence of any rename, type or comment change) +Map renameMapping = ImmutableMap.of(); +if (!schemaDifference.getMissingFromSecond().isEmpty()) { + 
renameMapping = ImmutableMap.of( + schemaDifference.getMissingFromSecond().get(0).getName(), + schemaDifference.getMissingFromFirst().get(0).getName()); +} +Pair> outOfOrder = HiveSchemaUtil.getFirstOutOfOrderColumn(hmsCols, icebergCols, +renameMapping); + +if (!schemaDifference.isEmpty() || outOfOrder != null) { + updateSchema = icebergTable.updateSchema(); +} else { + // we should get here if the user restated the exactly the existing column in the CHANGE COLUMN command Review comment: Or fix me Issue Time Tracking --- Worklog Id: (was: 622433) Time Spent: 0.5h (was: 20m) > Support ALTER TABLE CHANGE COLUMN for Iceberg > - > > Key: HIVE-25256 > URL: https://issues.apache.org/jira/browse/HIVE-25256 > Project: Hive > Issue Type: New Feature >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In order to provide support for renaming/changing the data type of a single > column, we should add alter table change column support for Iceberg tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?focusedWorklogId=622397=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-622397 ] ASF GitHub Bot logged work on HIVE-25249: - Author: ASF GitHub Bot Created on: 14/Jul/21 07:25 Start Date: 14/Jul/21 07:25 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #2474: URL: https://github.com/apache/hive/pull/2474#issuecomment-879661251 recheck Issue Time Tracking --- Worklog Id: (was: 622397) Time Spent: 0.5h (was: 20m) > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)