[jira] [Resolved] (HIVE-24426) Spark job fails with fixed LlapTaskUmbilicalServer port
[ https://issues.apache.org/jira/browse/HIVE-24426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran resolved HIVE-24426.
------------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Merged the PR. Thanks [~ayushtkn] for the contribution!

> Spark job fails with fixed LlapTaskUmbilicalServer port
> -------------------------------------------------------
>
>                 Key: HIVE-24426
>                 URL: https://issues.apache.org/jira/browse/HIVE-24426
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> In cloud deployments, multiple executors are launched on the same node, and in case a fixed umbilical port is specified using
> {{spark.hadoop.hive.llap.daemon.umbilical.port=30006}}
> the job fails with a BindException:
> {noformat}
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:30006] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:840)
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:741)
> 	at org.apache.hadoop.ipc.Server.bind(Server.java:605)
> 	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1169)
> 	at org.apache.hadoop.ipc.Server.<init>(Server.java:3032)
> 	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039)
> 	at org.apache.hadoop.ipc.WritableRpcEngine$Server.<init>(WritableRpcEngine.java:438)
> 	at org.apache.hadoop.ipc.WritableRpcEngine.getServer(WritableRpcEngine.java:332)
> 	at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
> 	at org.apache.hadoop.hive.llap.tezplugins.helpers.LlapTaskUmbilicalServer.<init>(LlapTaskUmbilicalServer.java:67)
> 	at org.apache.hadoop.hive.llap.ext.LlapTaskUmbilicalExternalClient$SharedUmbilicalServer.<init>(LlapTaskUmbilicalExternalClient.java:122)
> 	... 26 more
> Caused by: java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:433)
> 	at sun.nio.ch.Net.bind(Net.java:425)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
> 	at org.apache.hadoop.ipc.Server.bind(Server.java:588)
> 	... 34 more
> {noformat}
> To counter this, it is better to allow a range of ports.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
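The "range of ports" workaround the ticket asks for can be sketched as follows. This is an illustration only, not the actual Hive patch: the `PortRangeBind` class, the `bindInRange` helper, and the 30006-30010 range are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;

/**
 * Illustrative sketch (not the HIVE-24426 patch): instead of binding a single
 * fixed umbilical port, walk a configured port range and bind to the first
 * free port, so several executors on one host do not collide.
 */
public class PortRangeBind {

  /** Tries each port in [lo, hi]; returns a bound socket or throws. */
  static ServerSocket bindInRange(int lo, int hi) throws IOException {
    IOException last = null;
    for (int port = lo; port <= hi; port++) {
      try {
        return new ServerSocket(port);   // succeeds on the first free port
      } catch (IOException e) {
        last = e;                        // port already in use; try the next
      }
    }
    throw new IOException("No free port in range " + lo + "-" + hi, last);
  }

  public static void main(String[] args) throws IOException {
    // Two callers racing for the same range land on different ports
    // instead of one of them dying with "Address already in use".
    try (ServerSocket first = bindInRange(30006, 30010);
         ServerSocket second = bindInRange(30006, 30010)) {
      System.out.println("bound " + first.getLocalPort()
          + " and " + second.getLocalPort());
    }
  }
}
```

The loop is the whole point: when a port is taken, the caller falls through to the next candidate rather than propagating the BindException from the stack trace above.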
[jira] [Work logged] (HIVE-24426) Spark job fails with fixed LlapTaskUmbilicalServer port
[ https://issues.apache.org/jira/browse/HIVE-24426?focusedWorklogId=518314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518314 ]

ASF GitHub Bot logged work on HIVE-24426:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 07:13
            Start Date: 01/Dec/20 07:13
    Worklog Time Spent: 10m

Work Description: prasanthj merged pull request #1705:
URL: https://github.com/apache/hive/pull/1705

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 518314)
    Time Spent: 50m (was: 40m)

> Spark job fails with fixed LlapTaskUmbilicalServer port
> -------------------------------------------------------
[jira] [Work logged] (HIVE-24426) Spark job fails with fixed LlapTaskUmbilicalServer port
[ https://issues.apache.org/jira/browse/HIVE-24426?focusedWorklogId=518311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518311 ]

ASF GitHub Bot logged work on HIVE-24426:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 07:07
            Start Date: 01/Dec/20 07:07
    Worklog Time Spent: 10m

Work Description: ayushtkn commented on pull request #1705:
URL: https://github.com/apache/hive/pull/1705#issuecomment-736269768

Thanx @prasanthj for the review, I have added the success log as well. Please have a check.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518311)
    Time Spent: 40m (was: 0.5h)

> Spark job fails with fixed LlapTaskUmbilicalServer port
> -------------------------------------------------------
[jira] [Commented] (HIVE-24456) Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
[ https://issues.apache.org/jira/browse/HIVE-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241303#comment-17241303 ]

Anishek Agarwal commented on HIVE-24456:
----------------------------------------

Just wondering: why not move to these hash functions by default? Is there a significant performance overhead here?

> Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-24456
>                 URL: https://issues.apache.org/jira/browse/HIVE-24456
>             Project: Hive
>          Issue Type: Wish
>          Components: HiveServer2
>            Reporter: Sai Hemanth Gantasala
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> hive-site.xml should have the following property to indicate that FIPS mode is enabled.
> <property>
>   <name>hive.masking.algo</name>
>   <value>sha512</value>
> </property>
> If this property is present, then GenericUDFMaskHash should use SHA512 instead of SHA256 encoding for column masking.
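The behaviour the ticket describes, switching to SHA-512 when `hive.masking.algo` is set to `sha512`, can be illustrated with plain `java.security.MessageDigest`. The `MaskHashSketch` class and `maskHash` helper below are hypothetical stand-ins, not Hive's GenericUDFMaskHash; only the `hive.masking.algo` property name and values come from the ticket.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Sketch of config-driven column-mask hashing: SHA-512 when the FIPS
 * property selects it, SHA-256 otherwise. Not Hive's implementation.
 */
public class MaskHashSketch {

  static String maskHash(String value, String maskingAlgo) {
    // "sha512" in hive-site.xml signals FIPS mode per the ticket.
    String jceName = "sha512".equalsIgnoreCase(maskingAlgo) ? "SHA-512" : "SHA-256";
    try {
      byte[] digest = MessageDigest.getInstance(jceName)
          .digest(value.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) hex.append(String.format("%02x", b));
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256/SHA-512 ship with every JRE
    }
  }

  public static void main(String[] args) {
    // SHA-512 digests are 64 bytes (128 hex chars); SHA-256 are 32 (64 hex).
    System.out.println(maskHash("ssn-123-45-6789", "sha512").length()); // 128
    System.out.println(maskHash("ssn-123-45-6789", "sha256").length()); // 64
  }
}
```

Both algorithm names are JCE standard names, so the switch is a one-line configuration lookup; the digest-length difference is the visible effect of the change.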
[jira] [Commented] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches
[ https://issues.apache.org/jira/browse/HIVE-24450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241238#comment-17241238 ]

Anishek Agarwal commented on HIVE-24450:
----------------------------------------

[~belugabehr] You can't get sequence IDs in blocks; replication will not work. It has to be one at a time. cc [~thejas] / [~aasha] / [~pkumarsinha]

> DbNotificationListener Request Notification IDs in Batches
> ----------------------------------------------------------
>
>                 Key: HIVE-24450
>                 URL: https://issues.apache.org/jira/browse/HIVE-24450
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Every time a new notification event is logged into the database, the sequence number for the ID of the event is incremented by one. It is standard in database design to instead request a block of IDs on each fetch from the database. The sequence numbers are then handed out locally until the block of IDs is exhausted. This allows for fewer database round-trips and transactions, at the expense of perhaps burning a few IDs.
> Burning of IDs happens when the server is restarted in the middle of a block of sequence IDs. That is, if the HMS requests a block of 10 IDs and only three have been assigned, then after the restart the HMS will request another block of 10, burning (wasting) 7 IDs. As long as the blocks are not too large and restarts are infrequent, few IDs are lost.
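The block-allocation scheme the ticket proposes (and which the comment above argues breaks replication, since replication needs strictly consecutive IDs) can be sketched as follows. `BlockIdAllocator` is a hypothetical illustration, not DbNotificationListener's code; an `AtomicLong` stands in for the database sequence row.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch of block-based sequence-ID allocation: fetch IDs from the
 * "database" in blocks of N and hand them out locally; a restart burns
 * the unused remainder of the current block.
 */
public class BlockIdAllocator {
  private final int blockSize;
  private final AtomicLong dbSequence;   // stand-in for the DB sequence row
  private long next = 0, limit = 0;      // current local block: [next, limit)

  BlockIdAllocator(AtomicLong dbSequence, int blockSize) {
    this.dbSequence = dbSequence;
    this.blockSize = blockSize;
  }

  /** One DB round-trip per blockSize IDs instead of one per ID. */
  synchronized long nextId() {
    if (next == limit) {                 // block exhausted: fetch a new one
      next = dbSequence.getAndAdd(blockSize);
      limit = next + blockSize;
    }
    return next++;
  }

  public static void main(String[] args) {
    AtomicLong db = new AtomicLong(1);
    BlockIdAllocator a = new BlockIdAllocator(db, 10);
    for (int i = 0; i < 3; i++) a.nextId();        // hands out IDs 1, 2, 3
    // Simulated restart: a fresh allocator burns the 7 unused IDs (4..10)
    // and continues from the next block.
    BlockIdAllocator b = new BlockIdAllocator(db, 10);
    System.out.println(b.nextId());                // 11
  }
}
```

The `main` method reproduces the ticket's own example: three IDs used out of a block of ten, seven burned by the restart. It also makes the objection concrete: a consumer that expects gap-free IDs sees the jump from 3 to 11.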
[jira] [Work logged] (HIVE-24397) Add the projection specification to the table request object and add placeholders in ObjectStore.java
[ https://issues.apache.org/jira/browse/HIVE-24397?focusedWorklogId=518272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518272 ]

ASF GitHub Bot logged work on HIVE-24397:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 03:55
            Start Date: 01/Dec/20 03:55
    Worklog Time Spent: 10m

Work Description: vnhive commented on pull request #1681:
URL: https://github.com/apache/hive/pull/1681#issuecomment-736201182

@vihangk1 Addressed all your comments in this push.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518272)
    Time Spent: 40m (was: 0.5h)

> Add the projection specification to the table request object and add placeholders in ObjectStore.java
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24397
>                 URL: https://issues.apache.org/jira/browse/HIVE-24397
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>            Reporter: Narayanan Venkateswaran
>            Assignee: Narayanan Venkateswaran
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518262 ]

ASF GitHub Bot logged work on HIVE-23980:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 03:23
            Start Date: 01/Dec/20 03:23
    Worklog Time Spent: 10m

Work Description: viirya commented on pull request #1356:
URL: https://github.com/apache/hive/pull/1356#issuecomment-736192230

> Yes, my main question is whether it is safe to skip the changes on `HiveSubQueryRemoveRule` and `HiveRelDecorrelator`. It looks fine to me since we've already shaded calcite within hive/ql.

Yeah, it should be fine. Calcite uses the Guava API, so shading Guava would cause a NoSuchMethodError if we didn't include Calcite in the shaded jar of hive/ql.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518262)
    Time Spent: 5h 40m (was: 5.5h)

> Shade guava from existing Hive versions
> ---------------------------------------
>
>                 Key: HIVE-23980
>                 URL: https://issues.apache.org/jira/browse/HIVE-23980
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.7
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23980.01.branch-2.3.patch
>
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> I'm trying to upgrade the Guava version in Spark. The JIRA ticket is SPARK-32502.
> Running the tests hits an error:
> {code}
> sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
> 	at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
> 	at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
> 	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
> 	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
> {code}
> I know that hive-exec doesn't shade Guava until HIVE-22126, but that work targets 4.0.0. I'm wondering if there is a solution for current Hive versions, e.g. Hive 2.3.7? Any ideas?
> Thanks.
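Guava shading of the kind discussed in this thread is typically done with a maven-shade-plugin relocation, which rewrites Guava's bytecode references into a private package inside the published jar. The fragment below is a hedged illustration, not the HIVE-23980 patch; the `org.apache.hive.com.google.common` shaded prefix is an invented example.

```xml
<!-- Illustrative maven-shade-plugin relocation (not the actual HIVE-23980
     change): bundle Guava into the module's jar under a private package so
     downstream users (e.g. Spark) can put any Guava version on the
     classpath without IllegalAccessError / NoSuchMethodError. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hive.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, calls such as `Iterators.emptyIterator()` in the stack trace above resolve against the bundled, relocated copy rather than whatever Guava version the application supplies, which is exactly the failure mode the reporter hit.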
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518258 ]

ASF GitHub Bot logged work on HIVE-23980:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 03:13
            Start Date: 01/Dec/20 03:13
    Worklog Time Spent: 10m

Work Description: sunchao commented on pull request #1356:
URL: https://github.com/apache/hive/pull/1356#issuecomment-736189137

Yes, my main question is whether it is safe to skip the changes on `HiveSubQueryRemoveRule` and `HiveRelDecorrelator`. It looks fine to me since we've already shaded calcite within hive/ql.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518258)
    Time Spent: 5.5h (was: 5h 20m)

> Shade guava from existing Hive versions
> ---------------------------------------
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518257 ]

ASF GitHub Bot logged work on HIVE-23980:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 03:11
            Start Date: 01/Dec/20 03:11
    Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #1356:
URL: https://github.com/apache/hive/pull/1356#discussion_r533044970

##########
File path: itests/hive-blobstore/pom.xml
##########
@@ -55,33 +55,33 @@
     <dependency>
       <groupId>org.apache.hive</groupId>
-      <artifactId>hive-metastore</artifactId>
+      <artifactId>hive-exec</artifactId>
       <version>${project.version}</version>
       <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.hive</groupId>
       <artifactId>hive-metastore</artifactId>
       <version>${project.version}</version>
-      <classifier>tests</classifier>
       <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.hive</groupId>
-      <artifactId>hive-it-unit</artifactId>
+      <artifactId>hive-metastore</artifactId>

Review comment: Oh cool. Thanks.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518257)
    Time Spent: 5h 20m (was: 5h 10m)

> Shade guava from existing Hive versions
> ---------------------------------------
[jira] [Assigned] (HIVE-24458) Allow access to SArgs without converting to disjunctive normal form
[ https://issues.apache.org/jira/browse/HIVE-24458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned HIVE-24458:
------------------------------------

> Allow access to SArgs without converting to disjunctive normal form
> -------------------------------------------------------------------
>
>                 Key: HIVE-24458
>                 URL: https://issues.apache.org/jira/browse/HIVE-24458
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>            Priority: Major
>
> For some use cases, it is useful to have access to the SArg expression in a non-normalized form. Currently, the SArg only provides the fully normalized expression.
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=518189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518189 ]

ASF GitHub Bot logged work on HIVE-24436:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 00:14
            Start Date: 01/Dec/20 00:14
    Worklog Time Spent: 10m

Work Description: wangyum commented on pull request #1715:
URL: https://github.com/apache/hive/pull/1715#issuecomment-736131810

This is for the master branch: https://github.com/apache/hive/pull/1722

Issue Time Tracking
-------------------
    Worklog Id: (was: 518189)
    Time Spent: 1h 20m (was: 1h 10m)

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> -----------------------------------------------
>
>                 Key: HIVE-24436
>                 URL: https://issues.apache.org/jira/browse/HIVE-24436
>             Project: Hive
>          Issue Type: Improvement
>          Components: Avro
>    Affects Versions: 2.3.8
>            Reporter: Yuming Wang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Exception 1:
> {noformat}
> - create hive serde table with Catalog
> *** RUN ABORTED ***
> java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)'
> 	at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
> 	at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
> 	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
> 	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
> 	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
> {noformat}
> Exception 2:
> {noformat}
> - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
> org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode;
> 	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
> 	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
> 	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
> 	at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
> 	at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
> 	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
> {noformat}
[jira] [Updated] (HIVE-24456) Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
[ https://issues.apache.org/jira/browse/HIVE-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-24456:
----------------------------------
    Labels: pull-request-available  (was: )

> Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
> --------------------------------------------------------------------------------
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=518188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518188 ]

ASF GitHub Bot logged work on HIVE-24436:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 00:11
            Start Date: 01/Dec/20 00:11
    Worklog Time Spent: 10m

Work Description: wangyum opened a new pull request #1722:
URL: https://github.com/apache/hive/pull/1722

### What changes were proposed in this pull request?

This PR replaces `null` with `JsonProperties.NULL_VALUE` to fix two compatibility issues:

1. `java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)'`
```
- create hive serde table with Catalog
*** RUN ABORTED ***
java.lang.NoSuchMethodError: 'void org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, java.lang.String, org.codehaus.jackson.JsonNode)'
	at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
	at org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
```
2. `org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode`
```
- alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.avro.AvroRuntimeException: Unknown datum class: class org.codehaus.jackson.node.NullNode;
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
	at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
	at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
	at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
```

### Why are the changes needed?

For compatibility with Avro 1.9.x and Avro 1.10.0.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Build and run a Spark test:
```
mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.execution.HiveDDLSuite test -pl sql/hive
```

Issue Time Tracking
-------------------
    Worklog Id: (was: 518188)
    Time Spent: 1h 10m (was: 1h)

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> -----------------------------------------------
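The incompatibility this PR works around can be shown without Avro on the classpath. Older Avro exposed a `Schema.Field` constructor taking a Jackson `JsonNode` default value; Avro 1.9+ removed it in favour of an `Object`-typed default plus the `JsonProperties.NULL_VALUE` sentinel, so code compiled against the old signature fails with `NoSuchMethodError`, and passing a Jackson `NullNode` to the new API fails with "Unknown datum class". The `AvroFieldCompat` class below is a hypothetical stand-in (not Avro's API) showing why a dedicated sentinel is needed to distinguish "default is null" from "no default":

```java
/**
 * Self-contained sketch: NULL_VALUE mimics org.apache.avro.JsonProperties.NULL_VALUE,
 * a sentinel meaning "the field's default is JSON null". Using plain Java null
 * for that purpose would be ambiguous, because null already means "no default".
 */
public class AvroFieldCompat {

  /** Stand-in for JsonProperties.NULL_VALUE. */
  static final Object NULL_VALUE = new Object() {
    @Override public String toString() { return "null"; }
  };

  /** Avro 1.9+ style: the default value is an Object, not a Jackson JsonNode. */
  static String describeField(String name, Object defaultValue) {
    if (defaultValue == NULL_VALUE) return name + " (default: null)";
    if (defaultValue == null)       return name + " (no default)";
    return name + " (default: " + defaultValue + ")";
  }

  public static void main(String[] args) {
    // Passing the sentinel, not a Jackson NullNode, keeps the two meanings distinct.
    System.out.println(describeField("extra_col", NULL_VALUE));
    System.out.println(describeField("id", null));
  }
}
```

This is the essence of the one-line fix in the PR body above: wherever Hive builds an Avro field whose default should be JSON null, it must hand the new constructor the sentinel object rather than `null` or a Jackson node.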
[jira] [Work logged] (HIVE-24456) Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
[ https://issues.apache.org/jira/browse/HIVE-24456?focusedWorklogId=518187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518187 ]

ASF GitHub Bot logged work on HIVE-24456:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 01/Dec/20 00:11
            Start Date: 01/Dec/20 00:11
    Worklog Time Spent: 10m

Work Description: saihemanth-cloudera opened a new pull request #1721:
URL: https://github.com/apache/hive/pull/1721

…lumn masking should be done with SHA512.

### What changes were proposed in this pull request?

Column masking encoding is changed to SHA512 if FIPS mode is enabled.

### Why are the changes needed?

For better security in FIPS mode.

### Does this PR introduce _any_ user-facing change?

Yes. The user should include the following property in hive-site.xml to indicate that FIPS mode is enabled in Hive:

<property>
  <name>hive.masking.algo</name>
  <value>sha512</value>
</property>

### How was this patch tested?

On a local cluster.

Issue Time Tracking
-------------------
    Worklog Id: (was: 518187)
    Remaining Estimate: 0h
    Time Spent: 10m

> Column masking/hashing function in hive should use SH512 if FIPS mode is enabled
> --------------------------------------------------------------------------------
[jira] [Updated] (HIVE-24456) Column masking/hashing function in hive should use SHA512 if FIPS mode is enabled
[ https://issues.apache.org/jira/browse/HIVE-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Hemanth Gantasala updated HIVE-24456: - Description: hive-site.xml should have the following property to indicate that FIPS mode is enabled. hive.masking.algo sha512 If this property is present, then GenericUDFMaskHash should use SHA512 instead of SHA256 encoding for column masking. was: hive-site.xml should have the following property to indicate that FIPS mode is enabled. hive.masking.algo sha256 If this property is present, then GenericUDFMaskHash should use SHA512 instead of SHA256 encoding for column masking. > Column masking/hashing function in hive should use SH512 if FIPS mode is > enabled > > > Key: HIVE-24456 > URL: https://issues.apache.org/jira/browse/HIVE-24456 > Project: Hive > Issue Type: Wish > Components: HiveServer2 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > > hive-site.xml should have the following property to indicate that FIPS mode > is enabled. > > hive.masking.algo > sha512 > > If this property is present, then GenericUDFMaskHash should use SHA512 > instead of SHA256 encoding for column masking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24456) Column masking/hashing function in hive should use SHA512 if FIPS mode is enabled
[ https://issues.apache.org/jira/browse/HIVE-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Hemanth Gantasala reassigned HIVE-24456: > Column masking/hashing function in hive should use SH512 if FIPS mode is > enabled > > > Key: HIVE-24456 > URL: https://issues.apache.org/jira/browse/HIVE-24456 > Project: Hive > Issue Type: Wish > Components: HiveServer2 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > > hive-site.xml should have the following property to indicate that FIPS mode > is enabled. > > hive.masking.algo > sha256 > > If this property is present, then GenericUDFMaskHash should use SHA512 > instead of SHA256 encoding for column masking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
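The FIPS behavior requested in HIVE-24456 (switch to SHA-512 when hive.masking.algo is set to sha512) can be sketched with the standard java.security.MessageDigest API. This is an illustrative helper only, not the actual GenericUDFMaskHash implementation; the method name and the way the configured value is passed in are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MaskHashSketch {

    // Hypothetical helper: picks the digest based on the configured value of
    // hive.masking.algo ("sha512" when FIPS mode is enabled, otherwise the
    // existing SHA-256 behavior) and returns the lowercase hex digest.
    public static String maskHash(String value, String configuredAlgo) {
        try {
            String jceName = "sha512".equalsIgnoreCase(configuredAlgo) ? "SHA-512" : "SHA-256";
            MessageDigest md = MessageDigest.getInstance(jceName);
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // SHA-512 produces a 64-byte digest (128 hex chars); SHA-256 a 32-byte one.
        System.out.println(maskHash("secret", "sha512").length());
        System.out.println(maskHash("secret", "sha256").length());
    }
}
```

Both algorithm names are standard JCE names, so no extra dependency is needed for this sketch.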
[jira] [Updated] (HIVE-24455) Fix broken junit framework in storage-api
[ https://issues.apache.org/jira/browse/HIVE-24455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24455: -- Labels: pull-request-available (was: ) > Fix broken junit framework in storage-api > - > > Key: HIVE-24455 > URL: https://issues.apache.org/jira/browse/HIVE-24455 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The use of junit is broken in storage-api. It results in no tests being found. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24455) Fix broken junit framework in storage-api
[ https://issues.apache.org/jira/browse/HIVE-24455?focusedWorklogId=518157=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518157 ] ASF GitHub Bot logged work on HIVE-24455: - Author: ASF GitHub Bot Created on: 30/Nov/20 22:38 Start Date: 30/Nov/20 22:38 Worklog Time Spent: 10m Work Description: omalley opened a new pull request #1720: URL: https://github.com/apache/hive/pull/1720 Update the storage-api surefire plugin version to match the rest of hive. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518157) Remaining Estimate: 0h Time Spent: 10m > Fix broken junit framework in storage-api > - > > Key: HIVE-24455 > URL: https://issues.apache.org/jira/browse/HIVE-24455 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The use of junit is broken in storage-api. It results in no tests being found. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24455) Fix broken junit framework in storage-api
[ https://issues.apache.org/jira/browse/HIVE-24455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HIVE-24455: > Fix broken junit framework in storage-api > - > > Key: HIVE-24455 > URL: https://issues.apache.org/jira/browse/HIVE-24455 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > The use of junit is broken in storage-api. It results in no tests being found. -- This message was sent by Atlassian Jira (v8.3.4#803005)
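The fix described in PR #1720 ("update the storage-api surefire plugin version to match the rest of hive") would be a small pom.xml change along these lines; the property name shown is a placeholder for however the parent pom manages the version, not the exact patch:

```xml
<!-- Illustrative only: pin storage-api's surefire to the same version the
     rest of Hive uses so that JUnit tests are actually discovered. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>${maven.surefire.version}</version>
</plugin>
```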
[jira] [Updated] (HIVE-24144) getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value
[ https://issues.apache.org/jira/browse/HIVE-24144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24144: --- Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value > > > Key: HIVE-24144 > URL: https://issues.apache.org/jira/browse/HIVE-24144 > Project: Hive > Issue Type: Bug > Components: JDBC, JDBC storage handler >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {code} > public String getIdentifierQuoteString() throws SQLException { > return " "; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
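For context on why returning " " is incorrect: the JDBC contract says getIdentifierQuoteString returns the actual quoting character, and a single space specifically means identifier quoting is not supported. Since HiveQL quotes identifiers with backticks, a corrected sketch would look like the following (assuming the merged patch adopts the backtick; the simplified class below is not the actual HiveDatabaseMetaData):

```java
public class HiveDatabaseMetaDataSketch {
    // HiveQL quotes identifiers with backticks, e.g. SELECT `col` FROM `tbl`.
    // The old behavior of returning " " told JDBC clients that identifier
    // quoting is not supported at all.
    public static String getIdentifierQuoteString() {
        return "`";
    }

    public static void main(String[] args) {
        System.out.println("quote string: " + getIdentifierQuoteString());
    }
}
```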
[jira] [Work logged] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches
[ https://issues.apache.org/jira/browse/HIVE-24450?focusedWorklogId=518138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518138 ] ASF GitHub Bot logged work on HIVE-24450: - Author: ASF GitHub Bot Created on: 30/Nov/20 21:53 Start Date: 30/Nov/20 21:53 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1718: URL: https://github.com/apache/hive/pull/1718#issuecomment-736078852 @belugabehr Although I am not very familiar with this area, what happens if multiple HMS are running in HA? Wouldn't this solution mean that potentially the order of the notification events will change? Two HMS are running: HMS 1 gets id range 1-10, HMS 2 gets 11-20. Then openTxn notification goes to HMS 2 and allocateWriteId notification goes to HMS 1. The sequence of the ids will not represent the sequence of the events. Wouldn't this mess up acid table replication? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518138) Time Spent: 20m (was: 10m) > DbNotificationListener Request Notification IDs in Batches > -- > > Key: HIVE-24450 > URL: https://issues.apache.org/jira/browse/HIVE-24450 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Every time a new notification event is logged into the database, the sequence > number for the ID of the event is incremented by one. It is very standard in > database design to instead request a block of IDs for each fetch from the > database. The sequence numbers are then handed out locally until the block > of IDs is exhausted. 
This allows for fewer database round-trips and > transactions, at the expense of perhaps burning a few IDs. > Burning of IDs happens when the server is restarted in the middle of a block > of sequence IDs. That is, if the HMS requests a block of 10 ids, and only > three have been assigned, after the restart, the HMS will request another > block of 10, burning (wasting) 7 IDs. As long as the blocks are not too > small, and restarts are infrequent, then few IDs are lost. -- This message was sent by Atlassian Jira (v8.3.4#803005)
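The batching scheme described above (fetch a block of IDs, hand them out locally, accept burning the unused remainder on restart) can be sketched as follows. The class and the AtomicLong stand-in for the database sequence are illustrative only, not the DbNotificationListener code:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BlockIdAllocator {
    private final int blockSize;
    private final AtomicLong dbSequence; // stand-in for the database sequence
    private long next;      // next ID to hand out locally
    private long blockEnd;  // exclusive end of the current block

    public BlockIdAllocator(int blockSize, AtomicLong dbSequence) {
        this.blockSize = blockSize;
        this.dbSequence = dbSequence;
        this.next = 0;
        this.blockEnd = 0;
    }

    // One "database round-trip" per blockSize IDs instead of one per ID.
    public synchronized long nextId() {
        if (next >= blockEnd) {
            next = dbSequence.getAndAdd(blockSize); // fetch a fresh block
            blockEnd = next + blockSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong seq = new AtomicLong(1);
        BlockIdAllocator allocator = new BlockIdAllocator(10, seq);
        for (int i = 0; i < 3; i++) {
            System.out.println(allocator.nextId());
        }
        // A restart here would discard IDs 4-10, burning 7 IDs,
        // exactly as the description above explains.
    }
}
```

Note this sketch also reproduces the concern raised in the review comment: with two allocators (two HMS instances) sharing the sequence, IDs are no longer handed out in global event order.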
[jira] [Work logged] (HIVE-24144) getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value
[ https://issues.apache.org/jira/browse/HIVE-24144?focusedWorklogId=518139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518139 ] ASF GitHub Bot logged work on HIVE-24144: - Author: ASF GitHub Bot Created on: 30/Nov/20 21:53 Start Date: 30/Nov/20 21:53 Worklog Time Spent: 10m Work Description: jcamachor merged pull request #1487: URL: https://github.com/apache/hive/pull/1487 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518139) Time Spent: 1h (was: 50m) > getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value > > > Key: HIVE-24144 > URL: https://issues.apache.org/jira/browse/HIVE-24144 > Project: Hive > Issue Type: Bug > Components: JDBC, JDBC storage handler >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > {code} > public String getIdentifierQuoteString() throws SQLException { > return " "; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24073) Execution exception in sort-merge semijoin
[ https://issues.apache.org/jira/browse/HIVE-24073?focusedWorklogId=518137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518137 ] ASF GitHub Bot logged work on HIVE-24073: - Author: ASF GitHub Bot Created on: 30/Nov/20 21:52 Start Date: 30/Nov/20 21:52 Worklog Time Spent: 10m Work Description: jcamachor commented on pull request #1476: URL: https://github.com/apache/hive/pull/1476#issuecomment-736078379 @maheshk114 , is there any work remaining here? It seems there are result changes. Are these correct? Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518137) Time Spent: 0.5h (was: 20m) > Execution exception in sort-merge semijoin > -- > > Key: HIVE-24073 > URL: https://issues.apache.org/jira/browse/HIVE-24073 > Project: Hive > Issue Type: Bug > Components: Operators >Reporter: Jesus Camacho Rodriguez >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Working on HIVE-24041, we trigger an additional SJ conversion that leads to > this exception at execution time: > {code} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite > nextKeyWritables[1] > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1063) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:685) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707) > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:462) > ... 16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite > nextKeyWritables[1] > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1037) > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1060) > ... 22 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to > overwrite nextKeyWritables[1] > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:564) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:243) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887) > at > org.apache.hadoop.hive.ql.exec.TezDummyStoreOperator.process(TezDummyStoreOperator.java:49) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887) > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1003) > at > org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1020) > ... 23 more > {code} > To reproduce, just set {{hive.auto.convert.sortmerge.join}} to {{true}} in > the last query in {{auto_sortmerge_join_10.q}} after HIVE-24041 has been > merged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24453) Direct SQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24453: --- Status: Patch Available (was: Open) > Direct SQL error when parsing create_time value for database > > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {code} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24453) Direct SQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?focusedWorklogId=518130=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518130 ] ASF GitHub Bot logged work on HIVE-24453: - Author: ASF GitHub Bot Created on: 30/Nov/20 21:22 Start Date: 30/Nov/20 21:22 Worklog Time Spent: 10m Work Description: jcamachor opened a new pull request #1719: URL: https://github.com/apache/hive/pull/1719 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518130) Remaining Estimate: 0h Time Spent: 10m > Direct SQL error when parsing create_time value for database > > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {code} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24453) Direct SQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24453: -- Labels: pull-request-available (was: ) > Direct SQL error when parsing create_time value for database > > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {code} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24453) DirectSQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24453: --- Description: HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. Although the value for that field is always set after that patch, the value could be null if the database was created before the feature went in. DirectSQL should check for null value before parsing the integer, otherwise we hit an exception and fallback to ORM path: {code} 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this is not an error): null at org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) at org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) {code} was: HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. Although the value for that field is always set after that patch, the value could be null if the database was created before the feature went in. 
DirectSQL should check for null value before parsing the integer, otherwise we hit an exception and fallback to ORM path: {noformat} 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this is not an error): null at org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) at org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) {noformat} > DirectSQL error when parsing create_time value for database > --- > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {code} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24453) Direct SQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24453: --- Summary: Direct SQL error when parsing create_time value for database (was: DirectSQL error when parsing create_time value for database) > Direct SQL error when parsing create_time value for database > > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {code} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24453) DirectSQL error when parsing create_time value for database
[ https://issues.apache.org/jira/browse/HIVE-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-24453: -- > DirectSQL error when parsing create_time value for database > --- > > Key: HIVE-24453 > URL: https://issues.apache.org/jira/browse/HIVE-24453 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > HIVE-21077 introduced a {{create_time}} field for {{DBS}} table in HMS. > Although the value for that field is always set after that patch, the value > could be null if the database was created before the feature went in. > DirectSQL should check for null value before parsing the integer, otherwise > we hit an exception and fallback to ORM path: > {noformat} > 2020-11-28 09:06:05,414 WARN org.apache.hadoop.hive.metastore.ObjectStore: > [pool-8-thread-194]: Falling back to ORM path due to direct SQL failure (this > is not an error): null at > org.apache.hadoop.hive.metastore.MetastoreDirectSqlUtils.extractSqlInt(MetastoreDirectSqlUtils.java:251) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:420) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:839) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
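The fix described in HIVE-24453 amounts to a null guard before the integer conversion, so a pre-HIVE-21077 database row no longer throws and forces the fallback to the slower ORM path. The helper below is an illustrative stand-in for MetastoreDirectSqlUtils.extractSqlInt, not the actual patch; the default-value parameter is an assumption about how the missing create_time might be handled:

```java
public class SqlIntExtractor {
    // Databases created before HIVE-21077 have no CREATE_TIME value, so the
    // raw column value coming back from direct SQL may be null; converting it
    // blindly throws the exception shown in the log above.
    public static int extractSqlInt(Object value, int defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        return ((Number) value).intValue();
    }

    public static void main(String[] args) {
        System.out.println(extractSqlInt(null, 0));        // pre-upgrade row
        System.out.println(extractSqlInt(1606554365L, 0)); // epoch seconds from the DB
    }
}
```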
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518119 ] ASF GitHub Bot logged work on HIVE-23980: - Author: ASF GitHub Bot Created on: 30/Nov/20 20:26 Start Date: 30/Nov/20 20:26 Worklog Time Spent: 10m Work Description: viirya commented on a change in pull request #1356: URL: https://github.com/apache/hive/pull/1356#discussion_r532881891 ## File path: itests/hive-blobstore/pom.xml ## @@ -55,33 +55,33 @@ org.apache.hive - hive-metastore + hive-exec ${project.version} test org.apache.hive hive-metastore ${project.version} - tests test org.apache.hive - hive-it-unit + hive-metastore Review comment: Oh, I don't change it actually. This diff looks like I add new `hive-metastore` dependency but in the original pom.xml, it already includes two `hive-metastore`, one is without classifier and one is with `tests` classifier. Just the way git showing diff confusing readers. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518119) Time Spent: 5h 10m (was: 5h) > Shade guava from existing Hive versions > --- > > Key: HIVE-23980 > URL: https://issues.apache.org/jira/browse/HIVE-23980 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.7 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23980.01.branch-2.3.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > I'm trying to upgrade Guava version in Spark. The JIRA ticket is SPARK-32502. 
> Running test hits an error: > {code} > sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: > tried to access method > com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; > from class org.apache.hadoop.hive.ql.exec.FetchOperator > at > org.apache.hadoop.hive.ql.exec.FetchOperator.(FetchOperator.java:108) > at > org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) > {code} > I know that hive-exec doesn't shade Guava until HIVE-22126 but that work > targets 4.0.0. I'm wondering if there is a solution for current Hive > versions, e.g. Hive 2.3.7? Any ideas? > Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518111=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518111 ] ASF GitHub Bot logged work on HIVE-23980: - Author: ASF GitHub Bot Created on: 30/Nov/20 20:17 Start Date: 30/Nov/20 20:17 Worklog Time Spent: 10m Work Description: viirya commented on pull request #1356: URL: https://github.com/apache/hive/pull/1356#issuecomment-736018282 > Thanks @viirya ! the new PR looks almost good to me except one nit. > > Also comparing to the original patch, we don't have changes to `HiveRelDecorrelator`, `HiveAggregate` and `HiveSubQueryRemoveRule`. This is unnecessary because we've shaded Guava within `hive-exec`? (some of the APIs like `operandJ` do not exist in the Calcite version used by branch-2.3 also). The change to `HiveAggregate` just to remove unused parameter `groupSets` in `deriveRowType`. Not related to shading guava, so I don't apply it. The change from `operand` to `operandJ` in `HiveSubQueryRemoveRule` and `HiveRelDecorrelator`, cannot apply to branch-2.3 because `operandJ` is not in calcite 1.10.0. The API was add since calcite 1.17.0 (https://github.com/apache/calcite/commit/d59b639d/). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518111) Time Spent: 5h (was: 4h 50m) > Shade guava from existing Hive versions > --- > > Key: HIVE-23980 > URL: https://issues.apache.org/jira/browse/HIVE-23980 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.7 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23980.01.branch-2.3.patch > > Time Spent: 5h > Remaining Estimate: 0h > > I'm trying to upgrade Guava version in Spark. The JIRA ticket is SPARK-32502. 
> Running test hits an error: > {code} > sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: > tried to access method > com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; > from class org.apache.hadoop.hive.ql.exec.FetchOperator > at > org.apache.hadoop.hive.ql.exec.FetchOperator.(FetchOperator.java:108) > at > org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) > {code} > I know that hive-exec doesn't shade Guava until HIVE-22126 but that work > targets 4.0.0. I'm wondering if there is a solution for current Hive > versions, e.g. Hive 2.3.7? Any ideas? > Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518102=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518102 ] ASF GitHub Bot logged work on HIVE-23980: - Author: ASF GitHub Bot Created on: 30/Nov/20 19:52 Start Date: 30/Nov/20 19:52 Worklog Time Spent: 10m Work Description: sunchao commented on a change in pull request #1356: URL: https://github.com/apache/hive/pull/1356#discussion_r532848370 ## File path: itests/hive-blobstore/pom.xml ## @@ -55,33 +55,33 @@ org.apache.hive - hive-metastore + hive-exec ${project.version} test org.apache.hive hive-metastore ${project.version} - tests test org.apache.hive - hive-it-unit + hive-metastore Review comment: hmm why we need two `hive-metastore` dependencies (one with tests classifier)? I don't see this in the original patch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518102) Time Spent: 4h 50m (was: 4h 40m) > Shade guava from existing Hive versions > --- > > Key: HIVE-23980 > URL: https://issues.apache.org/jira/browse/HIVE-23980 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.7 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23980.01.branch-2.3.patch > > Time Spent: 4h 50m > Remaining Estimate: 0h > > I'm trying to upgrade Guava version in Spark. The JIRA ticket is SPARK-32502. 
> Running test hits an error: > {code} > sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: > tried to access method > com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; > from class org.apache.hadoop.hive.ql.exec.FetchOperator > at > org.apache.hadoop.hive.ql.exec.FetchOperator.(FetchOperator.java:108) > at > org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) > {code} > I know that hive-exec doesn't shade Guava until HIVE-22126 but that work > targets 4.0.0. I'm wondering if there is a solution for current Hive > versions, e.g. Hive 2.3.7? Any ideas? > Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24452) Add a generic JDBC implementation that can be used for other JDBC DBs
[ https://issues.apache.org/jira/browse/HIVE-24452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240994#comment-17240994 ] David Mollitor commented on HIVE-24452: --- Please consider using an abstraction layer that deals with the different vendors for you. http://www.jooq.org/ https://blog.mybatis.org/ https://www.eclipse.org/eclipselink/ https://hibernate.org/ > Add a generic JDBC implementation that can be used for other JDBC DBs > > > Key: HIVE-24452 > URL: https://issues.apache.org/jira/browse/HIVE-24452 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Priority: Major > > Currently, we added a custom provider for each of the JDBC DBs supported by > Hive (MySQL, Postgres, MSSQL (pending), Oracle (pending) and Derby (pending)). > But if there are other JDBC providers we want to add support for, adding a > generic JDBC provider that Hive can default to would be useful. > This means: > 1) We have to support a means to indicate that a connector is for a JDBC > datasource. So maybe add a property in DCPROPERTIES on the connector to indicate > that the datasource supports JDBC. > 2) If there is no custom connector for a data source, use the > GenericJDBCDatasource connector that is to be added as part of this jira. -- This message was sent by Atlassian Jira (v8.3.4#803005)
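The two-step proposal in HIVE-24452 above (a DCPROPERTIES flag marking a datasource as JDBC-capable, plus a generic fallback provider) might look roughly like the sketch below. All class and property names here are hypothetical illustrations, not Hive's actual connector API.

```java
import java.util.Map;

// Illustrative sketch only: ConnectorProviderFactory, JDBC_TYPE_PROP, and the
// provider names are invented for this example and do not exist in Hive.
class ConnectorProviderFactory {
    // Step 1 from the proposal: a DCPROPERTIES entry marking the datasource as JDBC-capable.
    static final String JDBC_TYPE_PROP = "connector.jdbc.type";

    static String providerFor(String dbType, Map<String, String> dcProperties) {
        switch (dbType.toLowerCase()) {
            case "mysql":    return "MySQLConnectorProvider";
            case "postgres": return "PostgresConnectorProvider";
            default:
                // Step 2: no custom provider exists, so fall back to the generic
                // JDBC provider if DCPROPERTIES says the datasource speaks JDBC.
                if (dcProperties.containsKey(JDBC_TYPE_PROP)) {
                    return "GenericJDBCConnectorProvider";
                }
                throw new IllegalArgumentException("No connector provider for " + dbType);
        }
    }
}
```

The point of the design is that new JDBC-backed datasources work out of the box via the fallback branch, while vendors with quirks can still get a dedicated provider.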
[jira] [Commented] (HIVE-24448) Support case-sensitivity for tables in REMOTE database.
[ https://issues.apache.org/jira/browse/HIVE-24448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240973#comment-17240973 ] Naveen Gangam commented on HIVE-24448: -- I had made a test fix in SemanticAnalyzer.processTable() method to remove the conversion toLowerCase(). That fix resulted in 4 test failures. {noformat} Testing / split-08 / Archive / testCliDriver[reduce_deduplicate_null_keys] – org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver Testing / split-20 / Archive / testActiveSessionTimeMetrics – org.apache.hive.service.cli.session.TestSessionManagerMetrics Testing / split-17 / Archive / testCliDriver[cte_6] – org.apache.hadoop.hive.cli.split5.TestMiniLlapLocalCliDriver Testing / split-07 / Archive / testCliDriver[dynpart_sort_optimization] – org.apache.hadoop.hive.cli.split7.TestMiniLlapLocalCliDriver {noformat} All the 3 failures from the llap test driver are because of this assertion in the code. {noformat} java.lang.AssertionError at org.apache.hadoop.hive.ql.parse.QB.rewriteViewToSubq(QB.java:256) at org.apache.hadoop.hive.ql.parse.QB.rewriteCTEToSubq(QB.java:264) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.addCTEAsSubQuery(SemanticAnalyzer.java:1337) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2202) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2142) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12403) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12507) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:302) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:302) 
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:469) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:385) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:379) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) at sun.reflect.GeneratedMethodAccessor171.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at
[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=518081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518081 ] ASF GitHub Bot logged work on HIVE-24433: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:49 Start Date: 30/Nov/20 18:49 Worklog Time Spent: 10m Work Description: nareshpr commented on a change in pull request #1712: URL: https://github.com/apache/hive/pull/1712#discussion_r532821552 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -2725,7 +2725,7 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn } String dbName = normalizeCase(lc.getDbname()); String tblName = normalizeCase(lc.getTablename()); - String partName = normalizeCase(lc.getPartitionname()); + String partName = lc.getPartitionname(); Review comment: Yes, I validated the below 4 SQLs with my patch; partition(key=value) is already normalized: insert into table abc PARTITION(CitY='Bangalore') values('Dan'); insert overwrite table abc partition(CiTy='Bangalore') select Name from abc; update table abc set Name='xy' where CiTy='Bangalore'; delete from abc where CiTy='Bangalore'; This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518081) Time Spent: 1h 10m (was: 1h) > AutoCompaction is not getting triggered for CamelCase Partition Values > -- > > Key: HIVE-24433 > URL: https://issues.apache.org/jira/browse/HIVE-24433 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > PartitionKeyValue is getting converted into lowerCase in the below 2 places.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728] > [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851] > Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries > with the proper partition values. > When the query completes, the entry moves from TXN_COMPONENTS to > COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the > partition & considers it an invalid partition. > {code:java} > create table abc(name string) partitioned by(city string) stored as orc > tblproperties('transactional'='true'); > insert into abc partition(city='Bangalore') values('aaa'); > {code} > Example entry in COMPLETED_TXN_COMPONENTS: > {noformat} > +---+--++---+-+-+---+ > | CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION | > CTC_TIMESTAMP | CTC_WRITEID | CTC_UPDATE_DELETE | > +---+--++---+-+-+---+ > | 2 | default | abc | city=bangalore | 2020-11-25 09:26:59 > | 1 | N | > +---+--++---+-+-+---+ > {noformat} > > AutoCompaction fails to get triggered with the below error: > {code:java} > 2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(98)) - Checking to see if we should compact > default.abc.city=bangalore > 2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(155)) - Can't find partition > default.compaction_test.city=bangalore, assuming it has been dropped and > moving on{code} > I verified the below 4 SQLs with my PR; they all produced the correct > PartitionKeyValue, > i.e., COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore" > {code:java} > insert into table abc PARTITION(CitY='Bangalore') values('Dan'); > insert overwrite table abc partition(CiTy='Bangalore') select Name from abc; > update table abc set Name='xy' where CiTy='Bangalore'; > delete from abc where CiTy='Bangalore';{code} -- This message
was sent by Atlassian Jira (v8.3.4#803005)
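The one-line fix discussed in HIVE-24433 above (normalize the database and table names, but leave the case of the partition name's values alone) can be sketched as follows. This is an illustrative model only; `normalizeCase` here is a stand-in for TxnHandler's private helper, and the class is invented for this example.

```java
// Illustrative sketch of the HIVE-24433 fix: db/table names are case-insensitive
// in the metastore and get lower-cased, but the partition name ("city=Bangalore")
// must keep the user-supplied case of its values so the Cleaner/Initiator can
// find the partition again.
class PartitionCase {
    // Stand-in for TxnHandler's private normalizeCase helper.
    static String normalizeCase(String s) {
        return s == null ? null : s.toLowerCase();
    }

    static String[] txnComponentKeys(String db, String table, String partName) {
        return new String[] {
            normalizeCase(db),     // e.g. "Default" -> "default"
            normalizeCase(table),  // e.g. "ABC"     -> "abc"
            partName               // "city=Bangalore" stays as-is (the fix)
        };
    }
}
```

Before the fix, the third entry was also passed through `normalizeCase`, producing "city=bangalore" in TXN_COMPONENTS and breaking the auto-compaction lookup.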
[jira] [Work logged] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches
[ https://issues.apache.org/jira/browse/HIVE-24450?focusedWorklogId=518075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518075 ] ASF GitHub Bot logged work on HIVE-24450: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:36 Start Date: 30/Nov/20 18:36 Worklog Time Spent: 10m Work Description: belugabehr opened a new pull request #1718: URL: https://github.com/apache/hive/pull/1718 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518075) Remaining Estimate: 0h Time Spent: 10m > DbNotificationListener Request Notification IDs in Batches > -- > > Key: HIVE-24450 > URL: https://issues.apache.org/jira/browse/HIVE-24450 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Every time a new notification event is logged into the database, the sequence > number for the ID of the event is incremented by one. It is very standard in > database design to instead request a block of IDs for each fetch from the > database. The sequence numbers are then handed out locally until the block > of IDs is exhausted. This allows for fewer database round-trips and > transactions, at the expense of perhaps burning a few IDs. > Burning of IDs happens when the server is restarted in the middle of a block > of sequence IDs. That is, if the HMS requests a block of 10 IDs, and only > three have been assigned, after the restart, the HMS will request another > block of 10, burning (wasting) 7 IDs.
As long as the blocks are not too > small, and restarts are infrequent, then few IDs are lost. -- This message was sent by Atlassian Jira (v8.3.4#803005)
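The block-allocation scheme described in HIVE-24450 above can be sketched as follows. This is a minimal illustrative model, not Hive's actual DbNotificationListener code: the class and field names are invented, and an AtomicLong stands in for the metastore's backing sequence table.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of block-based ID allocation: one "database round-trip"
// (the getAndAdd call) reserves a whole block, and IDs are then handed out
// locally until the block is exhausted.
class NotificationIdAllocator {
    private final int blockSize;
    private final AtomicLong databaseSequence; // stand-in for the HMS sequence table
    private long next;  // next ID to hand out locally
    private long limit; // first ID beyond the current block

    NotificationIdAllocator(int blockSize, AtomicLong databaseSequence) {
        this.blockSize = blockSize;
        this.databaseSequence = databaseSequence;
        this.next = 0;
        this.limit = 0; // forces a fetch on first use
    }

    /** Hands out IDs locally, fetching a new block only when the current one is exhausted. */
    synchronized long nextId() {
        if (next >= limit) {
            // One round-trip reserves blockSize IDs; a restart at this point
            // burns whatever IDs of the old block were never handed out.
            next = databaseSequence.getAndAdd(blockSize);
            limit = next + blockSize;
        }
        return next++;
    }
}
```

Constructing a fresh allocator models a server restart: the unused remainder of the previous block is simply skipped, which is the "burning" trade-off the description calls out.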
[jira] [Updated] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches
[ https://issues.apache.org/jira/browse/HIVE-24450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24450: -- Labels: pull-request-available (was: ) > DbNotificationListener Request Notification IDs in Batches > -- > > Key: HIVE-24450 > URL: https://issues.apache.org/jira/browse/HIVE-24450 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Every time a new notification event is logged into the database, the sequence > number for the ID of the event is incremented by one. It is very standard in > database design to instead request a block of IDs for each fetch from the > database. The sequence numbers are then handed out locally until the block > of IDs is exhausted. This allows for fewer database round-trips and > transactions, at the expense of perhaps burning a few IDs. > Burning of IDs happens when the server is restarted in the middle of a block > of sequence IDs. That is, if the HMS requests a block of 10 IDs, and only > three have been assigned, after the restart, the HMS will request another > block of 10, burning (wasting) 7 IDs. As long as the blocks are not too > small, and restarts are infrequent, then few IDs are lost. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches
[ https://issues.apache.org/jira/browse/HIVE-24450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor reassigned HIVE-24450: - > DbNotificationListener Request Notification IDs in Batches > -- > > Key: HIVE-24450 > URL: https://issues.apache.org/jira/browse/HIVE-24450 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > > Every time a new notification event is logged into the database, the sequence > number for the ID of the event is incremented by one. It is very standard in > database design to instead request a block of IDs for each fetch from the > database. The sequence numbers are then handed out locally until the block > of IDs is exhausted. This allows for fewer database round-trips and > transactions, at the expense of perhaps burning a few IDs. > Burning of IDs happens when the server is restarted in the middle of a block > of sequence IDs. That is, if the HMS requests a block of 10 IDs, and only > three have been assigned, after the restart, the HMS will request another > block of 10, burning (wasting) 7 IDs. As long as the blocks are not too > small, and restarts are infrequent, then few IDs are lost. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=518070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518070 ] ASF GitHub Bot logged work on HIVE-24436: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:31 Start Date: 30/Nov/20 18:31 Worklog Time Spent: 10m Work Description: sunchao edited a comment on pull request #1715: URL: https://github.com/apache/hive/pull/1715#issuecomment-735961948 Yes @dongjoon-hyun , these test failures have been there since 2.3.7 release. I do plan to take a look at them later. @wangyum I believe the issue exists in the master branch as well? if so, can we make this PR against the master and backport to branch-2.3/branch-3.1 later once that is merged? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518070) Time Spent: 1h (was: 50m) > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro >Affects Versions: 2.3.8 >Reporter: Yuming Wang >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exception1: > {noformat} > - create hive serde table with Catalog > *** RUN ABORTED *** > java.lang.NoSuchMethodError: 'void > org.apache.avro.Schema$Field.(java.lang.String, org.apache.avro.Schema, > java.lang.String, org.codehaus.jackson.JsonNode)' > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76) > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170) > at > 
org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437) > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281) > at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263) > {noformat} > Exception2: > {noformat} > - alter hive serde table add columns -- partitioned - AVRO *** FAILED *** > org.apache.spark.sql.AnalysisException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.avro.AvroRuntimeException: Unknown datum class: class > org.codehaus.jackson.node.NullNode; > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112) > at > org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) > at > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) > at > org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346) > at > org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228) > at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue
[ https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=518067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518067 ] ASF GitHub Bot logged work on HIVE-24436: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:28 Start Date: 30/Nov/20 18:28 Worklog Time Spent: 10m Work Description: sunchao commented on pull request #1715: URL: https://github.com/apache/hive/pull/1715#issuecomment-735961948 Yes @dongjoon-hyun , these test failures have been there since 2.3.7 release. I do plan to take a look at them later. @wangyum I believe the issue exists in the master branch as well? if so, can we make this PR against the master and backport to branch-2.3/branch-3.1 later once that is merged? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518067) Time Spent: 50m (was: 40m) > Fix Avro NULL_DEFAULT_VALUE compatibility issue > --- > > Key: HIVE-24436 > URL: https://issues.apache.org/jira/browse/HIVE-24436 > Project: Hive > Issue Type: Improvement > Components: Avro >Affects Versions: 2.3.8 >Reporter: Yuming Wang >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Exception1: > {noformat} > - create hive serde table with Catalog > *** RUN ABORTED *** > java.lang.NoSuchMethodError: 'void > org.apache.avro.Schema$Field.(java.lang.String, org.apache.avro.Schema, > java.lang.String, org.codehaus.jackson.JsonNode)' > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76) > at > org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170) > at >
org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114) > at > org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437) > at > org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281) > at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263) > {noformat} > Exception2: > {noformat} > - alter hive serde table add columns -- partitioned - AVRO *** FAILED *** > org.apache.spark.sql.AnalysisException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.avro.AvroRuntimeException: Unknown datum class: class > org.codehaus.jackson.node.NullNode; > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112) > at > org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) > at > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) > at > org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346) > at > org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228) > at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor resolved HIVE-24424. --- Fix Version/s: 4.0.0 Resolution: Fixed Merged to master. Thank you [~abstractdog] and [~mgergely] for the reviews!! > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable, -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId
[ https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=518066=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518066 ] ASF GitHub Bot logged work on HIVE-24424: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:20 Start Date: 30/Nov/20 18:20 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1704: URL: https://github.com/apache/hive/pull/1704 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518066) Time Spent: 1h 20m (was: 1h 10m) > Use PreparedStatements in DbNotificationListener getNextNLId > > > Key: HIVE-24424 > URL: https://issues.apache.org/jira/browse/HIVE-24424 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Simplify the code, remove debug logging concatenation, and make it more > readable, -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23980) Shade guava from existing Hive versions
[ https://issues.apache.org/jira/browse/HIVE-23980?focusedWorklogId=518061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518061 ] ASF GitHub Bot logged work on HIVE-23980: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:07 Start Date: 30/Nov/20 18:07 Worklog Time Spent: 10m Work Description: viirya commented on pull request #1356: URL: https://github.com/apache/hive/pull/1356#issuecomment-735950137 Internally we test this patch and pass all Spark tests. I think it gives us more confidence to have this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518061) Time Spent: 4h 40m (was: 4.5h) > Shade guava from existing Hive versions > --- > > Key: HIVE-23980 > URL: https://issues.apache.org/jira/browse/HIVE-23980 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.7 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23980.01.branch-2.3.patch > > Time Spent: 4h 40m > Remaining Estimate: 0h > > I'm trying to upgrade Guava version in Spark. The JIRA ticket is SPARK-32502. 
> Running test hits an error: > {code} > sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: > tried to access method > com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; > from class org.apache.hadoop.hive.ql.exec.FetchOperator > at > org.apache.hadoop.hive.ql.exec.FetchOperator.(FetchOperator.java:108) > at > org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) > {code} > I know that hive-exec doesn't shade Guava until HIVE-22126 but that work > targets 4.0.0. I'm wondering if there is a solution for current Hive > versions, e.g. Hive 2.3.7? Any ideas? > Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
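For reference, shading Guava as discussed in HIVE-23980 above generally takes the shape of a maven-shade-plugin relocation like the fragment below. This is only the general pattern, shown under the assumption of a standard shade-plugin setup; the actual HIVE-23980 patch for branch-2.3 differs in scope and detail.

```xml
<!-- Hedged sketch: relocate Guava classes bundled into hive-exec so they
     cannot clash with the Guava version on the application's classpath.
     The shadedPattern prefix here is illustrative, not the one Hive uses. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hive.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, bytecode references such as the failing `Iterators.emptyIterator()` call resolve against the bundled, renamed copy, so upgrading Guava in the consuming application (e.g. Spark) no longer triggers the IllegalAccessError quoted above.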
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518060 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:06 Start Date: 30/Nov/20 18:06 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532794904 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +FileSystem fs = locPath.getFileSystem(conf); +Map dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath); AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( Review comment: No, not with HIVE-24291. Without HIVE-24291 (which might not be usable if, for example, HMS schema changes are out of the question) we could still have a pileup of the same table/partition in "ready for cleaning" in the queue. Without this change (HIVE-24444) some of them might not be deleted. The goal of this change is that, when the table does get cleaned, all of the records will be deleted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518060) Time Spent: 3h 20m (was: 3h 10m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518059 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:04 Start Date: 30/Nov/20 18:04 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532794904 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +FileSystem fs = locPath.getFileSystem(conf); +Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath); AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( Review comment: No, not with HIVE-24291. Without HIVE-24291 (which might not be usable if, for example, HMS schema changes are out of the question) and without this change, we still could have a pileup of the same table/partition in "ready for cleaning" in the queue. Without this change (HIVE-24444) some of them might not be deleted. The goal of this change is that, when the table does get cleaned, all of the records will be deleted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518059) Time Spent: 3h 10m (was: 3h) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518057 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 18:00 Start Date: 30/Nov/20 18:00 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532776190 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa } fs.delete(dead, true); } -return true; +// Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this +// number reaches 0. +return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0; + } + + /** + * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning + * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to + * finish, e.g. because of an open transaction at the time of compaction. + * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are + * obsolete directories, then the Cleaner has more work to do. + * @param location location of table + * @return number of dirs left for the cleaner to clean – eventually + * @throws IOException + */ + private int getNumEventuallyObsoleteDirs(String location, Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots) + throws IOException { +ValidTxnList validTxnList = new ValidReadTxnList(); Review comment: Will we consider all writes as valid here? Shouldn't we limit at least by compactor txnId? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518057) Time Spent: 3h (was: 2h 50m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
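The "eventually obsolete" check described in the javadoc above (pretend every transaction has committed, then ask which directories a future cleaner run could delete) can be modeled as follows. This is an illustrative sketch, not Hive's `getNumEventuallyObsoleteDirs`; the base/delta write-id comparison stands in for the real `getAcidState()` call with an all-valid `ValidReadTxnList`:

```java
import java.util.List;

// Rough model of counting directories the Cleaner should eventually remove.
// Under the assumption that all txns are valid (no open readers anywhere),
// any delta whose writeId is covered by the compacted base is obsolete.
// If the count is nonzero after a clean, the queue entry must not yet be
// marked "cleaned" -- the cleaner still has work it could not finish.
public class EventuallyObsolete {

    static int numEventuallyObsoleteDirs(long baseWriteId, List<Long> deltaWriteIds) {
        int obsolete = 0;
        for (long w : deltaWriteIds) {
            if (w <= baseWriteId) obsolete++; // covered by base_<baseWriteId>
        }
        return obsolete;
    }

    public static void main(String[] args) {
        // base_5 exists, but delta_3 could not be deleted at clean time
        // (e.g. an open reader pinned it). It is still eventually obsolete,
        // so the entry stays "ready for cleaning". delta_7 is newer than the
        // base and is live data.
        System.out.println(numEventuallyObsoleteDirs(5, List.of(3L, 7L))); // prints 1
    }
}
```

This also makes deniskuzZ's question concrete: with an all-valid txn list, every write below the base is treated as deletable eventually, regardless of the compactor's own txnId.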
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518048 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 17:35 Start Date: 30/Nov/20 17:35 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532776190 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa } fs.delete(dead, true); } -return true; +// Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this +// number reaches 0. +return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0; + } + + /** + * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning + * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to + * finish, e.g. because of an open transaction at the time of compaction. + * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are + * obsolete directories, then the Cleaner has more work to do. + * @param location location of table + * @return number of dirs left for the cleaner to clean – eventually + * @throws IOException + */ + private int getNumEventuallyObsoleteDirs(String location, Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots) + throws IOException { +ValidTxnList validTxnList = new ValidReadTxnList(); Review comment: Will we consider all writes as valid here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518048) Time Spent: 2h 50m (was: 2h 40m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518046 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 17:31 Start Date: 30/Nov/20 17:31 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532772806 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +FileSystem fs = locPath.getFileSystem(conf); +Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath); AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( Review comment: what would happen if there is a long running read-only txn + multiple compaction attempts? are we going to have multiple records in a queue for the same table/partition pending cleanup? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518046) Time Spent: 2h 40m (was: 2.5h) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518045&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518045 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 17:30 Start Date: 30/Nov/20 17:30 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532772806 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +FileSystem fs = locPath.getFileSystem(conf); +Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath); AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( Review comment: what would happen if there is a long running read-only txn + multiple compaction attempts? are we going to have multiple records in a queue for the same table/partition pending cleanup? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518045) Time Spent: 2.5h (was: 2h 20m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=518043&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518043 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 17:18 Start Date: 30/Nov/20 17:18 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532764005 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,15 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +FileSystem fs = locPath.getFileSystem(conf); +Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath); AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( Review comment: could be replaced with the above fs variable This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 518043) Time Spent: 2h 20m (was: 2h 10m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say that for table_1, compaction1's cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files, and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue, but if it isn't usable (for example, if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24449) Implement connector provider for Derby DB
[ https://issues.apache.org/jira/browse/HIVE-24449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-24449: > Implement connector provider for Derby DB > - > > Key: HIVE-24449 > URL: https://issues.apache.org/jira/browse/HIVE-24449 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > > Provide an implementation of Connector provider for Derby DB. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24447) Move create/drop/alter table to the provider interface
[ https://issues.apache.org/jira/browse/HIVE-24447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-24447: > Move create/drop/alter table to the provider interface > -- > > Key: HIVE-24447 > URL: https://issues.apache.org/jira/browse/HIVE-24447 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > > The support for such operations on a table in a REMOTE database will be left > to the discretion of the providers to support/implement. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?focusedWorklogId=517979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517979 ] ASF GitHub Bot logged work on HIVE-24446: - Author: ASF GitHub Bot Created on: 30/Nov/20 14:55 Start Date: 30/Nov/20 14:55 Worklog Time Spent: 10m Work Description: kasakrisz opened a new pull request #1717: URL: https://github.com/apache/hive/pull/1717 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517979) Remaining Estimate: 0h Time Spent: 10m > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug > Components: Materialized views, Types >Affects Versions: 4.0.0 >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. 
> {code} > select > POSTHOOK: query: select > t.quartile, > t.quartile, > max(t.total_views) total > max(t.total_views) total > from wealth t2, > from wealth t2, > (select > (select > total_views `total_views`, > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > program > from tv_view_data) t > from tv_view_data) t > where t.program=t2.watches > where t.program=t2.watches > group by quartile > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
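The scale mismatch described in this issue (an explicit cast to decimal(9,4) being rewritten by the MV plan as decimal(12,1)) can be illustrated with plain `BigDecimal` arithmetic. This is not Hive code, just a sketch of why the trailing zeros disappear:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// A decimal(p,4) value should render with four fractional digits (1.5000),
// while the altered plan's decimal(12,1) keeps only one (1.5). The "padding"
// the Jira mentions is exactly this scale difference.
public class DecimalScale {

    // Render a decimal literal at a fixed scale; RoundingMode.UNNECESSARY is
    // safe here because we only ever increase the scale (pad with zeros) or
    // keep it the same.
    static String atScale(String literal, int scale) {
        return new BigDecimal(literal)
                .setScale(scale, RoundingMode.UNNECESSARY)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(atScale("1.5", 4)); // 1.5000 -- what the explicit decimal(9,4) cast implies
        System.out.println(atScale("1.5", 1)); // 1.5    -- what the rewritten decimal(12,1) cast produces
    }
}
```

Both values compare equal numerically; only the declared scale, and therefore the rendered result set, differs.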
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24446: -- Labels: pull-request-available (was: ) > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug > Components: Materialized views, Types >Affects Versions: 4.0.0 >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. > {code} > select > POSTHOOK: query: select > t.quartile, > t.quartile, > max(t.total_views) total > max(t.total_views) total > from wealth t2, > from wealth t2, > (select > (select > total_views `total_views`, > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > program > from tv_view_data) t > from tv_view_data) t > where t.program=t2.watches > where t.program=t2.watches > group by quartile > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24446: -- Component/s: Materialized views > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug > Components: Materialized views >Affects Versions: 4.0.0 >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. > {code} > select > POSTHOOK: query: select > t.quartile, > t.quartile, > max(t.total_views) total > max(t.total_views) total > from wealth t2, > from wealth t2, > (select > (select > total_views `total_views`, > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > program > from tv_view_data) t > from tv_view_data) t > where t.program=t2.watches > where t.program=t2.watches > group by quartile > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24446: -- Component/s: Types > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug > Components: Materialized views, Types >Affects Versions: 4.0.0 >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. > {code} > select > POSTHOOK: query: select > t.quartile, > t.quartile, > max(t.total_views) total > max(t.total_views) total > from wealth t2, > from wealth t2, > (select > (select > total_views `total_views`, > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > program > from tv_view_data) t > from tv_view_data) t > where t.program=t2.watches > where t.program=t2.watches > group by quartile > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24446: -- Description: {code:java} create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES ('transactional'='true') as select total_views `total_views`, sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, program from tv_view_data; {code} {code:java} LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], table:alias=[mv_tv_view_data_av1]) {code} Some constant decimal values are not padded in the result set. {code} select POSTHOOK: query: select t.quartile, t.quartile, max(t.total_views) total max(t.total_views) total from wealth t2, from wealth t2, (select (select total_views `total_views`, total_views `total_views`, sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, program program from tv_view_data) t from tv_view_data) t where t.program=t2.watches where t.program=t2.watches group by quartile group by quartile order by quartile {code} {code} 1.5 130 4.5 1500 6.0 2000 {code} was: {code} create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES ('transactional'='true') as select total_views `total_views`, sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, program from tv_view_data; {code} {code} LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], table:alias=[mv_tv_view_data_av1]) {code} > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > 
('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. > {code} > select > POSTHOOK: query: select > t.quartile, > t.quartile, > max(t.total_views) total > max(t.total_views) total > from wealth t2, > from wealth t2, > (select > (select > total_views `total_views`, > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > program > from tv_view_data) t > from tv_view_data) t > where t.program=t2.watches > where t.program=t2.watches > group by quartile > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24446: -- Affects Version/s: 4.0.0 > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code:java} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} > Some constant decimal values are not padded in the result set. > {code} > select > t.quartile, > max(t.total_views) total > from wealth t2, > (select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data) t > where t.program=t2.watches > group by quartile > order by quartile > {code} > {code} > 1.5 130 > 4.5 1500 > 6.0 2000 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24423) Improve DbNotificationListener Thread
[ https://issues.apache.org/jira/browse/HIVE-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor resolved HIVE-24423. --- Fix Version/s: 4.0.0 Resolution: Fixed HIVE-24423: Improve DbNotificationListener Thread (David Mollitor reviewed by Naveen Gangam, Miklos Gergely) Thanks [~ngangam] and [~mgergely] for the review! > Improve DbNotificationListener Thread > - > > Key: HIVE-24423 > URL: https://issues.apache.org/jira/browse/HIVE-24423 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Clean up and simplify {{DbNotificationListener}} thread class. > Most importantly, stop the thread and wait for it to finish before launching > a new thread. -- This message was sent by Atlassian Jira (v8.3.4#803005)
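The fix summarized above ("stop the thread and wait for it to finish before launching a new thread") follows a standard interrupt-then-join pattern. Below is a minimal sketch with hypothetical class and method names; the actual DbNotificationListener code differs.

```java
public class RestartableWorker {
    private Thread worker;

    // Interrupt the current worker and wait for it to terminate before
    // starting a replacement, so two workers never run concurrently.
    public synchronized void restart(Runnable task) throws InterruptedException {
        if (worker != null) {
            worker.interrupt();
            worker.join(); // block until the old thread has fully exited
        }
        worker = new Thread(task, "cleaner-thread");
        worker.setDaemon(true);
        worker.start();
    }

    // Wait for the current worker to finish (used here for demonstration).
    public synchronized void awaitTermination() throws InterruptedException {
        if (worker != null) {
            worker.join();
        }
    }
}
```

Without the `join()`, a slow old thread and its freshly launched replacement could briefly run side by side, which is exactly the overlap the issue wants to rule out.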
[jira] [Work logged] (HIVE-24423) Improve DbNotificationListener Thread
[ https://issues.apache.org/jira/browse/HIVE-24423?focusedWorklogId=517965=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517965 ] ASF GitHub Bot logged work on HIVE-24423: - Author: ASF GitHub Bot Created on: 30/Nov/20 14:30 Start Date: 30/Nov/20 14:30 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1703: URL: https://github.com/apache/hive/pull/1703 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517965) Time Spent: 1h (was: 50m) > Improve DbNotificationListener Thread > - > > Key: HIVE-24423 > URL: https://issues.apache.org/jira/browse/HIVE-24423 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Clean up and simplify {{DbNotificationListener}} thread class. > Most importantly, stop the thread and wait for it to finish before launching > a new thread. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24446) Materialized View plan alters explicit cast type in query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24446: -- Summary: Materialized View plan alters explicit cast type in query (was: Materialized View plan remove explicit cast from query) > Materialized View plan alters explicit cast type in query > - > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24446) Materialized View plan remove explicit cast from query
[ https://issues.apache.org/jira/browse/HIVE-24446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-24446: - > Materialized View plan remove explicit cast from query > -- > > Key: HIVE-24446 > URL: https://issues.apache.org/jira/browse/HIVE-24446 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code} > create materialized view mv_tv_view_data_av2 stored as orc TBLPROPERTIES > ('transactional'='true') as > select > total_views `total_views`, > sum(cast(1.5 as decimal(9,4))) over (order by total_views) as quartile, > program > from tv_view_data; > {code} > {code} > LogicalProject(quartile=[CAST($0):DECIMAL(12, 1)], total=[$1]) > HiveTableScan(table=[[arc_view, mv_tv_view_data_av1]], > table:alias=[mv_tv_view_data_av1]) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21843) UNION query with regular expressions for column name does not work
[ https://issues.apache.org/jira/browse/HIVE-21843?focusedWorklogId=517945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517945 ] ASF GitHub Bot logged work on HIVE-21843: - Author: ASF GitHub Bot Created on: 30/Nov/20 13:57 Start Date: 30/Nov/20 13:57 Worklog Time Spent: 10m Work Description: kasakrisz commented on a change in pull request #1684: URL: https://github.com/apache/hive/pull/1684#discussion_r532610692 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ## @@ -4453,7 +4452,7 @@ private boolean isAggregateInSelect(Node node, Collection aggregateFunc * Returns whether the pattern is a regex expression (instead of a normal * string). Normal string is a string with all alphabets/digits and "_". */ - boolean isRegex(String pattern, HiveConf conf) { + static boolean isRegex(String pattern, HiveConf conf) { Review comment: This function is called from `SemanticAnalyzer` and its subclasses. I haven't found any other invocations. Is it necessary to change this to `static` ? 
## File path: ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/PrivilegesTestBase.java ## @@ -35,8 +37,9 @@ public static void grantUserTable(String privStr, PrivilegeType privType, QueryState queryState, Hive db) throws Exception { +Context ctx=new Context(new HiveConf()); DDLWork work = AuthorizationTestUtil.analyze( -"GRANT " + privStr + " ON TABLE " + TABLE + " TO USER " + USER, queryState, db); +"GRANT " + privStr + " ON TABLE " + TABLE + " TO USER " + USER, queryState, db,ctx); Review comment: nit: space `queryState, db, ctx);` ## File path: ql/src/test/org/apache/hadoop/hive/ql/tool/TestLineageInfo.java ## @@ -58,7 +76,7 @@ public void testSimpleQuery() { try { lep.getLineageInfo("INSERT OVERWRITE TABLE dest1 partition (ds = '111') " + "SELECT s.* FROM srcpart TABLESAMPLE (BUCKET 1 OUT OF 1) s " - + "WHERE s.ds='2008-04-08' and s.hr='11'"); + + "WHERE s.ds='2008-04-08' and s.hr='11'",ctx); Review comment: nit: space `s.hr='11'", ctx);` ## File path: ql/src/test/org/apache/hadoop/hive/ql/tool/TestLineageInfo.java ## @@ -128,7 +136,7 @@ public void testSimpleQuery5() { LineageInfo lep = new LineageInfo(); try { lep.getLineageInfo("insert overwrite table x select a.y, b.y " - + "from a a full outer join b b on (a.x = b.y)"); + + "from a a full outer join b b on (a.x = b.y)",ctx); Review comment: `(a.x = b.y)", ctx);` ## File path: ql/src/test/org/apache/hadoop/hive/ql/tool/TestLineageInfo.java ## @@ -71,47 +89,37 @@ public void testSimpleQuery() { } @Test - public void testSimpleQuery2() { + public void testSimpleQuery2() throws Exception { LineageInfo lep = new LineageInfo(); -try { - lep.getLineageInfo("FROM (FROM src select src.key, src.value " - + "WHERE src.key < 10 UNION ALL FROM src SELECT src.* WHERE src.key > 10 ) unioninput " - + "INSERT OVERWRITE DIRECTORY '../../../../build/contrib/hive/ql/test/data/warehouse/union.out' " - + "SELECT unioninput.*"); - TreeSet i = new TreeSet(); - TreeSet o = new TreeSet(); - i.add("src"); - checkOutput(lep, 
i, o); -} catch (Exception e) { - e.printStackTrace(); - fail("Failed"); -} +lep.getLineageInfo("FROM (FROM src select src.key, src.value " ++ "WHERE src.key < 10 UNION ALL FROM src SELECT src.* WHERE src.key > 10 ) unioninput " ++ "INSERT OVERWRITE DIRECTORY '../../../../build/contrib/hive/ql/test/data/warehouse/union.out' " ++ "SELECT unioninput.*",ctx); +TreeSet i = new TreeSet(); +TreeSet o = new TreeSet(); +i.add("src"); +checkOutput(lep, i, o); } @Test - public void testSimpleQuery3() { + public void testSimpleQuery3() throws Exception { LineageInfo lep = new LineageInfo(); -try { - lep.getLineageInfo("FROM (FROM src select src.key, src.value " - + "WHERE src.key < 10 UNION ALL FROM src1 SELECT src1.* WHERE src1.key > 10 ) unioninput " - + "INSERT OVERWRITE DIRECTORY '../../../../build/contrib/hive/ql/test/data/warehouse/union.out' " - + "SELECT unioninput.*"); - TreeSet i = new TreeSet(); - TreeSet o = new TreeSet(); - i.add("src"); - i.add("src1"); - checkOutput(lep, i, o); -} catch (Exception e) { - e.printStackTrace(); - fail("Failed"); -} +lep.getLineageInfo("FROM (FROM src select src.key, src.value " ++ "WHERE src.key < 10 UNION ALL FROM src1 SELECT src1.* WHERE src1.key > 10 ) unioninput " ++ "INSERT OVERWRITE DIRECTORY '../../../../build/contrib/hive/ql/test/data/warehouse/union.out' " +
[jira] [Assigned] (HIVE-24445) Non blocking DROP table implementation
[ https://issues.apache.org/jira/browse/HIVE-24445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Chovan reassigned HIVE-24445: > Non blocking DROP table implementation > -- > > Key: HIVE-24445 > URL: https://issues.apache.org/jira/browse/HIVE-24445 > Project: Hive > Issue Type: New Feature > Components: Hive >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > > Implement a way to execute drop table operations in a way that doesn't have > to wait for currently running read operations to be finished. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-2?focusedWorklogId=517915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517915 ] ASF GitHub Bot logged work on HIVE-2: - Author: ASF GitHub Bot Created on: 30/Nov/20 13:01 Start Date: 30/Nov/20 13:01 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532579637 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,14 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +Map dirSnapshots = null; AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( -false), false); +false), false, dirSnapshots); Review comment: Oh, sorry. I was talking nonsense. Yes, we should not pass null in. I thought an empty map, but checked the code that would not get filled up either. There is a working example in AcidUtils.getAcidFilesForStats, you should create the snapshot beforehand This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517915) Time Spent: 2h 10m (was: 2h) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-2 > URL: https://issues.apache.org/jira/browse/HIVE-2 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-2?focusedWorklogId=517913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517913 ] ASF GitHub Bot logged work on HIVE-2: - Author: ASF GitHub Bot Created on: 30/Nov/20 12:59 Start Date: 30/Nov/20 12:59 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532577307 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa } fs.delete(dead, true); } -return true; +// Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this +// number reaches 0. +return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0; + } + + /** + * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning + * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to + * finish, e.g. because of an open transaction at the time of compaction. + * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are + * obsolete directories, then the Cleaner has more work to do. 
+ * @param location location of table + * @return number of dirs left for the cleaner to clean – eventually + * @throws IOException + */ + private int getNumEventuallyObsoleteDirs(String location, Map dirSnapshots) + throws IOException { +ValidTxnList validTxnList = new ValidReadTxnList(); +//save it so that getAcidState() sees it +conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString()); +ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList(); +Path locPath = new Path(location); +AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList, +Ref.from(false), false, dirSnapshots); +return dir.getObsolete().size(); Review comment: there could be deltas with higher txnId than compaction :) If we won't handle this, we might get uncleaned aborts This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517913) Time Spent: 2h (was: 1h 50m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-2 > URL: https://issues.apache.org/jira/browse/HIVE-2 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. 
When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-2?focusedWorklogId=517912=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517912 ] ASF GitHub Bot logged work on HIVE-2: - Author: ASF GitHub Bot Created on: 30/Nov/20 12:57 Start Date: 30/Nov/20 12:57 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532577307 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa } fs.delete(dead, true); } -return true; +// Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this +// number reaches 0. +return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0; + } + + /** + * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning + * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to + * finish, e.g. because of an open transaction at the time of compaction. + * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are + * obsolete directories, then the Cleaner has more work to do. 
+ * @param location location of table + * @return number of dirs left for the cleaner to clean – eventually + * @throws IOException + */ + private int getNumEventuallyObsoleteDirs(String location, Map dirSnapshots) + throws IOException { +ValidTxnList validTxnList = new ValidReadTxnList(); +//save it so that getAcidState() sees it +conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString()); +ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList(); +Path locPath = new Path(location); +AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList, +Ref.from(false), false, dirSnapshots); +return dir.getObsolete().size(); Review comment: there could be deltas with higher txnId than compaction :) If we won't handle this, we might get uncleaned aborts This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517912) Time Spent: 1h 50m (was: 1h 40m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. 
When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24426) Spark job fails with fixed LlapTaskUmbilicalServer port
[ https://issues.apache.org/jira/browse/HIVE-24426?focusedWorklogId=517910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517910 ] ASF GitHub Bot logged work on HIVE-24426: - Author: ASF GitHub Bot Created on: 30/Nov/20 12:56 Start Date: 30/Nov/20 12:56 Worklog Time Spent: 10m Work Description: ayushtkn commented on pull request #1705: URL: https://github.com/apache/hive/pull/1705#issuecomment-735769256 Thanx @prasanthj for the review, I have handled the review comments. Please have a look This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517910) Time Spent: 0.5h (was: 20m) > Spark job fails with fixed LlapTaskUmbilicalServer port > --- > > Key: HIVE-24426 > URL: https://issues.apache.org/jira/browse/HIVE-24426 > Project: Hive > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In case of cloud deployments, multiple executors are launched on name node, > and in case a fixed umbilical port is specified using > {{spark.hadoop.hive.llap.daemon.umbilical.port=30006}} > The job fails with BindException. 
> {noformat} > Caused by: java.net.BindException: Problem binding to [0.0.0.0:30006] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:840) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:741) > at org.apache.hadoop.ipc.Server.bind(Server.java:605) > at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1169) > at org.apache.hadoop.ipc.Server.<init>(Server.java:3032) > at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039) > at > org.apache.hadoop.ipc.WritableRpcEngine$Server.<init>(WritableRpcEngine.java:438) > at > org.apache.hadoop.ipc.WritableRpcEngine.getServer(WritableRpcEngine.java:332) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848) > at > org.apache.hadoop.hive.llap.tezplugins.helpers.LlapTaskUmbilicalServer.<init>(LlapTaskUmbilicalServer.java:67) > at > org.apache.hadoop.hive.llap.ext.LlapTaskUmbilicalExternalClient$SharedUmbilicalServer.<init>(LlapTaskUmbilicalExternalClient.java:122) > ... 26 more > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:433) > at sun.nio.ch.Net.bind(Net.java:425) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85) > at org.apache.hadoop.ipc.Server.bind(Server.java:588) > ... 34 more{noformat} > To counter this, better to provide a range of ports -- This message was sent by Atlassian Jira (v8.3.4#803005)
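The suggestion at the end of the description (provide a range of ports rather than a single fixed one) can be sketched as a fallback bind loop. This illustrates the idea only; the helper name is hypothetical and the actual LlapTaskUmbilicalServer change may differ.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortRangeBinder {
    // Try each port in [startPort, endPort]; return the first socket that
    // binds. A BindException ("Address already in use") moves on to the
    // next candidate instead of failing the whole job.
    public static ServerSocket bindInRange(int startPort, int endPort) throws IOException {
        for (int port = startPort; port <= endPort; port++) {
            ServerSocket socket = new ServerSocket();
            try {
                socket.bind(new InetSocketAddress(port));
                return socket;
            } catch (BindException e) {
                socket.close(); // this port is taken; try the next one
            }
        }
        throw new BindException("No free port in range " + startPort + "-" + endPort);
    }
}
```

With a range, two executors on the same host simply land on adjacent ports instead of both contending for the single configured one.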
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-2?focusedWorklogId=517908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517908 ] ASF GitHub Bot logged work on HIVE-2: - Author: ASF GitHub Bot Created on: 30/Nov/20 12:54 Start Date: 30/Nov/20 12:54 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1716: URL: https://github.com/apache/hive/pull/1716#discussion_r532575138 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java ## @@ -265,13 +267,14 @@ private static String idWatermark(CompactionInfo ci) { } /** - * @return true if any files were removed + * @return true if the cleaner has removed all files rendered obsolete by compaction */ private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci) throws IOException, NoSuchObjectException, MetaException { Path locPath = new Path(location); +Map dirSnapshots = null; AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from( -false), false); +false), false, dirSnapshots); Review comment: how exactly? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517908) Time Spent: 1h 40m (was: 1.5h) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-2 > URL: https://issues.apache.org/jira/browse/HIVE-2 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. 
This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517897&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517897 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 12:36
            Start Date: 30/Nov/20 12:36
    Worklog Time Spent: 10m

Work Description: pvargacl commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532564972

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
       }
       fs.delete(dead, true);
     }
-    return true;
+    // Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this
+    // number reaches 0.
+    return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0;
+  }
+
+  /**
+   * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning
+   * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to
+   * finish, e.g. because of an open transaction at the time of compaction.
+   * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are
+   * obsolete directories, then the Cleaner has more work to do.
+   * @param location location of table
+   * @return number of dirs left for the cleaner to clean – eventually
+   * @throws IOException
+   */
+  private int getNumEventuallyObsoleteDirs(String location, Map dirSnapshots)
+      throws IOException {
+    ValidTxnList validTxnList = new ValidReadTxnList();
+    //save it so that getAcidState() sees it
+    conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString());
+    ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList();
+    Path locPath = new Path(location);
+    AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,
+        Ref.from(false), false, dirSnapshots);
+    return dir.getObsolete().size();

Review comment: You should not: there can be aborts with a higher txnId than the compaction. This check would see those, but the cleaner will never see them, so it would never finish its job.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 517897)
    Time Spent: 1.5h  (was: 1h 20m)

> compactor.Cleaner should not set state "mark cleaned" if there are obsolete
> files in the FS
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-24444
>                 URL: https://issues.apache.org/jira/browse/HIVE-24444
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This is an improvement on HIVE-24314, in which markCleaned() is called only
> if +any+ files are deleted by the cleaner. This could cause a problem in the
> following case:
> Say for table_1 compaction1 cleaning was blocked by an open txn, and
> compaction is run again on the same table (compaction2). Both compaction1 and
> compaction2 could be in "ready for cleaning" at the same time. By this time
> the blocking open txn could be committed. When the cleaner runs, one of
> compaction1 and compaction2 will remain in the "ready for cleaning" state:
> Say compaction2 is picked up by the cleaner first. The Cleaner deletes all
> obsolete files. Then compaction1 is picked up by the cleaner; the cleaner
> doesn't remove any files and compaction1 will stay in the queue in a "ready
> for cleaning" state.
> HIVE-24291 already solves this issue but if it isn't usable (for example if
> HMS schema changes are out of the question) then HIVE-24314 + this change will
> fix the issue of the Cleaner not removing all obsolete files.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
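The exit condition under review replaces the Cleaner's unconditional `return true` with "only markCleaned() once a second getAcidState() pass, run as if every transaction were committed, reports zero obsolete base/delta directories." Below is a minimal, self-contained Java sketch of that condition. It is not Hive code: the class and method names are hypothetical, and write-id lists stand in for the real directory snapshot logic.

```java
import java.util.List;

// Hypothetical simplification of the Cleaner's new exit condition: a delta
// directory is "eventually obsolete" if its max write id is covered by the
// base produced by compaction, even if an open txn blocked deleting it now.
public class CleanerSketch {

    // Count deltas that compaction (which wrote base_<baseWriteId>) has
    // rendered obsolete and that therefore still need cleaning.
    static long countEventuallyObsolete(long baseWriteId, List<Long> deltaMaxWriteIds) {
        return deltaMaxWriteIds.stream().filter(w -> w <= baseWriteId).count();
    }

    // markCleaned() is only allowed once nothing obsolete remains on the FS.
    static boolean canMarkCleaned(long baseWriteId, List<Long> deltaMaxWriteIds) {
        return countEventuallyObsolete(baseWriteId, deltaMaxWriteIds) == 0;
    }

    public static void main(String[] args) {
        // base_5 exists, but delta with max write id 3 could not be deleted
        // (e.g. it was held open at compaction time): keep the queue entry.
        System.out.println(canMarkCleaned(5, List.of(3L)));
        // Only deltas newer than the base remain: safe to mark cleaned.
        System.out.println(canMarkCleaned(5, List.of(7L)));
    }
}
```

As pvargacl notes in the thread, the real check has a subtlety this sketch ignores: aborted transactions with a txnId higher than the compaction would also show up as eventually obsolete, yet this Cleaner run can never remove them.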
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517891&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517891 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 12:34
            Start Date: 30/Nov/20 12:34
    Worklog Time Spent: 10m

Work Description: pvargacl commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532563919

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -265,13 +267,14 @@ private static String idWatermark(CompactionInfo ci) {
   }

   /**
-   * @return true if any files were removed
+   * @return true if the cleaner has removed all files rendered obsolete by compaction
    */
   private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
       throws IOException, NoSuchObjectException, MetaException {
     Path locPath = new Path(location);
+    Map dirSnapshots = null;
     AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from(
-        false), false);
+        false), false, dirSnapshots);

Review comment: This will fill up the snapshot, so it can be used later.

Issue Time Tracking
-------------------
    Worklog Id: (was: 517891)
    Time Spent: 1h 10m  (was: 1h)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517892&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517892 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 12:34
            Start Date: 30/Nov/20 12:34
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532556849

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -265,13 +267,14 @@ private static String idWatermark(CompactionInfo ci) {
   }

   /**
-   * @return true if any files were removed
+   * @return true if the cleaner has removed all files rendered obsolete by compaction
    */
   private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
       throws IOException, NoSuchObjectException, MetaException {
     Path locPath = new Path(location);
+    Map dirSnapshots = null;
     AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from(
-        false), false);
+        false), false, dirSnapshots);

Review comment: What's your expectation here? dirSnapshots would always be null. I think what you wanted to do is:
`dirSnapshots = getHdfsDirSnapshots(locPath.getFileSystem(conf), locPath)`

Issue Time Tracking
-------------------
    Worklog Id: (was: 517892)
    Time Spent: 1h 20m  (was: 1h 10m)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517886&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517886 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 12:21
            Start Date: 30/Nov/20 12:21
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532556849

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -265,13 +267,14 @@ private static String idWatermark(CompactionInfo ci) {
   }

   /**
-   * @return true if any files were removed
+   * @return true if the cleaner has removed all files rendered obsolete by compaction
    */
   private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
       throws IOException, NoSuchObjectException, MetaException {
     Path locPath = new Path(location);
+    Map dirSnapshots = null;
     AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from(
-        false), false);
+        false), false, dirSnapshots);

Review comment: What's your expectation here? dirSnapshots would always be null.

Issue Time Tracking
-------------------
    Worklog Id: (was: 517886)
    Time Spent: 1h  (was: 50m)
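The reviewer's objection above (`dirSnapshots` would always be null) comes down to Java's parameter-passing semantics: references are passed by value, so a callee can mutate a map the caller created, but it can never replace the caller's `null` variable with a new map. A short, self-contained illustration (names are hypothetical, not Hive API):

```java
import java.util.HashMap;
import java.util.Map;

// Demonstrates why passing `null` into a method that is supposed to "fill up"
// a snapshot map cannot work: only an object the caller already created can
// be mutated; reassigning the parameter is invisible to the caller.
public class NullSnapshotSketch {
    static void fill(Map<String, String> snapshots) {
        if (snapshots != null) {
            snapshots.put("/warehouse/t", "base_5"); // mutates the caller's map
        }
        snapshots = new HashMap<>(); // reassignment: the caller never sees this
    }

    public static void main(String[] args) {
        Map<String, String> nullMap = null;
        fill(nullMap);
        System.out.println(nullMap); // still null

        Map<String, String> emptyMap = new HashMap<>();
        fill(emptyMap);
        System.out.println(emptyMap.size()); // 1: the entry survived the call
    }
}
```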
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517882 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 12:12
            Start Date: 30/Nov/20 12:12
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532552240

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -316,6 +314,30 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
       }
       fs.delete(dead, true);
     }
-    return true;
+    // Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this
+    // number reaches 0.
+    return getNumEventuallyObsoleteDirs(location, dirSnapshots) == 0;
+  }
+
+  /**
+   * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning
+   * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to
+   * finish, e.g. because of an open transaction at the time of compaction.
+   * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are
+   * obsolete directories, then the Cleaner has more work to do.
+   * @param location location of table
+   * @return number of dirs left for the cleaner to clean – eventually
+   * @throws IOException
+   */
+  private int getNumEventuallyObsoleteDirs(String location, Map dirSnapshots)
+      throws IOException {
+    ValidTxnList validTxnList = new ValidReadTxnList();
+    //save it so that getAcidState() sees it
+    conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString());
+    ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList();
+    Path locPath = new Path(location);
+    AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,
+        Ref.from(false), false, dirSnapshots);
+    return dir.getObsolete().size();

Review comment: should we consider aborts as well?

Issue Time Tracking
-------------------
    Worklog Id: (was: 517882)
    Time Spent: 50m  (was: 40m)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517827 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 10:19
            Start Date: 30/Nov/20 10:19
    Worklog Time Spent: 10m

Work Description: klcopp commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532486047

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -316,6 +312,29 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
       }
       fs.delete(dead, true);
     }
-    return true;
+    // Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this
+    // number reaches 0.
+    return getNumEventuallyObsoleteDirs(location) == 0;
+  }
+
+  /**
+   * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning
+   * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to
+   * finish, e.g. because of an open transaction at the time of compaction.
+   * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are
+   * obsolete directories, then the Cleaner has more work to do.
+   * @param location location of table
+   * @return number of dirs left for the cleaner to clean – eventually
+   * @throws IOException
+   */
+  private int getNumEventuallyObsoleteDirs(String location) throws IOException {
+    ValidTxnList validTxnList = new ValidReadTxnList();
+    //save it so that getAcidState() sees it
+    conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString());
+    ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList();
+    Path locPath = new Path(location);
+    AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,

Review comment: Good idea, thanks!

Issue Time Tracking
-------------------
    Worklog Id: (was: 517827)
    Time Spent: 40m  (was: 0.5h)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517825 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 10:13
            Start Date: 30/Nov/20 10:13
    Worklog Time Spent: 10m

Work Description: pvargacl commented on a change in pull request #1716:
URL: https://github.com/apache/hive/pull/1716#discussion_r532482152

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
## @@ -316,6 +312,29 @@ private boolean removeFiles(String location, ValidWriteIdList writeIdList, Compa
       }
       fs.delete(dead, true);
     }
-    return true;
+    // Check if there will be more obsolete directories to clean when possible. We will only mark cleaned when this
+    // number reaches 0.
+    return getNumEventuallyObsoleteDirs(location) == 0;
+  }
+
+  /**
+   * Get the number of base/delta directories the Cleaner should remove eventually. If we check this after cleaning
+   * we can see if the Cleaner has further work to do in this table/partition directory that it hasn't been able to
+   * finish, e.g. because of an open transaction at the time of compaction.
+   * We do this by assuming that there are no open transactions anywhere and then calling getAcidState. If there are
+   * obsolete directories, then the Cleaner has more work to do.
+   * @param location location of table
+   * @return number of dirs left for the cleaner to clean – eventually
+   * @throws IOException
+   */
+  private int getNumEventuallyObsoleteDirs(String location) throws IOException {
+    ValidTxnList validTxnList = new ValidReadTxnList();
+    //save it so that getAcidState() sees it
+    conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString());
+    ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList();
+    Path locPath = new Path(location);
+    AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,

Review comment: You could pass an empty dirSnapshot to the first getAcidState in removeFiles. That will fill up the snapshot, and you can reuse it here, so it won't make a second listing on the FS.

Issue Time Tracking
-------------------
    Worklog Id: (was: 517825)
    Time Spent: 0.5h  (was: 20m)
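pvargacl's suggestion above is a standard caching pattern: pass a pre-created, empty snapshot map into the first directory scan so it gets populated there, then hand the same map to the second scan, which reuses it instead of listing the filesystem again. A self-contained sketch of the pattern (stand-in names; the real code uses `AcidUtils.getAcidState` and HDFS listings):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of snapshot reuse: the first call per location performs the
// "expensive listing" and caches the result in the caller-supplied map;
// later calls for the same location hit the cache.
public class SnapshotReuseSketch {

    // Stand-in for a directory-state scan that accepts a snapshot cache.
    static List<String> getState(Map<String, List<String>> snapshots, String location) {
        return snapshots.computeIfAbsent(location, loc -> {
            // A real implementation would list the filesystem here.
            return List.of("base_5", "delta_3_3");
        });
    }

    public static void main(String[] args) {
        Map<String, List<String>> dirSnapshots = new HashMap<>(); // empty, not null
        List<String> first = getState(dirSnapshots, "/warehouse/t");  // fills the snapshot
        List<String> second = getState(dirSnapshots, "/warehouse/t"); // reuses it
        System.out.println(first == second); // same cached object, no second listing
    }
}
```

The map must be empty rather than null: an empty map can be filled in place by the callee, while a null reference cannot (which is exactly the earlier review objection on this PR).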
[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=517823&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517823 ]

ASF GitHub Bot logged work on HIVE-24433:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 09:57
            Start Date: 30/Nov/20 09:57
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1712:
URL: https://github.com/apache/hive/pull/1712#discussion_r532463117

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -2725,7 +2725,7 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn
       }
       String dbName = normalizeCase(lc.getDbname());
       String tblName = normalizeCase(lc.getTablename());
-      String partName = normalizeCase(lc.getPartitionname());
+      String partName = lc.getPartitionname();

Review comment: @nareshpr, do you know if the partition key name (`name`=value) is already normalized here?

Issue Time Tracking
-------------------
    Worklog Id: (was: 517823)
    Time Spent: 1h  (was: 50m)

> AutoCompaction is not getting triggered for CamelCase Partition Values
> ----------------------------------------------------------------------
>
>                 Key: HIVE-24433
>                 URL: https://issues.apache.org/jira/browse/HIVE-24433
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> PartitionKeyValue is getting converted into lowerCase in the below 2 places:
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]
> Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries
> with the proper partition values.
> When the query completes, the entry moves from TXN_COMPONENTS to
> COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the
> partition & considers it an invalid partition.
> {code:java}
> create table abc(name string) partitioned by(city string) stored as orc
> tblproperties('transactional'='true');
> insert into abc partition(city='Bangalore') values('aaa');
> {code}
> Example entry in COMPLETED_TXN_COMPONENTS:
> {noformat}
> +------------+---------------+------------+-----------------+----------------------+--------------+--------------------+
> | CTC_TXNID  | CTC_DATABASE  | CTC_TABLE  | CTC_PARTITION   | CTC_TIMESTAMP        | CTC_WRITEID  | CTC_UPDATE_DELETE  |
> +------------+---------------+------------+-----------------+----------------------+--------------+--------------------+
> | 2          | default       | abc        | city=bangalore  | 2020-11-25 09:26:59  | 1            | N                  |
> +------------+---------------+------------+-----------------+----------------------+--------------+--------------------+
> {noformat}
>
> AutoCompaction fails to get triggered, with the below error:
> {code:java}
> 2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator
> (Initiator.java:run(98)) - Checking to see if we should compact
> default.abc.city=bangalore
> 2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator
> (Initiator.java:run(155)) - Can't find partition
> default.compaction_test.city=bangalore, assuming it has been dropped and
> moving on{code}
> I verified the below 4 SQLs with my PR; they all produced the correct
> PartitionKeyValue,
> i.e. COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore":
> {code:java}
> insert into table abc PARTITION(CitY='Bangalore') values('Dan');
> insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
> update table abc set Name='xy' where CiTy='Bangalore';
> delete from abc where CiTy='Bangalore';{code}
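The HIVE-24433 discussion above is about what part of a `key=value` partition spec may be case-normalized: the column name is case-insensitive in Hive, but the partition value is data and must keep its original case, or the Initiator later looks up `city=bangalore` and finds nothing. A hypothetical helper (not the actual `TxnHandler` code) sketching that split:

```java
// Hypothetical illustration of key-only normalization for a partition spec
// of the form key=value: lowercase the column name, preserve the value.
public class PartNameSketch {
    static String normalizeKeyOnly(String partName) {
        if (partName == null) {
            return null;
        }
        int eq = partName.indexOf('=');
        if (eq < 0) {
            return partName; // not a key=value spec; leave untouched
        }
        // substring(eq) keeps the '=' and the original-case value.
        return partName.substring(0, eq).toLowerCase() + partName.substring(eq);
    }

    public static void main(String[] args) {
        System.out.println(normalizeKeyOnly("CiTy=Bangalore")); // city=Bangalore
    }
}
```

With the old `normalizeCase(lc.getPartitionname())` behavior, the whole spec was lowercased to `city=bangalore`, which is exactly the mismatched `CTC_PARTITION` value shown in the table above.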
[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=517817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517817 ]

ASF GitHub Bot logged work on HIVE-24433:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 30/Nov/20 09:44
            Start Date: 30/Nov/20 09:44
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1712:
URL: https://github.com/apache/hive/pull/1712#discussion_r532463117

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -2725,7 +2725,7 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn
       }
       String dbName = normalizeCase(lc.getDbname());
       String tblName = normalizeCase(lc.getTablename());
-      String partName = normalizeCase(lc.getPartitionname());
+      String partName = lc.getPartitionname();

Review comment: @nareshpr, do you know if the partition name part (=value) is already normalized here?

Issue Time Tracking
-------------------
    Worklog Id: (was: 517817)
    Time Spent: 40m  (was: 0.5h)
[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values
[ https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=517818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517818 ] ASF GitHub Bot logged work on HIVE-24433: - Author: ASF GitHub Bot Created on: 30/Nov/20 09:44 Start Date: 30/Nov/20 09:44 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1712: URL: https://github.com/apache/hive/pull/1712#discussion_r532463117 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -2725,7 +2725,7 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn } String dbName = normalizeCase(lc.getDbname()); String tblName = normalizeCase(lc.getTablename()); - String partName = normalizeCase(lc.getPartitionname()); + String partName = lc.getPartitionname(); Review comment: @nareshpr , do you know if partition name part (`name`=value) is already normalized here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517818) Time Spent: 50m (was: 40m) > AutoCompaction is not getting triggered for CamelCase Partition Values > -- > > Key: HIVE-24433 > URL: https://issues.apache.org/jira/browse/HIVE-24433 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > PartionKeyValue is getting converted into lowerCase in below 2 places. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728] > [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851] > Because of which TXN_COMPONENTS & HIVE_LOCKS tables are not having entries > from proper partition values. > When query completes, the entry moves from TXN_COMPONENTS to > COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the > partition & considers it as invalid partition > {code:java} > create table abc(name string) partitioned by(city string) stored as orc > tblproperties('transactional'='true'); > insert into abc partition(city='Bangalore') values('aaa'); > {code} > Example entry in COMPLETED_TXN_COMPONENTS > {noformat} > +---+--++---+-+-+---+ > | CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION | > CTC_TIMESTAMP | CTC_WRITEID | CTC_UPDATE_DELETE | > +---+--++---+-+-+---+ > | 2 | default | abc | city=bangalore | 2020-11-25 09:26:59 > | 1 | N | > +---+--++---+-+-+---+ > {noformat} > > AutoCompaction fails to get triggered with below error > {code:java} > 2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(98)) - Checking to see if we should compact > default.abc.city=bangalore > 2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator > (Initiator.java:run(155)) - Can't find partition > default.compaction_test.city=bangalore, assuming it has been dropped and > moving on{code} > I verifed below 4 SQL's with my PR, those all produced correct > PartitionKeyValue > i.e, COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore" > {code:java} > insert into table abc PARTITION(CitY='Bangalore') values('Dan'); > insert overwrite table abc partition(CiTy='Bangalore') select Name from abc; > update table abc set Name='xy' where CiTy='Bangalore'; > delete from abc where CiTy='Bangalore';{code} -- This message 
was sent by Atlassian Jira (v8.3.4#803005)
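[Editorial note] The case-handling behavior in the report above can be illustrated with a small standalone sketch. This is not Hive's actual TxnHandler code; `buildPartitionName` and its behavior are assumptions for illustration only: partition column names are treated case-insensitively (normalized to lower case), while partition values keep exactly the case the user wrote.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch (not Hive's TxnHandler): build a CTC_PARTITION-style
// "key=value" name so that the column name is normalized but the value keeps
// the user's case, e.g. PARTITION(CitY='Bangalore') -> "city=Bangalore".
public class PartitionNameSketch {
  static String buildPartitionName(Map<String, String> spec) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : spec.entrySet()) {
      if (sb.length() > 0) {
        sb.append('/'); // multiple partition columns are joined with '/'
      }
      // Column names are case-insensitive in Hive: normalize them.
      sb.append(e.getKey().toLowerCase(Locale.ROOT));
      // Partition values are case-sensitive: preserve them untouched.
      sb.append('=').append(e.getValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, String> spec = new LinkedHashMap<>();
    spec.put("CitY", "Bangalore");
    System.out.println(buildPartitionName(spec)); // city=Bangalore
  }
}
```

The bug report boils down to the value side of this pair being lowercased along with the key, so the Initiator later looks up a partition name that does not exist.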
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517811 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 09:12 Start Date: 30/Nov/20 09:12 Worklog Time Spent: 10m Work Description: klcopp commented on pull request #1716: URL: https://github.com/apache/hive/pull/1716#issuecomment-735657907 @pvargacl would you mind taking a look too? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517811) Time Spent: 20m (was: 10m) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. 
> HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
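[Editorial note] The scenario above can be modeled with a toy sketch (hypothetical names only; the real Cleaner works against HDFS and the compaction queue): a delete pass that skips files still pinned by an open transaction, followed by the proposed check that an entry may be marked cleaned only when no obsolete files remain.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the proposed Cleaner rule (hypothetical, not Hive code):
// return true only if, after the delete pass, no obsolete files remain,
// i.e. the compaction entry may leave the "ready for cleaning" state.
public class CleanerRuleSketch {
  static boolean cleanAndMaybeMarkCleaned(Set<String> fs, Set<String> obsolete,
                                          Set<String> pinnedByOpenTxn) {
    for (String f : new HashSet<>(obsolete)) {
      if (fs.contains(f) && !pinnedByOpenTxn.contains(f)) {
        fs.remove(f); // delete pass: remove what we are allowed to remove
      }
    }
    // Proposed rule: mark cleaned only if nothing obsolete survived.
    for (String f : obsolete) {
      if (fs.contains(f)) {
        return false; // stay in "ready for cleaning"; retry next cycle
      }
    }
    return true;
  }
}
```

Under the HIVE-24314 rule ("mark cleaned if any file was deleted"), the first pass would retire the entry even though a pinned file survived; with the check shown, the entry stays queued until a later cycle can finish the job.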
[jira] [Updated] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24444: -- Labels: pull-request-available (was: ) > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517809 ] ASF GitHub Bot logged work on HIVE-24444: - Author: ASF GitHub Bot Created on: 30/Nov/20 09:09 Start Date: 30/Nov/20 09:09 Worklog Time Spent: 10m Work Description: klcopp opened a new pull request #1716: URL: https://github.com/apache/hive/pull/1716 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? See HIVE-24444 ### How was this patch tested? Unit test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517809) Remaining Estimate: 0h Time Spent: 10m > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. 
Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
[ https://issues.apache.org/jira/browse/HIVE-24444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage reassigned HIVE-24444: > compactor.Cleaner should not set state "mark cleaned" if there are obsolete > files in the FS > --- > > Key: HIVE-24444 > URL: https://issues.apache.org/jira/browse/HIVE-24444 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > > This is an improvement on HIVE-24314, in which markCleaned() is called only > if +any+ files are deleted by the cleaner. This could cause a problem in the > following case: > Say for table_1 compaction1 cleaning was blocked by an open txn, and > compaction is run again on the same table (compaction2). Both compaction1 and > compaction2 could be in "ready for cleaning" at the same time. By this time > the blocking open txn could be committed. When the cleaner runs, one of > compaction1 and compaction2 will remain in the "ready for cleaning" state: > Say compaction2 is picked up by the cleaner first. The Cleaner deletes all > obsolete files. Then compaction1 is picked up by the cleaner; the cleaner > doesn't remove any files and compaction1 will stay in the queue in a "ready > for cleaning" state. > HIVE-24291 already solves this issue but if it isn't usable (for example if > HMS schema changes are out of the question) then HIVE-24314 + this change will > fix the issue of the Cleaner not removing all obsolete files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24423) Improve DbNotificationListener Thread
[ https://issues.apache.org/jira/browse/HIVE-24423?focusedWorklogId=517797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517797 ] ASF GitHub Bot logged work on HIVE-24423: - Author: ASF GitHub Bot Created on: 30/Nov/20 08:39 Start Date: 30/Nov/20 08:39 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #1703: URL: https://github.com/apache/hive/pull/1703#discussion_r532424432

## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
## @@ -1242,64 +1244,50 @@ private void process(NotificationEvent event, ListenerEvent listenerEvent) throw

   private static class CleanerThread extends Thread {
-    private RawStore rs;
+    private final RawStore rs;
     private int ttl;
-    private boolean shouldRun = true;
     private long sleepTime;

     CleanerThread(Configuration conf, RawStore rs) {
       super("DB-Notification-Cleaner");
-      this.rs = rs;
-      boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED);
-      if (isReplEnabled) {
-        setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.REPL_EVENT_DB_LISTENER_TTL, TimeUnit.SECONDS));
-      } else {
-        setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_TTL, TimeUnit.SECONDS));
-      }
-      setCleanupInterval(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS));
       setDaemon(true);
+      this.rs = Objects.requireNonNull(rs);
+
+      boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED);
+      ConfVars ttlConf = (isReplEnabled) ? ConfVars.REPL_EVENT_DB_LISTENER_TTL : ConfVars.EVENT_DB_LISTENER_TTL;
+      setTimeToLive(MetastoreConf.getTimeVar(conf, ttlConf, TimeUnit.SECONDS));
+      setCleanupInterval(
+          MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS));
     }

     @Override
     public void run() {
-      while (shouldRun) {
+      while (true) {
+        LOG.debug("Cleaner thread running");
         try {
           rs.cleanNotificationEvents(ttl);
           rs.cleanWriteNotificationEvents(ttl);
         } catch (Exception ex) {
-          // catching exceptions here makes sure that the thread doesn't die in case of unexpected
-          // exceptions
-          LOG.warn("Exception received while cleaning notifications: ", ex);
+          LOG.warn("Exception received while cleaning notifications", ex);

Review comment:
   No, go ahead and merge it

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 517797) Time Spent: 50m (was: 40m) > Improve DbNotificationListener Thread > - > > Key: HIVE-24423 > URL: https://issues.apache.org/jira/browse/HIVE-24423 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Clean up and simplify {{DbNotificationListener}} thread class. > Most importantly, stop the thread and wait for it to finish before launching > a new thread. -- This message was sent by Atlassian Jira (v8.3.4#803005)
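[Editorial note] The pattern in the diff above, a daemon thread whose loop survives unexpected exceptions, can be shown as a minimal standalone sketch. This is a hypothetical class, not Hive's: the real thread calls RawStore cleanup methods and reads its TTL and interval from MetastoreConf.

```java
import java.util.Objects;

// Minimal sketch of a resilient daemon cleaner loop (not Hive's class).
public class CleanerLoopSketch extends Thread {
  private final Runnable cleanup; // stands in for the RawStore cleanup calls
  private final long sleepMillis;

  CleanerLoopSketch(Runnable cleanup, long sleepMillis) {
    super("DB-Notification-Cleaner");
    setDaemon(true); // do not keep the JVM alive just for cleanup
    this.cleanup = Objects.requireNonNull(cleanup);
    this.sleepMillis = sleepMillis;
  }

  @Override
  public void run() {
    while (true) {
      try {
        cleanup.run();
      } catch (Exception ex) {
        // Swallow so one bad pass does not kill the thread;
        // the real code logs this at WARN.
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException ie) {
        return; // interruption serves as the shutdown signal here
      }
    }
  }
}
```

This is also why the `shouldRun` flag could be dropped in the diff: with `catch (Exception)` inside the loop and interruption as the exit path, `while (true)` is equivalent and simpler.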
[jira] [Issue Comment Deleted] (HIVE-24437) Add more removed configs for(Don't fail config validation for removed configs)
[ https://issues.apache.org/jira/browse/HIVE-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JiangZhu updated HIVE-24437: Comment: was deleted (was: Need code review.) > Add more removed configs for(Don't fail config validation for removed configs) > -- > > Key: HIVE-24437 > URL: https://issues.apache.org/jira/browse/HIVE-24437 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.7 >Reporter: JiangZhu >Assignee: JiangZhu >Priority: Major > Attachments: HIVE-24437.1.patch > > > Add more removed configs for(HIVE-14132 Don't fail config validation for > removed configs) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24426) Spark job fails with fixed LlapTaskUmbilicalServer port
[ https://issues.apache.org/jira/browse/HIVE-24426?focusedWorklogId=517785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517785 ] ASF GitHub Bot logged work on HIVE-24426: - Author: ASF GitHub Bot Created on: 30/Nov/20 08:15 Start Date: 30/Nov/20 08:15 Worklog Time Spent: 10m Work Description: prasanthj commented on a change in pull request #1705: URL: https://github.com/apache/hive/pull/1705#discussion_r532410351

## File path: llap-client/src/java/org/apache/hadoop/hive/llap/tezplugins/helpers/LlapTaskUmbilicalServer.java
## @@ -54,27 +56,54 @@

   public LlapTaskUmbilicalServer(Configuration conf, LlapTaskUmbilicalProtocol umbilical, int numHandlers) throws IOException {
     jobTokenSecretManager = new JobTokenSecretManager();
-    int umbilicalPort = HiveConf.getIntVar(conf, HiveConf.ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT);
-    if (umbilicalPort <= 0) {
-      umbilicalPort = 0;
+
+    String[] portRange =
+        conf.get(HiveConf.ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT.varname)
+            .split("-");
+
+    int minPort = Integer.parseInt(portRange[0]);
+    boolean portFound = false;
+    IOException e = null;
+    if (portRange.length == 1) {
+      // Single port specified, not Range.
+      startServer(conf, umbilical, numHandlers, minPort);
+      portFound = true;
+    } else {
+      int maxPort = Integer.parseInt(portRange[1]);
+      for (int i = minPort; i < maxPort; i++) {
+        try {
+          startServer(conf, umbilical, numHandlers, i);
+          portFound = true;
+          break;
+        } catch (BindException be) {
+          // Ignore and move ahead, in search of a free port.

Review comment:
   Log at warn level to say which port is being tried and what error message received, for debugging.

## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java
## @@ -257,23 +259,32 @@ protected void startRpcServer() {

     int numHandlers = HiveConf.getIntVar(conf, ConfVars.LLAP_TASK_COMMUNICATOR_LISTENER_THREAD_COUNT);
-    int umbilicalPort = HiveConf.getIntVar(conf, ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT);
-    if (umbilicalPort <= 0) {
-      umbilicalPort = 0;
+    String[] portRange =
+        conf.get(HiveConf.ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT.varname)
+            .split("-");
+    boolean portFound = false;
+    IOException ioe = null;
+    int minPort = Integer.parseInt(portRange[0]);
+    if (portRange.length == 1) {
+      // Single port specified, not range.
+      startServerInternal(conf, minPort, numHandlers, jobTokenSecretManager);
+      portFound = true;
+    } else {
+      int maxPort = Integer.parseInt(portRange[1]);
+      for (int i = minPort; i < maxPort; i++) {
+        try {
+          startServerInternal(conf, i, numHandlers, jobTokenSecretManager);
+          portFound = true;
+          break;
+        } catch (BindException be) {
+          // Ignore and move ahead, in search of a free port.

Review comment:
   same here for logging

## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java
## @@ -283,6 +294,23 @@ protected void startRpcServer() {

+  private void startServerInternal(Configuration conf, int umbilicalPort,
+      int numHandlers, JobTokenSecretManager jobTokenSecretManager)
+      throws IOException {
+    server = new RPC.Builder(conf).setProtocol(LlapTaskUmbilicalProtocol.class)
+        .setBindAddress("0.0.0.0").setPort(umbilicalPort).setInstance(umbilical)
+        .setNumHandlers(numHandlers).setSecretManager(jobTokenSecretManager)
+        .build();
+
+    if (conf

Review comment:
   nit: same here to move this to private variable.

## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java
## @@ -257,23 +259,32 @@ protected void startRpcServer() {

     int numHandlers = HiveConf.getIntVar(conf, ConfVars.LLAP_TASK_COMMUNICATOR_LISTENER_THREAD_COUNT);
-    int umbilicalPort = HiveConf.getIntVar(conf, ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT);
-    if (umbilicalPort <= 0) {
-      umbilicalPort = 0;
+    String[] portRange =
+        conf.get(HiveConf.ConfVars.LLAP_TASK_UMBILICAL_SERVER_PORT.varname)
+            .split("-");
+    boolean portFound = false;
+    IOException ioe = null;
+    int minPort = Integer.parseInt(portRange[0]);
+    if (portRange.length == 1) {
+      // Single port specified, not range.
+      startServerInternal(conf, minPort, numHandlers, jobTokenSecretManager);
+      portFound = true;
+    } else {
+      int maxPort = Integer.parseInt(portRange[1]);

Review comment:
   same here (use RangeValidator)

## File path:
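[Editorial note] The port-selection logic under review can be sketched as a standalone fallback loop. The helper names here are hypothetical, and the real code parses the hive.llap.daemon.umbilical.port setting from HiveConf and builds a Hadoop RPC server rather than a plain ServerSocket; this sketch just shows the "single port or range, try until one binds" shape.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical sketch of "single port or port range" binding with fallback.
public class PortRangeSketch {
  // "30006" -> {30006, 30006}; "30006-30016" -> {30006, 30016}
  static int[] parseRange(String value) {
    String[] parts = value.split("-");
    int min = Integer.parseInt(parts[0]);
    int max = (parts.length == 1) ? min : Integer.parseInt(parts[1]);
    return new int[] {min, max};
  }

  // Try each port in [min, max]; return the first socket that binds.
  static ServerSocket bindInRange(int min, int max) throws IOException {
    IOException last = null;
    for (int p = min; p <= max; p++) {
      try {
        return new ServerSocket(p);
      } catch (BindException be) {
        // Per the review comment: a real implementation should WARN here
        // with the port being tried and the bind error, for debugging.
        last = be;
      }
    }
    throw (last != null) ? last : new BindException("empty range " + min + "-" + max);
  }
}
```

Note that the PR's loop uses an exclusive upper bound (`i < maxPort`); the sketch binds inclusively, which is an illustrative design choice, not a claim about the final patch.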
[jira] [Commented] (HIVE-24437) Add more removed configs for(Don't fail config validation for removed configs)
[ https://issues.apache.org/jira/browse/HIVE-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240548#comment-17240548 ] JiangZhu commented on HIVE-24437: - Need code review. > Add more removed configs for(Don't fail config validation for removed configs) > -- > > Key: HIVE-24437 > URL: https://issues.apache.org/jira/browse/HIVE-24437 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.7 >Reporter: JiangZhu >Assignee: JiangZhu >Priority: Major > Attachments: HIVE-24437.1.patch > > > Add more removed configs for(HIVE-14132 Don't fail config validation for > removed configs) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24437) Add more removed configs for(Don't fail config validation for removed configs)
[ https://issues.apache.org/jira/browse/HIVE-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240547#comment-17240547 ] JiangZhu commented on HIVE-24437: - Need code review. > Add more removed configs for(Don't fail config validation for removed configs) > -- > > Key: HIVE-24437 > URL: https://issues.apache.org/jira/browse/HIVE-24437 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.7 >Reporter: JiangZhu >Assignee: JiangZhu >Priority: Major > Attachments: HIVE-24437.1.patch > > > Add more removed configs for(HIVE-14132 Don't fail config validation for > removed configs) -- This message was sent by Atlassian Jira (v8.3.4#803005)