Re: [EXTERNAL] Hive meetup
+1

> On 22-Feb-2021, at 11:54 PM, Matt McCline wrote:
>
> Definitely interested.
>
> -----Original Message-----
> From: Zoltan Haindrich
> Sent: Monday, February 22, 2021 10:17 AM
> To: dev@hive.apache.org
> Subject: [EXTERNAL] Hive meetup
>
> Hey All!
>
> It was quite some time ago when we had a meetup - and in these covid times it
> would be online-only anyway :) We were mentioning this lately here and there
> at Cloudera.
> I think we could have a few talks spanning 2-3 hours or so.
>
> Is there any interest in it?
>
> I would be happy to talk about how hive-test-kube works and how hive-dev-box
> is employed during testing.
>
> cheers,
> Zoltan
[jira] [Created] (HIVE-24811) Discover Other Areas that Can Benefit from Cached Dates
David Mollitor created HIVE-24811:

Summary: Discover Other Areas that Can Benefit from Cached Dates
Key: HIVE-24811
URL: https://issues.apache.org/jira/browse/HIVE-24811
Project: Hive
Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor

Based on my work on [HIVE-24808], I noticed other places that call {{Date#valueOf}} that can probably also benefit from using this cache mechanism.

Locate those places and change calls to this.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
RE: [EXTERNAL] Hive meetup
Definitely interested.

-----Original Message-----
From: Zoltan Haindrich
Sent: Monday, February 22, 2021 10:17 AM
To: dev@hive.apache.org
Subject: [EXTERNAL] Hive meetup

Hey All!

It was quite some time ago when we had a meetup - and in these covid times it would be online-only anyway :) We were mentioning this lately here and there at Cloudera.
I think we could have a few talks spanning 2-3 hours or so.

Is there any interest in it?

I would be happy to talk about how hive-test-kube works and how hive-dev-box is employed during testing.

cheers,
Zoltan
Hive meetup
Hey All!

It was quite some time ago when we had a meetup - and in these covid times it would be online-only anyway :) We were mentioning this lately here and there at Cloudera.
I think we could have a few talks spanning 2-3 hours or so.

Is there any interest in it?

I would be happy to talk about how hive-test-kube works and how hive-dev-box is employed during testing.

cheers,
Zoltan
Re: Any plan for new hive 3 or 4 release?
Hey Michel!

Yes, it was a long time ago that we had a release; we have quite a few new features in master.
I think we have been scaring people for some time now that we will be dropping MR support... I think we should do that.

I would really like to see a new Hive release in the near future as well - right now there is no way for users to even try out new features.
I was planning to add nightly builds to package the latest master's state into a deployable artifact - I think a service like that may help pretest our next release; I think it won't take much to do it, so I'll probably throw it together in the next couple of days!

cheers,
Zoltan

On 2/21/21 2:27 PM, Michel Sumbul wrote:
> Hi Guys,
>
> If I'm not wrong, the last release of Hive 3.x is 18 months old.
> I wanted to ask if you had any roadmap / plan to release a new version of Hive 3.x or Hive 4?
>
> Thanks,
> Michel
[jira] [Created] (HIVE-24810) Use JDK 8 String Switch in TruncDateFromTimestamp
David Mollitor created HIVE-24810:

Summary: Use JDK 8 String Switch in TruncDateFromTimestamp
Key: HIVE-24810
URL: https://issues.apache.org/jira/browse/HIVE-24810
Project: Hive
Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor
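For readers unfamiliar with the idiom named in the summary, here is a minimal sketch of a switch over String values (the language has supported this since Java 7). It is not the TruncDateFromTimestamp code: the class name, the unit names, and the use of java.time.LocalDate are assumptions made for illustration only.

```java
import java.time.LocalDate;
import java.util.Locale;

public class TruncUnitSwitch {

    // Hypothetical helper: truncates a date to the start of the given unit.
    // The accepted unit strings are illustrative, not Hive's actual list.
    static LocalDate truncate(LocalDate date, String unit) {
        // A String switch replaces a chain of if/else-if equals() checks.
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":
            case "YYYY":
                return date.withDayOfYear(1);
            case "QUARTER":
            case "Q":
                // Months 1-3 map to 1, 4-6 to 4, 7-9 to 7, 10-12 to 10.
                int firstMonthOfQuarter = ((date.getMonthValue() - 1) / 3) * 3 + 1;
                return LocalDate.of(date.getYear(), firstMonthOfQuarter, 1);
            case "MONTH":
            case "MM":
                return date.withDayOfMonth(1);
            default:
                throw new IllegalArgumentException("Unsupported truncation unit: " + unit);
        }
    }
}
```

Grouping several labels on one branch keeps aliases for the same unit together, which is harder to read in an if/else chain.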
[jira] [Created] (HIVE-24809) Build failure in metastore-tools-common while resolving javax.el dependency
Stamatis Zampetakis created HIVE-24809:

Summary: Build failure in metastore-tools-common while resolving javax.el dependency
Key: HIVE-24809
URL: https://issues.apache.org/jira/browse/HIVE-24809
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
Fix For: 4.0.0

The Hive build (mvn clean install -DskipTests) fails while trying to resolve a transitive dependency to org.glassfish:javax.el:jar:3.0.1-b06-SNAPSHOT.

{noformat}
[INFO] Reactor Summary:
[INFO]
[INFO] Hive Storage API 2.7.3-SNAPSHOT SUCCESS [ 2.906 s]
[INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 1.086 s]
[INFO] Hive Classifications 4.0.0-SNAPSHOT SUCCESS [ 0.182 s]
[INFO] Hive Shims Common 4.0.0-SNAPSHOT SUCCESS [ 1.121 s]
[INFO] Hive Shims 0.23 4.0.0-SNAPSHOT SUCCESS [ 1.670 s]
[INFO] Hive Shims Scheduler 4.0.0-SNAPSHOT SUCCESS [ 0.832 s]
[INFO] Hive Shims 4.0.0-SNAPSHOT SUCCESS [ 0.576 s]
[INFO] Hive Standalone Metastore 4.0.0-SNAPSHOT SUCCESS [ 1.134 s]
[INFO] Hive Standalone Metastore Common Code 4.0.0-SNAPSHOT SUCCESS [ 9.868 s]
[INFO] Hive Common 4.0.0-SNAPSHOT SUCCESS [ 2.969 s]
[INFO] Hive Service RPC 4.0.0-SNAPSHOT SUCCESS [ 1.206 s]
[INFO] Hive Serde 4.0.0-SNAPSHOT SUCCESS [ 3.132 s]
[INFO] Hive Metastore 4.0.0-SNAPSHOT SUCCESS [ 1.351 s]
[INFO] Hive Vector-Code-Gen Utilities 4.0.0-SNAPSHOT SUCCESS [ 0.164 s]
[INFO] Hive Parser 4.0.0-SNAPSHOT SUCCESS [ 5.819 s]
[INFO] Hive UDF 4.0.0-SNAPSHOT SUCCESS [ 0.955 s]
[INFO] Hive Llap Common 4.0.0-SNAPSHOT SUCCESS [ 2.381 s]
[INFO] Hive Llap Client 4.0.0-SNAPSHOT SUCCESS [ 1.734 s]
[INFO] Hive Llap Tez 4.0.0-SNAPSHOT SUCCESS [ 1.765 s]
[INFO] Hive Spark Remote Client 4.0.0-SNAPSHOT SUCCESS [ 2.134 s]
[INFO] Hive Metastore Server 4.0.0-SNAPSHOT SUCCESS [ 9.440 s]
[INFO] Hive Query Language 4.0.0-SNAPSHOT SUCCESS [ 34.747 s]
[INFO] Hive TestUtils 4.0.0-SNAPSHOT SUCCESS [ 0.294 s]
[INFO] Hive Llap Server 4.0.0-SNAPSHOT SUCCESS [ 8.443 s]
[INFO] Hive HPL/SQL 4.0.0-SNAPSHOT SUCCESS [ 4.635 s]
[INFO] Hive Service 4.0.0-SNAPSHOT SUCCESS [ 4.901 s]
[INFO] Hive Accumulo Handler 4.0.0-SNAPSHOT SUCCESS [ 3.679 s]
[INFO] Hive JDBC 4.0.0-SNAPSHOT SUCCESS [ 12.405 s]
[INFO] Hive Beeline 4.0.0-SNAPSHOT SUCCESS [ 3.108 s]
[INFO] Hive CLI 4.0.0-SNAPSHOT SUCCESS [ 2.544 s]
[INFO] Hive Contrib 4.0.0-SNAPSHOT SUCCESS [ 1.625 s]
[INFO] Hive Druid Handler 4.0.0-SNAPSHOT SUCCESS [ 14.406 s]
[INFO] Hive HBase Handler 4.0.0-SNAPSHOT SUCCESS [ 3.308 s]
[INFO] Hive JDBC Handler 4.0.0-SNAPSHOT SUCCESS [ 1.938 s]
[INFO] Hive HCatalog 4.0.0-SNAPSHOT SUCCESS [ 0.385 s]
[INFO] Hive HCatalog Core 4.0.0-SNAPSHOT SUCCESS [ 3.754 s]
[INFO] Hive HCatalog Pig Adapter 4.0.0-SNAPSHOT SUCCESS [ 2.907 s]
[INFO] Hive HCatalog Server Extensions 4.0.0-SNAPSHOT SUCCESS [ 2.595 s]
[INFO] Hive HCatalog Webhcat Java Client 4.0.0-SNAPSHOT SUCCESS [ 2.947 s]
[INFO] Hive HCatalog Webhcat 4.0.0-SNAPSHOT SUCCESS [ 6.764 s]
[INFO] Hive Streaming 4.0.0-SNAPSHOT SUCCESS [ 3.268 s]
[INFO] Hive Llap External Client 4.0.0-SNAPSHOT SUCCESS [ 2.512 s]
[INFO] Hive Shims Aggregator 4.0.0-SNAPSHOT SUCCESS [ 0.061 s]
[INFO] Hive Kryo Registrator 4.0.0-SNAPSHOT SUCCESS [ 2.050 s]
[INFO] Hive Kudu Handler 4.0.0-SNAPSHOT SUCCESS [ 4.839 s]
[INFO] Hive Kafka Storage Handler 4.0.0-SNAPSHOT SUCCESS [ 3.706 s]
[INFO] Hive Packaging 4.0.0-SNAPSHOT SUCCESS [ 2.587 s]
[INFO] Hive Metastore Tools 4.0.0-SNAPSHOT SUCCESS [ 0.008 s]
[INFO] Hive Metastore Tools common libraries 4.0.0-SNAPSHOT FAILURE [ 1.897 s]
[INFO] Hive metastore benchmarks 4.0.0-SNAPSHOT SKIPPED
[INFO] Hive Upgrade Acid 4.0.0-SNAPSHOT SKIPPED
[INFO] Hive Pre Upgrade Acid 4.0.0-SNAPSHOT SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total
[jira] [Created] (HIVE-24808) Cache Dates Parsed
David Mollitor created HIVE-24808:

Summary: Cache Dates Parsed
Key: HIVE-24808
URL: https://issues.apache.org/jira/browse/HIVE-24808
Project: Hive
Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor

Parsing Date strings should be cached since it requires some amount of work to do it, and there are only so many dates in a particular data set.
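The idea can be sketched roughly as follows. This is not the HIVE-24808 patch: the class name, the size cap, and the use of the immutable java.time.LocalDate in place of Hive's own Date type are all assumptions made so the example is self-contained.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedDateParser {

    // Illustrative cap so an input with many distinct strings cannot grow the cache without bound.
    private static final int MAX_ENTRIES = 10_000;

    private final Map<String, LocalDate> cache = new ConcurrentHashMap<>();
    private final AtomicInteger parseCount = new AtomicInteger();

    // Returns the parsed date, reusing an earlier result for the same string when one is cached.
    public LocalDate parse(String text) {
        LocalDate cached = cache.get(text);
        if (cached != null) {
            return cached;
        }
        LocalDate parsed = LocalDate.parse(text); // expects ISO format, e.g. 2021-02-22
        parseCount.incrementAndGet();
        if (cache.size() < MAX_ENTRIES) {
            cache.put(text, parsed);
        }
        return parsed;
    }

    // Number of times a string was actually parsed rather than served from the cache.
    public int parseCount() {
        return parseCount.get();
    }
}
```

Sharing cached instances is only safe when the cached value is immutable (as LocalDate is) or is copied before being handed out.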
[jira] [Created] (HIVE-24807) Unknown column 'B0.CATALOG_NAME' in 'where clause'
zhuxiangqian created HIVE-24807:

Summary: Unknown column 'B0.CATALOG_NAME' in 'where clause'
Key: HIVE-24807
URL: https://issues.apache.org/jira/browse/HIVE-24807
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 3.1.0
Environment: hadoop3.1.1/hive3.1.0/ranger2.1.0
Reporter: zhuxiangqian

I am trying hive-rel-release-3.1.0 on Hadoop 3.1.1. When I start HiveServer2, I get the error log below:

2021-02-22T14:03:13,308 ERROR [pool-8-thread-33] metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 10) with error: javax.jdo.JDOException: Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MFunction' AS `NUCLEUS_TYPE`,`A0`.`CLASS_NAME`,`A0`.`CREATE_TIME`,`A0`.`FUNC_NAME`,`A0`.`FUNC_TYPE`,`A0`.`OWNER_NAME`,`A0`.`OWNER_TYPE`,`A0`.`FUNC_ID` FROM `FUNCS` `A0` LEFT OUTER JOIN `DBS` `B0` ON `A0`.`DB_ID` = `B0`.`DB_ID` WHERE `B0`.`CATALOG_NAME` = ?
 at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:677)
 at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
 at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:228)
 at org.apache.hadoop.hive.metastore.ObjectStore.getAllFunctions(ObjectStore.java:9393)
 at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
 at com.sun.proxy.$Proxy26.getAllFunctions(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_functions(HiveMetaStore.java:7102)
 at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 at com.sun.proxy.$Proxy27.get_all_functions(Unknown Source)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_functions.getResult(ThriftHiveMetastore.java:17242)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_functions.getResult(ThriftHiveMetastore.java:17226)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
NestedThrowablesStackTrace:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'B0.CATALOG_NAME' in 'where clause'
 at sun.reflect.GeneratedConstructorAccessor45.newInstance(Unknown Source)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
 at com.mysql.jdbc.Util.getInstance(Util.java:383)
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2840)
 at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082)
 at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2212)
 at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
 at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
 at
Updating BeeJU to Hive 3.x
Dear all,

I am a contributor to BeeJU [1], a project that aims to make Hive metastore unit testing easier by providing JUnit 5 Extensions and JUnit 4 Rules.

I have recently raised a PR [2] to upgrade the project to use Hive 3; all the tests pass, but I would still like to ask the Hive community for some feedback before releasing it, in case I missed something.

Thank you very much,
Nicola

[1] https://github.com/HotelsDotCom/beeju
[2] https://github.com/HotelsDotCom/beeju/pull/50
[jira] [Created] (HIVE-24806) Compactor: Initiator should lazy evaluate findUserToRunAs()
Rajesh Balamohan created HIVE-24806:

Summary: Compactor: Initiator should lazy evaluate findUserToRunAs()
Key: HIVE-24806
URL: https://issues.apache.org/jira/browse/HIVE-24806
Project: Hive
Issue Type: Improvement
Reporter: Rajesh Balamohan

https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L232

{noformat}
cache.putIfAbsent(fullTableName, findUserToRunAs(sd.getLocation(), t));
{noformat}

Because the argument is evaluated before putIfAbsent() is invoked, this ends up calling findUserToRunAs() every time, even when the key is already cached, and it looks up from the FileSystem on every call (thousands of times in a large database).

This can be lazily initialized instead (e.g. with computeIfAbsent).
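A small sketch of the difference between the two calls, counting how often the expensive lookup runs. The class, the counter, and expensiveLookup() are hypothetical stand-ins for the Initiator code and findUserToRunAs(); only the Map methods are real API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyRunAsCache {

    static final AtomicInteger lookups = new AtomicInteger();

    // Hypothetical stand-in for findUserToRunAs(): counts each invocation.
    static String expensiveLookup(String fullTableName) {
        lookups.incrementAndGet();
        return "owner_of_" + fullTableName;
    }

    // putIfAbsent: the value argument is computed on every call, cached or not.
    static int eagerLookups(int calls) {
        lookups.set(0);
        Map<String, String> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < calls; i++) {
            cache.putIfAbsent("db.tbl", expensiveLookup("db.tbl"));
        }
        return lookups.get();
    }

    // computeIfAbsent: the mapping function runs only when the key is missing.
    static int lazyLookups(int calls) {
        lookups.set(0);
        Map<String, String> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < calls; i++) {
            cache.computeIfAbsent("db.tbl", LazyRunAsCache::expensiveLookup);
        }
        return lookups.get();
    }
}
```

With 1000 calls for the same table, eagerLookups performs 1000 lookups and lazyLookups performs one. If the real method declares checked exceptions, the mapping function would need to wrap them, since the function passed to computeIfAbsent cannot throw checked exceptions.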
[jira] [Created] (HIVE-24805) Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
Rajesh Balamohan created HIVE-24805:

Summary: Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
Key: HIVE-24805
URL: https://issues.apache.org/jira/browse/HIVE-24805
Project: Hive
Issue Type: Improvement
Reporter: Rajesh Balamohan

Initiator shouldn't fetch table details separately for every partition of a table. When there is a large number of databases/tables, it takes a lot of time for Initiator to complete its initial iteration, and the load on the DB also goes up.

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129

https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456

For all the following partitions, table details would be the same. However, it ends up fetching table details from HMS again and again.

{noformat}
2021-02-22 08:13:16,106 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
2021-02-22 08:13:16,124 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
2021-02-22 08:13:16,140 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
2021-02-22 08:13:16,149 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
2021-02-22 08:13:16,158 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
{noformat}
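One possible shape for avoiding the repeated fetch is to resolve table details once per table within a single pass over the compaction candidates. The sketch below is an assumption about how that could look, not the Initiator code: fetchTableDetails() is a hypothetical stand-in for the metastore call, and each candidate is modelled as a {fullTableName, partitionName} pair.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TableLookupPerTable {

    static int metastoreCalls = 0;

    // Hypothetical stand-in for the HMS round trip that fetches table details.
    static String fetchTableDetails(String fullTableName) {
        metastoreCalls++;
        return "details:" + fullTableName;
    }

    // Each candidate is {fullTableName, partitionName}; details are fetched once per table.
    static List<String> resolve(List<String[]> candidates) {
        Map<String, String> tableCache = new HashMap<>(); // lives for one pass only
        List<String> resolved = new ArrayList<>();
        for (String[] candidate : candidates) {
            String details = tableCache.computeIfAbsent(candidate[0], TableLookupPerTable::fetchTableDetails);
            resolved.add(details + "/" + candidate[1]);
        }
        return resolved;
    }
}
```

Scoping the cache to one pass keeps it from serving stale table details on later iterations, at the cost of one fetch per table per pass.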