Re: [EXTERNAL] Hive meetup

2021-02-22 Thread Aasha
+1

> On 22-Feb-2021, at 11:54 PM, Matt McCline wrote:
> 
> Definitely interested.
> 
> -----Original Message-----
> From: Zoltan Haindrich  
> Sent: Monday, February 22, 2021 10:17 AM
> To: dev@hive.apache.org
> Subject: [EXTERNAL] Hive meetup
> 
> Hey All!
> 
> It has been quite some time since we last had a meetup - and in these covid times it 
> would be online-only anyway :) We have been mentioning this here and there lately 
> at Cloudera.
> I think we could have a few talks spanning 2-3 hours or so.
> 
> Is there any interest in it?
> 
> I would be happy to talk about how hive-test-kube works and how hive-dev-box 
> is employed during testing.
> 
> cheers,
> Zoltan


[jira] [Created] (HIVE-24811) Discover Other Areas that Can Benefit from Cached Dates

2021-02-22 Thread David Mollitor (Jira)
David Mollitor created HIVE-24811:
----------------------------------

             Summary: Discover Other Areas that Can Benefit from Cached Dates
                 Key: HIVE-24811
                 URL: https://issues.apache.org/jira/browse/HIVE-24811
             Project: Hive
          Issue Type: Improvement
            Reporter: David Mollitor
            Assignee: David Mollitor


Based on my work on [HIVE-24808], I noticed other places that call 
{{Date#valueOf}} and can probably also benefit from this cache 
mechanism.  Locate those places and change the calls to use the cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


RE: [EXTERNAL] Hive meetup

2021-02-22 Thread Matt McCline
Definitely interested.

-----Original Message-----
From: Zoltan Haindrich  
Sent: Monday, February 22, 2021 10:17 AM
To: dev@hive.apache.org
Subject: [EXTERNAL] Hive meetup

Hey All!

It has been quite some time since we last had a meetup - and in these covid times it 
would be online-only anyway :) We have been mentioning this here and there lately at 
Cloudera.
I think we could have a few talks spanning 2-3 hours or so.

Is there any interest in it?

I would be happy to talk about how hive-test-kube works and how hive-dev-box is 
employed during testing.

cheers,
Zoltan


Hive meetup

2021-02-22 Thread Zoltan Haindrich

Hey All!

It has been quite some time since we last had a meetup - and in these covid times it 
would be online-only anyway :)
We have been mentioning this here and there lately at Cloudera.
I think we could have a few talks spanning 2-3 hours or so.

Is there any interest in it?

I would be happy to talk about how hive-test-kube works and how hive-dev-box is 
employed during testing.

cheers,
Zoltan


Re: Any plan for new hive 3 or 4 release?

2021-02-22 Thread Zoltan Haindrich

Hey Michel!

Yes, it has been a long time since we had a release; we have quite a few new features 
in master.
I think we have been scaring people for some time now that we will be dropping MR 
support... I think we should do that.

I would really like to see a new Hive release in the near future as well - 
right now there is no way for users to even try out the new features.
I was planning to add nightly builds to package the latest master's state into a 
deployable artifact - I think a service like that may help pretest our next release. 
I think it won't take much to do it, so I'll probably throw it together in the next 
couple of days!


cheers,
Zoltan

On 2/21/21 2:27 PM, Michel Sumbul wrote:

Hi Guys,

If I'm not wrong, the last release of Hive 3.x is 18 months old.
I wanted to ask if you had any roadmap / plan to release a new version of
Hive 3.x or Hive 4?

Thanks,
Michel



[jira] [Created] (HIVE-24810) Use JDK 8 String Switch in TruncDateFromTimestamp

2021-02-22 Thread David Mollitor (Jira)
David Mollitor created HIVE-24810:
----------------------------------

             Summary: Use JDK 8 String Switch in TruncDateFromTimestamp
                 Key: HIVE-24810
                 URL: https://issues.apache.org/jira/browse/HIVE-24810
             Project: Hive
          Issue Type: Improvement
            Reporter: David Mollitor
            Assignee: David Mollitor






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24809) Build failure in metastore-tools-common while resolving javax.el dependency

2021-02-22 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-24809:
---------------------------------------

             Summary: Build failure in metastore-tools-common while resolving javax.el dependency
                 Key: HIVE-24809
                 URL: https://issues.apache.org/jira/browse/HIVE-24809
             Project: Hive
          Issue Type: Bug
    Affects Versions: 4.0.0
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis
             Fix For: 4.0.0


The Hive build (mvn clean install -DskipTests) fails while trying to resolve a 
transitive dependency to org.glassfish:javax.el:jar:3.0.1-b06-SNAPSHOT. 

{noformat}
[INFO] Reactor Summary:
[INFO] 
[INFO] Hive Storage API 2.7.3-SNAPSHOT  SUCCESS [  2.906 s]
[INFO] Hive 4.0.0-SNAPSHOT  SUCCESS [  1.086 s]
[INFO] Hive Classifications 4.0.0-SNAPSHOT  SUCCESS [  0.182 s]
[INFO] Hive Shims Common 4.0.0-SNAPSHOT ... SUCCESS [  1.121 s]
[INFO] Hive Shims 0.23 4.0.0-SNAPSHOT . SUCCESS [  1.670 s]
[INFO] Hive Shims Scheduler 4.0.0-SNAPSHOT  SUCCESS [  0.832 s]
[INFO] Hive Shims 4.0.0-SNAPSHOT .. SUCCESS [  0.576 s]
[INFO] Hive Standalone Metastore 4.0.0-SNAPSHOT ... SUCCESS [  1.134 s]
[INFO] Hive Standalone Metastore Common Code 4.0.0-SNAPSHOT SUCCESS [  9.868 s]
[INFO] Hive Common 4.0.0-SNAPSHOT . SUCCESS [  2.969 s]
[INFO] Hive Service RPC 4.0.0-SNAPSHOT  SUCCESS [  1.206 s]
[INFO] Hive Serde 4.0.0-SNAPSHOT .. SUCCESS [  3.132 s]
[INFO] Hive Metastore 4.0.0-SNAPSHOT .. SUCCESS [  1.351 s]
[INFO] Hive Vector-Code-Gen Utilities 4.0.0-SNAPSHOT .. SUCCESS [  0.164 s]
[INFO] Hive Parser 4.0.0-SNAPSHOT . SUCCESS [  5.819 s]
[INFO] Hive UDF 4.0.0-SNAPSHOT  SUCCESS [  0.955 s]
[INFO] Hive Llap Common 4.0.0-SNAPSHOT  SUCCESS [  2.381 s]
[INFO] Hive Llap Client 4.0.0-SNAPSHOT  SUCCESS [  1.734 s]
[INFO] Hive Llap Tez 4.0.0-SNAPSHOT ... SUCCESS [  1.765 s]
[INFO] Hive Spark Remote Client 4.0.0-SNAPSHOT  SUCCESS [  2.134 s]
[INFO] Hive Metastore Server 4.0.0-SNAPSHOT ... SUCCESS [  9.440 s]
[INFO] Hive Query Language 4.0.0-SNAPSHOT . SUCCESS [ 34.747 s]
[INFO] Hive TestUtils 4.0.0-SNAPSHOT .. SUCCESS [  0.294 s]
[INFO] Hive Llap Server 4.0.0-SNAPSHOT  SUCCESS [  8.443 s]
[INFO] Hive HPL/SQL 4.0.0-SNAPSHOT  SUCCESS [  4.635 s]
[INFO] Hive Service 4.0.0-SNAPSHOT  SUCCESS [  4.901 s]
[INFO] Hive Accumulo Handler 4.0.0-SNAPSHOT ... SUCCESS [  3.679 s]
[INFO] Hive JDBC 4.0.0-SNAPSHOT ... SUCCESS [ 12.405 s]
[INFO] Hive Beeline 4.0.0-SNAPSHOT  SUCCESS [  3.108 s]
[INFO] Hive CLI 4.0.0-SNAPSHOT  SUCCESS [  2.544 s]
[INFO] Hive Contrib 4.0.0-SNAPSHOT  SUCCESS [  1.625 s]
[INFO] Hive Druid Handler 4.0.0-SNAPSHOT .. SUCCESS [ 14.406 s]
[INFO] Hive HBase Handler 4.0.0-SNAPSHOT .. SUCCESS [  3.308 s]
[INFO] Hive JDBC Handler 4.0.0-SNAPSHOT ... SUCCESS [  1.938 s]
[INFO] Hive HCatalog 4.0.0-SNAPSHOT ... SUCCESS [  0.385 s]
[INFO] Hive HCatalog Core 4.0.0-SNAPSHOT .. SUCCESS [  3.754 s]
[INFO] Hive HCatalog Pig Adapter 4.0.0-SNAPSHOT ... SUCCESS [  2.907 s]
[INFO] Hive HCatalog Server Extensions 4.0.0-SNAPSHOT . SUCCESS [  2.595 s]
[INFO] Hive HCatalog Webhcat Java Client 4.0.0-SNAPSHOT ... SUCCESS [  2.947 s]
[INFO] Hive HCatalog Webhcat 4.0.0-SNAPSHOT ... SUCCESS [  6.764 s]
[INFO] Hive Streaming 4.0.0-SNAPSHOT .. SUCCESS [  3.268 s]
[INFO] Hive Llap External Client 4.0.0-SNAPSHOT ... SUCCESS [  2.512 s]
[INFO] Hive Shims Aggregator 4.0.0-SNAPSHOT ... SUCCESS [  0.061 s]
[INFO] Hive Kryo Registrator 4.0.0-SNAPSHOT ... SUCCESS [  2.050 s]
[INFO] Hive Kudu Handler 4.0.0-SNAPSHOT ... SUCCESS [  4.839 s]
[INFO] Hive Kafka Storage Handler 4.0.0-SNAPSHOT .. SUCCESS [  3.706 s]
[INFO] Hive Packaging 4.0.0-SNAPSHOT .. SUCCESS [  2.587 s]
[INFO] Hive Metastore Tools 4.0.0-SNAPSHOT  SUCCESS [  0.008 s]
[INFO] Hive Metastore Tools common libraries 4.0.0-SNAPSHOT FAILURE [  1.897 s]
[INFO] Hive metastore benchmarks 4.0.0-SNAPSHOT ... SKIPPED
[INFO] Hive Upgrade Acid 4.0.0-SNAPSHOT ... SKIPPED
[INFO] Hive Pre Upgrade Acid 4.0.0-SNAPSHOT ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total 

[jira] [Created] (HIVE-24808) Cache Dates Parsed

2021-02-22 Thread David Mollitor (Jira)
David Mollitor created HIVE-24808:
----------------------------------

             Summary: Cache Dates Parsed
                 Key: HIVE-24808
                 URL: https://issues.apache.org/jira/browse/HIVE-24808
             Project: Hive
          Issue Type: Improvement
            Reporter: David Mollitor
            Assignee: David Mollitor


The results of parsing Date strings should be cached: parsing takes a 
non-trivial amount of work, and a given data set contains only so many 
distinct dates.
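
A minimal sketch of the idea, not the actual HIVE-24808 patch. It uses java.time.LocalDate.parse as a stand-in for Hive's own date parsing, and the class name DateParseCacheSketch, the parse/parseCount methods and the size bound are all hypothetical. The cache is a small LRU map, so repeated date strings are parsed once while memory stays bounded.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a bounded LRU cache in front of date parsing.
public class DateParseCacheSketch {
  // Counts real parses so the effect of the cache can be observed.
  private int parses = 0;
  private final Map<String, LocalDate> cache;

  public DateParseCacheSketch(final int maxEntries) {
    // Access-ordered LinkedHashMap: the least recently used entry is evicted
    // once the map grows beyond maxEntries.
    this.cache = new LinkedHashMap<String, LocalDate>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, LocalDate> eldest) {
        return size() > maxEntries;
      }
    };
  }

  // Returns the cached value if present, otherwise parses and remembers it.
  // LocalDate is immutable, so sharing the cached instance is safe.
  public synchronized LocalDate parse(String text) {
    LocalDate cached = cache.get(text);
    if (cached == null) {
      cached = LocalDate.parse(text); // stand-in for the expensive parse
      parses++;
      cache.put(text, cached);
    }
    return cached;
  }

  public synchronized int parseCount() {
    return parses;
  }

  public static void main(String[] args) {
    DateParseCacheSketch dates = new DateParseCacheSketch(1024);
    for (int i = 0; i < 10_000; i++) {
      dates.parse("2021-02-22");
    }
    System.out.println("real parses: " + dates.parseCount());
  }
}
```

Whether a cache like this pays off depends on how many distinct date strings a data set really contains; the bound is there so a high-cardinality column cannot grow the map without limit.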



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24807) Unknown column 'B0.CATALOG_NAME' in 'where clause'

2021-02-22 Thread zhuxiangqian (Jira)
zhuxiangqian created HIVE-24807:
--------------------------------

             Summary: Unknown column 'B0.CATALOG_NAME' in 'where clause'
                 Key: HIVE-24807
                 URL: https://issues.apache.org/jira/browse/HIVE-24807
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 3.1.0
         Environment: hadoop3.1.1/hive3.1.0/ranger2.1.0
            Reporter: zhuxiangqian


I am trying hive-rel-release-3.1.0 on Hadoop 3.1.1. When I start HiveServer2, I 
get the error log below:

 

2021-02-22T14:03:13,308 ERROR [pool-8-thread-33] metastore.RetryingHMSHandler: 
Retrying HMSHandler after 2000 ms (attempt 1 of 10) with error: 
javax.jdo.JDOException: Exception thrown when executing query : SELECT DISTINCT 
'org.apache.hadoop.hive.metastore.model.MFunction' AS 
`NUCLEUS_TYPE`,`A0`.`CLASS_NAME`,`A0`.`CREATE_TIME`,`A0`.`FUNC_NAME`,`A0`.`FUNC_TYPE`,`A0`.`OWNER_NAME`,`A0`.`OWNER_TYPE`,`A0`.`FUNC_ID`
 FROM `FUNCS` `A0` LEFT OUTER JOIN `DBS` `B0` ON `A0`.`DB_ID` = `B0`.`DB_ID` 
WHERE `B0`.`CATALOG_NAME` = ?
 at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:677)
 at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
 at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:228)
 at 
org.apache.hadoop.hive.metastore.ObjectStore.getAllFunctions(ObjectStore.java:9393)
 at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
 at com.sun.proxy.$Proxy26.getAllFunctions(Unknown Source)
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_functions(HiveMetaStore.java:7102)
 at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 at com.sun.proxy.$Proxy27.get_all_functions(Unknown Source)
 at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_functions.getResult(ThriftHiveMetastore.java:17242)
 at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_all_functions.getResult(ThriftHiveMetastore.java:17226)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
 at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
 at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
 at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
NestedThrowablesStackTrace:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 
'B0.CATALOG_NAME' in 'where clause'
 at sun.reflect.GeneratedConstructorAccessor45.newInstance(Unknown Source)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
 at com.mysql.jdbc.Util.getInstance(Util.java:383)
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
 at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2840)
 at 
com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082)
 at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2212)
 at 
com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
 at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
 at 

Updating BeeJU to Hive 3.x

2021-02-22 Thread Nicola Vitucci
Dear all,

I am a contributor to BeeJU [1], a project that aims to make Hive metastore
unit testing easier by providing JUnit 5 Extensions and JUnit 4 Rules. I
have recently raised a PR [2] to upgrade the project to use Hive 3; all the
tests pass, but I would still like to ask the Hive community for some
feedback before releasing it, in case I missed something.

Thank you very much,

Nicola

[1] https://github.com/HotelsDotCom/beeju
[2] https://github.com/HotelsDotCom/beeju/pull/50


[jira] [Created] (HIVE-24806) Compactor: Initiator should lazy evaluate findUserToRunAs()

2021-02-22 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24806:
------------------------------------

             Summary: Compactor: Initiator should lazy evaluate findUserToRunAs()
                 Key: HIVE-24806
                 URL: https://issues.apache.org/jira/browse/HIVE-24806
             Project: Hive
          Issue Type: Improvement
            Reporter: Rajesh Balamohan


https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L232

{noformat}
cache.putIfAbsent(fullTableName, findUserToRunAs(sd.getLocation(), t));
{noformat}

Because the argument is evaluated before putIfAbsent() runs, this ends up 
evaluating findUserToRunAs() every time, and looks up the FileSystem on every 
call (thousands of times in a large database).

This can be lazily initialized instead (e.g. with computeIfAbsent).
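
A minimal sketch of the difference, assuming a plain Map-based cache. The findUserToRunAs() below is a hypothetical stand-in that only counts calls; the real method in Initiator consults the FileSystem and may declare checked exceptions, which a lambda passed to computeIfAbsent would have to handle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only, not the Initiator code.
public class LazyRunAsCacheSketch {
  // Counts how often the expensive lookup actually runs.
  static final AtomicInteger lookups = new AtomicInteger();

  // Stand-in for the expensive FileSystem-backed lookup.
  static String findUserToRunAs(String location) {
    lookups.incrementAndGet();
    return "owner-of-" + location;
  }

  // Eager: Java evaluates the argument before calling putIfAbsent,
  // so the lookup runs on every call even when the key is already cached.
  static int eagerLookups(int calls) {
    lookups.set(0);
    Map<String, String> cache = new ConcurrentHashMap<>();
    for (int i = 0; i < calls; i++) {
      cache.putIfAbsent("db.tbl", findUserToRunAs("/warehouse/db/tbl"));
    }
    return lookups.get();
  }

  // Lazy: the mapping function runs only when the key is absent.
  static int lazyLookups(int calls) {
    lookups.set(0);
    Map<String, String> cache = new ConcurrentHashMap<>();
    for (int i = 0; i < calls; i++) {
      cache.computeIfAbsent("db.tbl", k -> findUserToRunAs("/warehouse/db/tbl"));
    }
    return lookups.get();
  }

  public static void main(String[] args) {
    System.out.println("eager lookups: " + eagerLookups(1000));
    System.out.println("lazy lookups:  " + lazyLookups(1000));
  }
}
```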



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24805) Compactor: Initiator shouldn't fetch table details again and again for partitioned tables

2021-02-22 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24805:
------------------------------------

             Summary: Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
                 Key: HIVE-24805
                 URL: https://issues.apache.org/jira/browse/HIVE-24805
             Project: Hive
          Issue Type: Improvement
            Reporter: Rajesh Balamohan


Initiator shouldn't fetch table details again for each of a table's partitions. 
When there are a large number of databases/tables, it takes a lot of time for the 
Initiator to complete its initial iteration, and the load on the DB also goes up.


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129

https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456

For all the following partitions, table details would be the same. However, it 
ends up fetching table details from HMS again and again.

{noformat}
2021-02-22 08:13:16,106 INFO  
org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see 
if we should compact 
tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
2021-02-22 08:13:16,124 INFO  
org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see 
if we should compact 
tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
2021-02-22 08:13:16,140 INFO  
org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see 
if we should compact 
tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
2021-02-22 08:13:16,149 INFO  
org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see 
if we should compact 
tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
2021-02-22 08:13:16,158 INFO  
org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see 
if we should compact 
tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
{noformat}
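
One possible shape for a fix, sketched under assumptions: table details are resolved once per table within a single Initiator cycle and reused for every partition of that table. Candidate, FakeMetastore and checkCycle are hypothetical names made up for this illustration; they are not the Initiator's or the metastore client's real API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: one table lookup per table per cycle.
public class TableLookupPerCycleSketch {
  // Stand-in for a compaction candidate (one partition of one table).
  static final class Candidate {
    final String db;
    final String table;
    final String partition;

    Candidate(String db, String table, String partition) {
      this.db = db;
      this.table = table;
      this.partition = partition;
    }

    String fullTableName() {
      return db + "." + table;
    }
  }

  // Stand-in for the metastore; counts how many table fetches are made.
  static final class FakeMetastore {
    int getTableCalls = 0;

    String getTable(String fullTableName) {
      getTableCalls++;
      return "details-of-" + fullTableName;
    }
  }

  // Checks every candidate, fetching each table's details at most once.
  // The map lives only for this cycle, so cached details cannot go stale
  // across cycles.
  static List<String> checkCycle(List<Candidate> candidates, FakeMetastore hms) {
    Map<String, String> tablesThisCycle = new HashMap<>();
    List<String> checked = new ArrayList<>();
    for (Candidate c : candidates) {
      String details = tablesThisCycle.computeIfAbsent(c.fullTableName(), hms::getTable);
      checked.add(c.fullTableName() + "." + c.partition + " -> " + details);
    }
    return checked;
  }

  public static void main(String[] args) {
    FakeMetastore hms = new FakeMetastore();
    List<Candidate> candidates = new ArrayList<>();
    for (int sk = 2451830; sk < 2451835; sk++) {
      candidates.add(new Candidate("tpcds_bin_partitioned_orc_1000",
          "store_returns_tmp2", "sr_returned_date_sk=" + sk));
    }
    checkCycle(candidates, hms);
    System.out.println("table fetches: " + hms.getTableCalls);
  }
}
```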



--
This message was sent by Atlassian Jira
(v8.3.4#803005)