[jira] [Created] (HIVE-23111) MsckPartitionExpressionProxy should filter partitions

2020-03-30 Thread Sam An (Jira)
Sam An created HIVE-23111:
-

 Summary: MsckPartitionExpressionProxy should filter partitions
 Key: HIVE-23111
 URL: https://issues.apache.org/jira/browse/HIVE-23111
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Sam An
Assignee: Sam An


Currently, MsckPartitionExpressionProxy does not filter partition names, which 
causes problems for partition auto-discovery. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23110) Prevent NPE in ReExecDriver if the processing is aborted

2020-03-30 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-23110:
-

 Summary: Prevent NPE in ReExecDriver if the processing is aborted
 Key: HIVE-23110
 URL: https://issues.apache.org/jira/browse/HIVE-23110
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Miklos Gergely
Assignee: Miklos Gergely








Re: [VOTE] Should we release Hive Storage API 2.7.2-rc0?

2020-03-30 Thread Owen O'Malley
In evaluating this RC, I discovered HIVE-22959, which is the only patch in
this RC.

I'm uncomfortable with the API added by HIVE-22959, because it is
duplicating a lot of the functionality from VectorizedRowBatch. I'll look
at the motivating ORC-577 tomorrow, but for now I'm -1 on releasing it.

.. Owen

On Mon, Mar 30, 2020 at 1:18 PM Vineet G  wrote:

> +1. Verified the signature, checksum and build.
>
> Vineet
>
> > On Mar 30, 2020, at 1:20 AM, Zoltan Haindrich  wrote:
> >
> > +1
> >
> > * verified checksum/etc
> > * built and ran tests locally
> > * built orc/master against it
> > * there doesn't seem to be a staged nexus repo for this, but it seems
> > earlier releases didn't have one either; meanwhile
> > https://repo.maven.apache.org/maven2/org/apache/hive/hive-storage-api/2.7.1/
> > seems to have them; I assume it will also be uploaded there along with
> > sources/etc
> >
> >
> > On 3/24/20 9:33 PM, Jesus Camacho Rodriguez wrote:
> >> All,
> >> I'd like to make a storage-api release with HIVE-22959 in it.
> >> Should we release the following artifacts as Hive Storage API 2.7.2?
> >> tar: http://home.apache.org/~jcamacho/hive-storage-2.7.2/
> >> tag: https://github.com/apache/hive/releases/tag/storage-release-2.7.2-rc0
> >> jiras: https://issues.apache.org/jira/projects/HIVE/versions/12347828
> >> Thanks!
> >> -Jesús
>
>


Re: [VOTE] Should we release Hive Storage API 2.7.2-rc0?

2020-03-30 Thread Vineet G
+1. Verified the signature, checksum and build.

Vineet

> On Mar 30, 2020, at 1:20 AM, Zoltan Haindrich  wrote:
> 
> +1
> 
> * verified checksum/etc
> * built and ran tests locally
> * built orc/master against it
> * there doesn't seem to be a staged nexus repo for this, but it seems 
> earlier releases didn't have one either; meanwhile 
> https://repo.maven.apache.org/maven2/org/apache/hive/hive-storage-api/2.7.1/ 
> seems to have them; I assume it will also be uploaded there along with 
> sources/etc
> 
> 
> On 3/24/20 9:33 PM, Jesus Camacho Rodriguez wrote:
>> All,
>> I'd like to make a storage-api release with HIVE-22959 in it.
>> Should we release the following artifacts as Hive Storage API 2.7.2?
>> tar: http://home.apache.org/~jcamacho/hive-storage-2.7.2/
>> tag: https://github.com/apache/hive/releases/tag/storage-release-2.7.2-rc0
>> jiras: https://issues.apache.org/jira/projects/HIVE/versions/12347828
>> Thanks!
>> -Jesús



[jira] [Created] (HIVE-23109) Query-based compaction omits database

2020-03-30 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-23109:


 Summary: Query-based compaction omits database
 Key: HIVE-23109
 URL: https://issues.apache.org/jira/browse/HIVE-23109
 Project: Hive
  Issue Type: Bug
Reporter: Karen Coppage
Assignee: Karen Coppage


For example, the MM major compaction query looks like:

{code:java}
insert into tmp_table select * from src_table;
{code}

It should be:

{code:java}
insert into tmp_table select * from src_db.src_table;
{code}

Therefore, compaction fails if the database of the source table isn't default.
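
A minimal sketch of the intended behaviour, assuming a hypothetical helper (the class and method names below are illustrative and are not Hive's actual compactor code): the source table is always qualified with its database, so the generated query does not depend on the session's current database.

```java
public final class CompactionQuerySketch {
    private CompactionQuerySketch() {}

    // Hypothetical helper: builds the insert used by query-based compaction,
    // always qualifying the source table with its database name.
    public static String buildInsert(String tmpTable, String srcDb, String srcTable) {
        return "insert into " + tmpTable + " select * from " + srcDb + "." + srcTable;
    }
}
```

Calling buildInsert("tmp_table", "src_db", "src_table") yields the corrected query text from the description (without the trailing semicolon). Real code would presumably also need to quote the identifiers.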





[jira] [Created] (HIVE-23108) Cleanup HiveBaseResultSet.java

2020-03-30 Thread David Mollitor (Jira)
David Mollitor created HIVE-23108:
-

 Summary: Cleanup HiveBaseResultSet.java
 Key: HIVE-23108
 URL: https://issues.apache.org/jira/browse/HIVE-23108
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


* Unify the code (there are several implementations of the same thing)
* Add better error messages
* In some cases the code throws RuntimeExceptions, which is against the 
JDBC spec
* Make findColumn a bit more streamlined
* Remove non-javadoc comments
* Add {{@Override}} annotations where appropriate
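
As an illustration of the points above, here is a minimal sketch of a lookup that reports a missing column with a descriptive SQLException rather than a RuntimeException. The class, the helper signature and the error text are assumptions made for this example, not the actual HiveBaseResultSet code; the only JDBC facts relied on are that column indexes are 1-based and that column labels are matched case-insensitively.

```java
import java.sql.SQLException;
import java.util.List;

public final class FindColumnSketch {
    private FindColumnSketch() {}

    // Hypothetical helper: returns the 1-based index of columnLabel in columnNames.
    // A missing column is reported as a checked SQLException, never a RuntimeException.
    public static int findColumn(List<String> columnNames, String columnLabel)
            throws SQLException {
        if (columnLabel == null) {
            throw new SQLException("Column label must not be null");
        }
        for (int i = 0; i < columnNames.size(); i++) {
            if (columnLabel.equalsIgnoreCase(columnNames.get(i))) {
                return i + 1; // JDBC column indexes start at 1
            }
        }
        throw new SQLException(
            "Could not find column '" + columnLabel + "' in " + columnNames);
    }
}
```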





[jira] [Created] (HIVE-23107) Remove MIN_HISTORY_LEVEL table

2020-03-30 Thread Jira
László Pintér created HIVE-23107:


 Summary: Remove MIN_HISTORY_LEVEL table
 Key: HIVE-23107
 URL: https://issues.apache.org/jira/browse/HIVE-23107
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: László Pintér
Assignee: László Pintér


MIN_HISTORY_LEVEL table is used in two places:
 * Cleaner uses it to decide if the files can be removed - this could be 
replaced by adding a new column to compaction_queue storing the next_txn_id 
when the change was committed, and before cleaning checking the minimum open 
transaction id in the TXNS table
 * Initiator uses it to decide if some items from TXN_TO_WRITE_ID table can be 
removed. This could be replaced by using the WRITE_SET.WS_COMMIT_ID.





Re: Review Request 72234: HIVE-22785

2020-03-30 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72234/#review220109
---




itests/src/test/resources/testconfiguration.properties
Lines 18 (patched)


Instead of adding it here, we should add it directly to 
`minillaplocal.query.files`: Those are only executed in MiniLlapLocalCliDriver.



ql/src/test/results/clientpositive/llap/sort_acid.q.out
Line 48 (original), 48 (patched)


Nice!



ql/src/test/results/clientnegative/materialized_view_no_cbo_rewrite.q.out
Line 22 (original), 22 (patched)


Can we make this comment more user-friendly? The main reason for this is 
that the MV contains SORT BY, but the new message can be cryptic for users. (We 
can defer this to a follow-up.)



ql/src/test/results/clientnegative/update_notnull_constraint.q.out
Lines 29 (patched)


This error message seems less descriptive too. Are we hitting the same 
exception as we were hitting previously, and is it just a matter of printing 
the error correctly? If that is the case, we can tackle it in a follow-up.



ql/src/test/results/clientpositive/llap/vector_outer_reference_windowed.q.out
Line 458 (original), 458 (patched)


Can you check the logs for this test to make sure CBO is executed now and 
that is the main reason for this change?


- Jesús Camacho Rodríguez


On March 26, 2020, 7:51 p.m., Krisztian Kasa wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72234/
> ---
> 
> (Updated March 26, 2020, 7:51 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-22785
> https://issues.apache.org/jira/browse/HIVE-22785
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Update/delete/merge statements not optimized through CBO
> 
> 
> Diffs
> -
> 
>   
>   itests/hive-blobstore/src/test/results/clientpositive/map_join_on_filter.q.out 653faab00a
>   itests/src/test/resources/testconfiguration.properties 3510016c07
>   ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java 9c61b316e2
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelDistribution.java e5f4c8492e
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelFactories.java 04b3888a25
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelJson.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelJsonImpl.java 0d45eb0c61
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java e647b88961
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveSortExchange.java 880cae70f9
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveProjectSortExchangeTransposeRule.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveProjectSortTransposeRule.java 871c411e70
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java 53d68e872a
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSortLimitPullUpConstantsRule.java e51b2b6ebc
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSortPullUpConstantsRule.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java e03e96ff12
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java 31619c0314
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/opconventer/HiveSortExchangeVisitor.java 68227db1ee
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/opconventer/JoinVisitor.java 0286d54ea0
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 6589eeb39b
>   ql/src/java/org/apache/hadoop/hive/ql/parse/RewriteSemanticAnalyzer.java 31068cb8c3
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 679ae2e1e6
>   ql/src/test/queries/clientpositive/authorization_view_disable_cbo_1.q be50b69830
>   ql/src/test/queries/clientpositive/sort.q cab2712810
>   ql/src/test/queries/clientpositive/sort_acid.q PRE-CREATION
>   ql/src/test/results/clientnegative/materialized_view_no_cbo_rewrite.q.out 2b7ff65c7a
>   ql/src/test/results/clientnegative/materialized_view_no_cbo_rewrite_2.q.out 6850290412
>   ql/src/test/results/clientnegative/update_notnull_constraint.q.out 86bfc67480
>   

[jira] [Created] (HIVE-23106) Cleanup CalcitePlanner genOPTree exception handling

2020-03-30 Thread John Sherman (Jira)
John Sherman created HIVE-23106:
---

 Summary: Cleanup CalcitePlanner genOPTree exception handling
 Key: HIVE-23106
 URL: https://issues.apache.org/jira/browse/HIVE-23106
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: John Sherman
Assignee: John Sherman


The logic where genOPTree handles exceptions during CBO planning is a bit 
twisty and could use some cleanup and comments.





Re: Review Request 72276: HIVE-23084: Implement kill query in multiple HS2 environment

2020-03-30 Thread Peter Varga via Review Board


> On March 30, 2020, 9:38 a.m., Adam Szita wrote:
> > Looking pretty good overall, I just have a few questions/comments.

I fixed the issues, added a little more logging and fixed the weird annotation 
formatting.


> On March 30, 2020, 9:38 a.m., Adam Szita wrote:
> > service/src/java/org/apache/hive/service/server/KillQueryImpl.java
> > Line 150 (original), 176 (patched)
> > 
> >
> > Shouldn't we return if there are no ops to kill? I think a subsequent 
> > killOperations() call here might throw an NPE.

We cannot return here; otherwise the remote kill would not happen. 
killOperations is not called when the query is not found; I made the code more 
readable to reflect that.


> On March 30, 2020, 9:38 a.m., Adam Szita wrote:
> > service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java
> > Lines 449 (patched)
> > 
> >
> > Shouldn't we clear the progress flag here?

We break out of the loop immediately after this. The progress flag is not used 
after that.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72276/#review220102
---


On March 27, 2020, 10:08 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72276/
> ---
> 
> (Updated March 27, 2020, 10:08 a.m.)
> 
> 
> Review request for hive and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> KILL  command was implemented in:
> 
> https://issues.apache.org/jira/browse/HIVE-17483
> https://issues.apache.org/jira/browse/HIVE-20549
> But it is not working in an environment where service discovery is enabled 
> and more than one HS2 instance is running (except for manually sending the 
> kill query to all HS2 instances).
> 
> Solution:
> 
> If an HS2 instance can't kill a query locally, it should post a kill query 
> request to ZooKeeper.
> Every HS2 should watch ZooKeeper for kill query requests and, if the query is 
> running on that instance, kill it.
> Authorization of kill query should work the same.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 34df01e60e
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java 3973ec9270
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java 68a515ccbe
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithServiceDiscovery.java PRE-CREATION
>   itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestMiniHS2StateWithNoZookeeper.java 99e681e5b2
>   itests/hive-unit/src/test/java/org/apache/hive/service/server/TestKillQueryZookeeperManager.java PRE-CREATION
>   itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 1b60a51ebd
>   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java db965e7a22
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/process/kill/KillQueriesOperation.java afde1a4762
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 8becef1cd3
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 9e497545b5
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 277519cba5
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 181ea5d6d5
>   service/src/java/org/apache/hive/service/server/KillQueryImpl.java 883e32bd2e
>   service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java PRE-CREATION
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/common/ZooKeeperHiveHelper.java 71d8651712
> 
> 
> Diff: https://reviews.apache.org/r/72276/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72283: HIVE-23076 Add batching for openTxn

2020-03-30 Thread Peter Vary via Review Board


> On March 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 603 (patched)
> > 
> >
> > Could we rename it to nextTxnId/firstTxnId? Not clear what first.

Sure, agreed; will be done.


> On March 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 615 (original), 617 (patched)
> > 
> >
> > I wasn't sure, maybe you know, can we modify meta-prop in runtime? if 
> > not maybe we should move batchSize to constructor?

There is a way. We should decide whether we want to do it or not.
Maybe we should handle it as part of HIVE-23093?

Your thoughts?


> On March 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 634 (patched)
> > 
> >
> > why (i-1) ? txnId starts from 1, right?

Updated as discussed


> On March 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 641 (original), 648 (patched)
> > 
> >
> > Would be great if we could extract query construction to query constants

Let's talk about this.
Part of me prefers this in the place where we call the query so I can see the 
whole picture in one place; another part of me prefers it collected in a 
constant.
I moved the big one to the top because of the formatting/parsing, and kept this 
one in place because it is used only here, and only once.


> On March 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 666 (original), 673 (patched)
> > 
> >
> > Could we move queryStr to constants?

Let's talk about this.
Part of me prefers this in the place where we call the query so I can see the 
whole picture in one place; another part of me prefers it collected in a 
constant.
I moved the big one to the top because of the formatting/parsing, and kept this 
one in place because it is used only here, and only once.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/#review220103
---


On March 30, 2020, 9:51 a.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72283/
> ---
> 
> (Updated March 30, 2020, 9:51 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23076
> https://issues.apache.org/jira/browse/HIVE-23076
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add batching for openTxn request for better performance
> 
> 
> Diffs
> -
> 
>   
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e
> 
> 
> Diff: https://reviews.apache.org/r/72283/diff/1/
> 
> 

[DISCUSS] Drop Oracle 11g support in favor of 12c

2020-03-30 Thread Peter Vary
Hi Team,

With several team members we are working on optimizing ACID transaction related 
metastore calls.
Specifically aiming to have non-blocking openTxns calls (so 2 parallel openTxns 
can run without blocking each other) which could increase the throughput of 
Hive a lot.

Currently, based on this document 
(https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration)
we support:

MySQL          5.6.17   mysql
Postgres       9.1.13   postgres
Oracle         11g      oracle    hive.metastore.orm.retrieveMapNullsAsEmptyStrings
MS SQL Server  2008 R2  mssql

All of the databases above support generating IDENTITY/AUTO INCREMENT values 
one way or another, except Oracle 11g. We could use different SQL queries for 
Oracle 11g (like SEQUENCE + NEXTVAL), but that would mean generating different 
queries just for Oracle backward compatibility.
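
To make the trade-off concrete, here is a rough sketch of the kind of branching that keeping Oracle 11g would require. The table, column and sequence names (EXAMPLE_TXNS, TXN_ID, TXN_STATE, EXAMPLE_TXN_SEQ) and the helper itself are hypothetical and are not taken from the metastore schema or code; only the general shape of the two statements is the point.

```java
public final class TxnIdInsertSketch {
    private TxnIdInsertSketch() {}

    // Hypothetical helper: picks the INSERT text depending on whether the backing
    // database can generate the id itself (IDENTITY / AUTO INCREMENT column).
    public static String insertSql(boolean supportsIdentity) {
        if (supportsIdentity) {
            // The id column is omitted; the database fills it in.
            return "INSERT INTO EXAMPLE_TXNS (TXN_STATE) VALUES (?)";
        }
        // Oracle 11g style: the id has to be pulled from a sequence explicitly.
        return "INSERT INTO EXAMPLE_TXNS (TXN_ID, TXN_STATE) "
            + "VALUES (EXAMPLE_TXN_SEQ.NEXTVAL, ?)";
    }
}
```

Every statement that creates a new id would need a similar second variant, which is the extra complexity in question.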

Oracle 11g Extended Support is due to expire on 31 December 2020, see: 
https://www.oracle.com/webfolder/community/oracle_database/3905940.html 

I do not see too much overlap between Oracle 11g and Hive 4.0.0.

Since Oracle 12c supports IDENTITY columns as well, I propose that for Hive 
4.0.0 we support only Oracle 12c instead of adding quickly outdated complexity.

Any thoughts or ideas are welcome.

Thanks,
Peter







[jira] [Created] (HIVE-23105) HiveServer2 regression breaks getUpdateCount / getMoreResult API contract

2020-03-30 Thread Arnaud Linz (Jira)
Arnaud Linz created HIVE-23105:
--

 Summary: HiveServer2 regression breaks getUpdateCount / 
getMoreResult API contract
 Key: HIVE-23105
 URL: https://issues.apache.org/jira/browse/HIVE-23105
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 2.1.1
Reporter: Arnaud Linz


Migrating from CDH 5.16 (Hive 1.1.0+) to CDH 6.3 (Hive 2.1.1+) introduced a 
regression in the JDBC driver.

It was detected in an "agnostic" JDBC handling service which works for several 
DBMS including Teradata, Impala, and the former Hive driver.

 

 

The Statement JDBC method:
{code:java}
/**
 * Retrieves the current result as an update count;
 * if the result is a ResultSet object or there are no more results, -1
 * is returned. This method should be called only once per result.
 *
 * @return the current result as an update count; -1 if the current result is a
 * ResultSet object or there are no more results
 * @exception SQLException if a database access error occurs or
 * this method is called on a closed Statement
 * @see #execute
 */
int getUpdateCount() throws SQLException; {code}
does not return -1 when it should; instead it throws:

 
{code:java}
Caused by: java.sql.SQLException: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetOperationStatusReq(operationHandle:null)
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:395)
    at org.apache.hive.jdbc.HiveStatement.getUpdateCount(HiveStatement.java:688)
    ... 30 more
Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetOperationStatusReq(operationHandle:null)
    at org.apache.hive.service.rpc.thrift.TGetOperationStatusReq.validate(TGetOperationStatusReq.java:294)
    at org.apache.hive.service.rpc.thrift.TCLIService$GetOperationStatus_args.validate(TCLIService.java:12587)
    at org.apache.hive.service.rpc.thrift.TCLIService$GetOperationStatus_args$GetOperationStatus_argsStandardScheme.write(TCLIService.java:12644)
    at org.apache.hive.service.rpc.thrift.TCLIService$GetOperationStatus_args$GetOperationStatus_argsStandardScheme.write(TCLIService.java:12613)
    at org.apache.hive.service.rpc.thrift.TCLIService$GetOperationStatus_args.write(TCLIService.java:12564)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.send_GetOperationStatus(TCLIService.java:461)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:453)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1415)
    at com.sun.proxy.$Proxy20.GetOperationStatus(Unknown Source)
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:364)
    ... 33 more {code}
 

And method:
{code:java}
    /** 
 * Moves to this Statement object's next result, returns 
 * true if it is a ResultSet object, and 
 * implicitly closes any current ResultSet 
 * object(s) obtained with the method getResultSet. 
 * 
 * There are no more results when the following is true: 
 * {@code 
 * // stmt is a Statement object 
 * ((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1)) 
 * } 
 * 
 * @return true if the next result is a ResultSet 
 * object; false if it is an update count or there are 
 * no more results 
 * @exception SQLException if a database access error occurs or 
 * this method is called on a closed Statement 
 * @see #execute 
 */ 
boolean getMoreResults() throws SQLException; 
{code}
It always returns true even when the statement does not produce a result set, 
whereas false is expected (especially since evaluating the javadoc's 
((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1)) throws an 
exception). 
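
For context, a driver-agnostic caller typically walks the results along the lines of the sketch below, which relies only on the java.sql.Statement contract quoted above. The class name, method name and the "resultset" / "update:<n>" labels are made up for this illustration and are not taken from the reporter's service.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public final class JdbcResultWalker {
    private JdbcResultWalker() {}

    // Executes sql and records each result as "resultset" or "update:<n>".
    // Stops once the current result is neither a ResultSet nor an update count,
    // i.e. getMoreResults() returned false and getUpdateCount() returns -1.
    public static List<String> describeResults(Statement stmt, String sql)
            throws SQLException {
        List<String> seen = new ArrayList<>();
        boolean isResultSet = stmt.execute(sql);
        while (true) {
            if (isResultSet) {
                try (ResultSet rs = stmt.getResultSet()) {
                    seen.add("resultset"); // a real caller would read rows here
                }
            } else {
                int count = stmt.getUpdateCount();
                if (count == -1) {
                    break; // no more results
                }
                seen.add("update:" + count);
            }
            isResultSet = stmt.getMoreResults();
        }
        return seen;
    }
}
```

With the behaviour reported here, such a loop cannot terminate normally against the affected driver: getUpdateCount() throws instead of returning -1.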

 

 

 

 





Re: Review Request 72283: HIVE-23076 Add batching for openTxn

2020-03-30 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/#review220103
---



LGTM, just minor comments


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 603 (patched)


Could we rename it to nextTxnId/firstTxnId? Not clear what first.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 615 (original), 617 (patched)


I wasn't sure, maybe you know, can we modify meta-prop in runtime? if not 
maybe we should move batchSize to constructor?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 634 (patched)


why (i-1) ? txnId starts from 1, right?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 641 (original), 648 (patched)


Would be great if we could extract query construction to query constants



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 666 (original), 673 (patched)


Could we move queryStr to constants?


- Denys Kuzmenko


On March 30, 2020, 9:51 a.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72283/
> ---
> 
> (Updated March 30, 2020, 9:51 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23076
> https://issues.apache.org/jira/browse/HIVE-23076
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add batching for openTxn request for better performance
> 
> 
> Diffs
> -
> 
>   
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e
> 
> 
> Diff: https://reviews.apache.org/r/72283/diff/1/
> 
> 

[jira] [Created] (HIVE-23104) Minimize critical paths of TxnHandler::commitTxn and abortTxn

2020-03-30 Thread Marton Bod (Jira)
Marton Bod created HIVE-23104:
-

 Summary: Minimize critical paths of TxnHandler::commitTxn and 
abortTxn
 Key: HIVE-23104
 URL: https://issues.apache.org/jira/browse/HIVE-23104
 Project: Hive
  Issue Type: Improvement
Reporter: Marton Bod


Investigate whether any code sections in TxnHandler::commitTxn and abortTxn can 
be lifted out/executed async in order to reduce the overall execution time of 
these methods.





[jira] [Created] (HIVE-23103) Oracle statement batching

2020-03-30 Thread Peter Vary (Jira)
Peter Vary created HIVE-23103:
-

 Summary: Oracle statement batching
 Key: HIVE-23103
 URL: https://issues.apache.org/jira/browse/HIVE-23103
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: Peter Vary
Assignee: Peter Vary


Examine how to really get better performance for Oracle statement batches.

The [Oracle JDBC doc|https://docs.oracle.com/cd/E11882_01/java.112/e16548/oraperf.htm#JJDBC28752] 
says:

{quote}The Oracle implementation of standard update batching does not implement 
true batching for generic statements and callable statements. Even though 
Oracle JDBC supports the use of standard batching for {{Statement}} and 
{{CallableStatement}} objects, you are unlikely to see performance improvement.
{quote}

I would look for connection properties to set, so it is handled anyway, or if 
not, then use:
{code}
begin
  query1;
  query2;
  query3;
end;
{code}
so that we have only a single round trip to the db.
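
A minimal sketch of the second option, assuming a hypothetical helper that wraps already-formed statements into one anonymous block (the class and method names are illustrative; bind variables, statements that are not valid inside PL/SQL, and error handling are deliberately left out):

```java
import java.util.List;

public final class PlsqlBlockSketch {
    private PlsqlBlockSketch() {}

    // Hypothetical helper: joins the statements into "begin ... end;" so the whole
    // batch can be sent to the database as a single statement.
    public static String toBlock(List<String> statements) {
        StringBuilder sb = new StringBuilder("begin\n");
        int added = 0;
        for (String statement : statements) {
            String s = statement.trim();
            if (s.endsWith(";")) {
                s = s.substring(0, s.length() - 1).trim(); // avoid a doubled ';'
            }
            if (s.isEmpty()) {
                continue;
            }
            sb.append("  ").append(s).append(";\n");
            added++;
        }
        if (added == 0) {
            // A block with no statements would not be valid, so refuse to build one.
            throw new IllegalArgumentException("At least one statement is required");
        }
        return sb.append("end;").toString();
    }
}
```

The resulting string could then be passed to a single Statement.execute call, which is where the single round trip would come from.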





Review Request 72283: HIVE-23076 Add batching for openTxn

2020-03-30 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/
---

Review request for hive, Denys Kuzmenko and Marton Bod.


Bugs: HIVE-23076
https://issues.apache.org/jira/browse/HIVE-23076


Repository: hive-git


Description
---

Add batching for openTxn request for better performance


Diffs
-

  
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e


Diff: https://reviews.apache.org/r/72283/diff/1/


Testing
---

Tested it locally against all of the supported RDBMS types:
mysql no patch
Operation  Mean Med  Min  Max  Err%
openTxn0-1 2.0941.8211.4624.78631.06   
openTxn0-2 2.4192.1611.7205.86732.43   
openTxn0-102.5782.2891.9737.20428.74   
openTxn0-100   6.9486.8355.25411.0315.91   
openTxn0-1000  51.3150.4933.5693.1016.27   
openTxn115k-1  26.9423.6922.24169.656.13   
openTxn115k-2  25.2623.8122.4250.6816.90   
openTxn115k-10 26.2024.2923.0160.7321.94   
openTxn125k-10029.1428.1825.8143.6311.16 

mysql patch
Operation  Mean Med  Min  Max  Err%
openTxn0-1 2.2641.9641.6526.02335.59   
openTxn0-2 2.5382.2891.9326.01329.41   
openTxn0-102.9822.6412.1778.82932.54   
openTxn0-100   6.7756.3865.01221.7327.10   
openTxn0-1000  42.9642.9330.8961.9214.46   
openTxn115k-1  24.2923.2722.4073.6221.64   
openTxn115k-2  24.0523.5822.4628.605.651   
openTxn115k-10 24.4824.0222.9429.976.075   
openTxn125k-10027.9127.5125.7842.506.905   

postgres no patch
Operation  Mean Med  Min  Max  Err%
openTxn0-1 3.7342.8832.50611.4655.16   
openTxn0-2 3.8343.1112.63315.5053.22   
openTxn0-105.0054.1783.44916.8047.56   
openTxn0-100   9.8237.7556.83379.3479.96   
openTxn0-1000  75.5172.0358.62207.923.98   
openTxn115k-1  21.7919.4518.4366.7629.10   
openTxn115k-2  21.9120.1418.8851.4220.92   
openTxn115k-10 22.4320.8519.3845.1818.58   
openTxn125k-10027.7125.3623.1954.9921.46   

postgres patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       1.688  1.423  1.130  7.814  55.91
openTxn0-2       1.982  1.662  1.306  7.786  47.13
openTxn0-10      2.680  2.564  1.761  5.069  26.93
openTxn0-100     8.340  7.535  5.351  30.00  37.97
openTxn0-1000    41.73  37.55  24.38  107.8  33.87
openTxn115k-1    12.24  11.65  10.21  26.23  19.75
openTxn115k-2    13.07  11.86  10.76  68.95  47.37
openTxn115k-10   13.03  12.23  11.06  54.88  34.23
openTxn125k-100  15.62  14.03  12.46  102.9  58.21

Oracle no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       14.85  13.91  11.50  27.26  19.49
openTxn0-2       17.89  17.13  14.56  27.00  13.53
openTxn0-10      23.12  21.38  17.91  67.37  25.46
openTxn0-100     114.1  99.03  82.62  214.0  35.61
openTxn0-1000    4123   3952   3593   5790   15.96
openTxn115k-1    16.74  16.88  14.01  21.75  14.52
openTxn115k-2    20.28  18.34  16.51  30.34  23.09
openTxn115k-10   22.42  21.07  19.87  31.39  15.74
openTxn125k-100  88.13  87.88  78.95  100.4  7.990

Oracle patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       15.87  14.07  12.21  80.44  48.32
openTxn0-2       17.06  16.14  12.80  33.52  19.47
openTxn0-10      16.89  15.62  12.34  37.92  25.18
openTxn0-100     18.99  20.03  15.69  21.46  10.72

[jira] [Created] (HIVE-23102) waitForCompactionToFinish can potentially wait for too long

2020-03-30 Thread Jira
Zoltán Borók-Nagy created HIVE-23102:


 Summary: waitForCompactionToFinish can potentially wait for too long
 Key: HIVE-23102
 URL: https://issues.apache.org/jira/browse/HIVE-23102
 Project: Hive
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


AlterTableCompactOperation.waitForCompactionToFinish() has the following code 
fragment:

 
{noformat}
 //double wait time until 5min
 waitTimeMs = waitTimeMs*2;
 waitTimeMs = Math.max(waitTimeMs, waitTimeOut);
{noformat}
Based on the comment ("double wait time until 5min") I think it should use 
Math.min() instead of Math.max().

It also affects the runtime of Impala tests that use Hive compaction, because 
they hang for at least 5 mins each time we compact a table.
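
To illustrate the difference, here is a minimal standalone sketch. It is not the actual AlterTableCompactOperation code: the class name, the method names, the 1-second starting wait and the cap constant are assumptions made for illustration. Only the doubling step and the Math.max/Math.min choice come from the fragment quoted above.

```java
class BackoffSketch {
    // Assumed cap, taken from the "until 5min" comment in the quoted fragment.
    static final long WAIT_TIME_OUT_MS = 5 * 60 * 1000L;

    // Variant from the report: Math.max returns at least the cap on every call.
    static long nextWaitWithMax(long waitTimeMs) {
        return Math.max(waitTimeMs * 2, WAIT_TIME_OUT_MS);
    }

    // Suggested variant: Math.min doubles the wait but never exceeds the cap.
    static long nextWaitWithMin(long waitTimeMs) {
        return Math.min(waitTimeMs * 2, WAIT_TIME_OUT_MS);
    }

    public static void main(String[] args) {
        long withMax = 1000L; // hypothetical starting wait
        long withMin = 1000L;
        for (int i = 0; i < 4; i++) {
            withMax = nextWaitWithMax(withMax);
            withMin = nextWaitWithMin(withMin);
            System.out.println("max: " + withMax + " ms, min: " + withMin + " ms");
        }
    }
}
```

With Math.max the very first doubling already yields the 5-minute value and later calls keep growing past it, which would be consistent with the hangs described above. With Math.min the wait grows gradually and stops at the cap.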





Re: Review Request 72276: HIVE-23084: Implement kill query in multiple HS2 environment

2020-03-30 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72276/#review220102
---



Looking pretty good overall, I just have a few questions/comments.


itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java
Lines 415 (patched)


Might be worth extracting the expected string as a constant?



itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithServiceDiscovery.java
Lines 94 (patched)


Can be private if not used elsewhere



service/src/java/org/apache/hive/service/server/KillQueryImpl.java
Line 150 (original), 176 (patched)


Shouldn't we return if there are no ops to kill? I think a subsequent 
killOperations() call here might throw an NPE.



service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java
Lines 106 (patched)


nit: typo: namespace



service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java
Lines 243 (patched)


I'm fine with not exposing these in HiveConf (that's already a monster) but 
we could at least extract these as constants in this class.



service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java
Lines 449 (patched)


Shouldn't we clear the progress flag here?



service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java
Lines 453 (patched)


this is a no-op here


- Adam Szita


On March 27, 2020, 10:08 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72276/
> ---
> 
> (Updated March 27, 2020, 10:08 a.m.)
> 
> 
> Review request for hive and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> KILL  command was implemented in:
> 
> https://issues.apache.org/jira/browse/HIVE-17483
> https://issues.apache.org/jira/browse/HIVE-20549
> But it does not work in an environment where service discovery is enabled 
> and more than one HS2 instance is running (except when the kill query is 
> sent manually to every HS2 instance).
> 
> Solution:
> 
> If an HS2 instance can't kill a query locally, it should post a kill query 
> request to ZooKeeper.
> Every HS2 instance should watch ZooKeeper for kill query requests and, if 
> the query is running on that instance, kill it.
> Authorization of kill query should work the same.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 34df01e60e
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java 3973ec9270
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java 68a515ccbe
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithServiceDiscovery.java PRE-CREATION
>   itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestMiniHS2StateWithNoZookeeper.java 99e681e5b2
>   itests/hive-unit/src/test/java/org/apache/hive/service/server/TestKillQueryZookeeperManager.java PRE-CREATION
>   itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 1b60a51ebd
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 8becef1cd3
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 9e497545b5
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 277519cba5
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 181ea5d6d5
>   service/src/java/org/apache/hive/service/server/KillQueryImpl.java 883e32bd2e
>   service/src/java/org/apache/hive/service/server/KillQueryZookeeperManager.java PRE-CREATION
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/common/ZooKeeperHiveHelper.java 71d8651712
> 
> 
> Diff: https://reviews.apache.org/r/72276/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Review Request 72282: HIVE-23101

2020-03-30 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72282/
---

Review request for hive, Jesús Camacho Rodríguez and Zoltan Haindrich.


Bugs: HIVE-23101
https://issues.apache.org/jira/browse/HIVE-23101


Repository: hive-git


Description
---

Fix topnkey_grouping_sets


Diffs
-

  itests/src/test/resources/testconfiguration.properties 3510016c07 
  ql/src/test/queries/clientpositive/topnkey_grouping_sets.q e8c5401ee5 
  ql/src/test/results/clientpositive/llap/topnkey_grouping_sets.q.out 41a8c3a5b2
  ql/src/test/results/clientpositive/topnkey_grouping_sets.q.out 27998efafc 


Diff: https://reviews.apache.org/r/72282/diff/1/


Testing
---

mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestMiniLlapLocalCliDriver -Dqfile=topnkey_grouping_sets.q -pl itests/qtest -Pitests


Thanks,

Krisztian Kasa



Re: [VOTE] Should we release Hive Storage API 2.7.2-rc0?

2020-03-30 Thread Zoltan Haindrich

+1

* verified checksum/etc
* built and ran tests locally
* built orc/master against it
* there doesn't seem to be a staged nexus repo for this, but it seems like earlier releases didn't have one either; meanwhile 
https://repo.maven.apache.org/maven2/org/apache/hive/hive-storage-api/2.7.1/ seems to have them; I assume this release will also be uploaded there along with sources/etc



On 3/24/20 9:33 PM, Jesus Camacho Rodriguez wrote:

All,

I'd like to make a storage-api release with HIVE-22959 in it.

Should we release the following artifacts as Hive Storage API 2.7.2?

tar: http://home.apache.org/~jcamacho/hive-storage-2.7.2/
tag: https://github.com/apache/hive/releases/tag/storage-release-2.7.2-rc0
jiras: https://issues.apache.org/jira/projects/HIVE/versions/12347828

Thanks!

-Jesús



[jira] [Created] (HIVE-23101) Fix topnkey_grouping_sets

2020-03-30 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-23101:
-

 Summary: Fix topnkey_grouping_sets
 Key: HIVE-23101
 URL: https://issues.apache.org/jira/browse/HIVE-23101
 Project: Hive
  Issue Type: Sub-task
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa
 Fix For: 4.0.0


Test *topnkey_grouping_sets* fails intermittently.

Queries which project 2 columns but order by only one of them can have more 
than one good result set:
{code}
CREATE TABLE t_test_grouping_sets(
  a int,
  b int,
  c int
);

INSERT INTO t_test_grouping_sets VALUES
(NULL, NULL, NULL),
(5, 2, 3),
(10, 11, 12),
(NULL, NULL, NULL),
(NULL, NULL, NULL),
(6, 2, 1),
(7, 8, 4), (7, 8, 4), (7, 8, 4),
(5, 1, 2), (5, 1, 2), (5, 1, 2),
(NULL, NULL, NULL);

SELECT a, b FROM t_test_grouping_sets GROUP BY GROUPING SETS ((a, b), (a), (b), 
()) ORDER BY a LIMIT 10;
{code}
{code}
5   NULL
5   2
5   1
6   2
6   NULL
7   8
7   NULL
10  NULL
10  11
NULL  1
{code}
{code}
5   NULL
5   2
5   1
6   2
6   NULL
7   8
7   NULL
10  NULL
10  11
NULL  NULL
{code}
Since we don't order by *b* both result sets are valid.
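
The effect can be reproduced outside Hive with a small sketch. This is only an analogy: the class and helper names are hypothetical, and a stable in-memory sort over two arrival orders stands in for whatever order rows reach the ORDER BY in a real query. NULLs sort last here to match the sample output above. The thread does not say how the patch makes the test deterministic; ordering by both columns is shown only as one possible way to remove the ambiguity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TieBreakSketch {
    // ORDER BY a (NULLs last): rows that tie on a keep whatever order they arrived in.
    static final Comparator<Integer[]> BY_A = Comparator.comparing(
        (Integer[] r) -> r[0], Comparator.nullsLast(Comparator.<Integer>naturalOrder()));

    // ORDER BY a, b: the second key resolves ties on a.
    static final Comparator<Integer[]> BY_A_THEN_B = BY_A.thenComparing(
        (Integer[] r) -> r[1], Comparator.nullsLast(Comparator.<Integer>naturalOrder()));

    static Integer[] row(Integer a, Integer b) {
        return new Integer[] {a, b};
    }

    // Sort a copy of the rows and keep the first n, like ORDER BY ... LIMIT n.
    static List<String> topN(List<Integer[]> rows, Comparator<Integer[]> order, int n) {
        List<Integer[]> sorted = new ArrayList<>(rows);
        sorted.sort(order); // List.sort is stable, so ties keep their input order
        List<String> out = new ArrayList<>();
        for (Integer[] r : sorted.subList(0, Math.min(n, sorted.size()))) {
            out.add(r[0] + "\t" + r[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Same rows, two different arrival orders for the rows where a is NULL.
        List<Integer[]> arrival1 = Arrays.asList(row(10, 11), row(null, 1), row(null, null));
        List<Integer[]> arrival2 = Arrays.asList(row(10, 11), row(null, null), row(null, 1));
        System.out.println(topN(arrival1, BY_A, 2));
        System.out.println(topN(arrival2, BY_A, 2));
        System.out.println(topN(arrival1, BY_A_THEN_B, 2));
        System.out.println(topN(arrival2, BY_A_THEN_B, 2));
    }
}
```

Sorting by a alone lets the row at the LIMIT cutoff depend on arrival order, while sorting by a and then b gives the same rows for both inputs.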






Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-03-30 Thread Karen Coppage via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/
---

Review request for hive and Laszlo Pinter.


Bugs: HIVE-22971
https://issues.apache.org/jira/browse/HIVE-22971


Repository: hive-git


Description
---

File rename is expensive for object stores, so MM (insert-only) compaction 
should skip that step when committing and write directly to base_x_cZ or 
delta_x_y_cZ.

This also fixes the issue that for MM QB compaction the temp tables were stored 
under the table directory, and these temp dirs were never cleaned up.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 34df01e60e
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 95fa6641f2
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java 9659a3f048
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 543ec0b991
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java f47c23a6de
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 1bf0beea40
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java 114b6f7a74
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java 383891bfad
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 7f3ccfa04e
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java 6542eef58a


Diff: https://reviews.apache.org/r/72281/diff/1/


Testing
---


Thanks,

Karen Coppage