[
https://issues.apache.org/jira/browse/HIVE-24545?focusedWorklogId=687338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687338
]
ASF GitHub Bot logged work on HIVE-24545:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Nov/21 14:18
Start Date: 29/Nov/21 14:18
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on a change in pull request #1789:
URL: https://github.com/apache/hive/pull/1789#discussion_r758402670
##########
File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
##########
@@ -587,6 +587,26 @@ public int getUpdateCount() throws SQLException {
return (int) numModifiedRows;
}
+ @Override
+ public long getLargeUpdateCount() throws SQLException {
+ checkConnection("getLargeUpdateCount");
+ /**
+ * Poll on the operation status, till the operation is complete. We want to ensure that since a
+ * client might end up using executeAsync and then call this to check if the query run is
+ * finished.
+ */
+ long numModifiedRows = -1L;
+ TGetOperationStatusResp resp = waitForOperationToComplete();
+ if (resp != null) {
+ numModifiedRows = resp.getNumModifiedRows();
+ }
+ if (numModifiedRows == -1L || numModifiedRows > Long.MAX_VALUE) {
+ LOG.warn("Invalid number of updated rows: {}", numModifiedRows);
+ return -1;
Review comment:
I'm not sure returning `-1` is the best way to signal this
problem, especially in the old `getUpdateCount` method.
##########
File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
##########
@@ -587,6 +587,26 @@ public int getUpdateCount() throws SQLException {
return (int) numModifiedRows;
}
+ @Override
+ public long getLargeUpdateCount() throws SQLException {
+ checkConnection("getLargeUpdateCount");
+ /**
+ * Poll on the operation status, till the operation is complete. We want to ensure that since a
+ * client might end up using executeAsync and then call this to check if the query run is
+ * finished.
+ */
+ long numModifiedRows = -1L;
+ TGetOperationStatusResp resp = waitForOperationToComplete();
+ if (resp != null) {
+ numModifiedRows = resp.getNumModifiedRows();
+ }
+ if (numModifiedRows == -1L || numModifiedRows > Long.MAX_VALUE) {
Review comment:
Is `-2` valid?
We could reuse the newly implemented method in the old `getUpdateCount` to
reduce code duplication.
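
A minimal sketch of the reuse suggested above, not the actual `HiveStatement` code. The class name `UpdateCountSketch` and the `reportedRows` field are hypothetical stand-ins for the statement and the value read from the operation status. Two assumptions are baked in: every negative value (including `-2`) is treated as invalid, and the `int` variant keeps returning `-1` when the count does not fit. Note that a `long` can never exceed `Long.MAX_VALUE`, so that half of the condition in the diff is always false.

```java
// Hypothetical sketch: the int-returning method delegates to the long-returning one.
final class UpdateCountSketch {
  // Stand-in for the row count reported by the server; negative means "unknown".
  private final long reportedRows;

  UpdateCountSketch(long reportedRows) {
    this.reportedRows = reportedRows;
  }

  long getLargeUpdateCount() {
    // Assumption: any negative value is invalid, so -2 is handled like -1.
    return reportedRows < 0 ? -1L : reportedRows;
  }

  int getUpdateCount() {
    long count = getLargeUpdateCount();
    // Assumption: signal "does not fit in int" with -1 instead of a truncated value.
    return count > Integer.MAX_VALUE ? -1 : (int) count;
  }

  public static void main(String[] args) {
    UpdateCountSketch stmt = new UpdateCountSketch(12287871907L);
    System.out.println(stmt.getLargeUpdateCount());
    System.out.println(stmt.getUpdateCount());
  }
}
```

With this shape the polling and validation logic would live in one place, and only the narrowing to `int` stays in the old method.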
Issue Time Tracking
-------------------
Worklog Id: (was: 687338)
Time Spent: 50m (was: 40m)
> jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> --------------------------------------------------------------------
>
> Key: HIVE-24545
> URL: https://issues.apache.org/jira/browse/HIVE-24545
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> I found this while running an INSERT OVERWRITE (IOW) on TPCDS 10TB:
> {code}
> ----------------------------------------------------------------------------------------------
> VERTICES          MODE  STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  llap  SUCCEEDED   4210       4210        0        0       0     362
> Reducer 2 ......  llap  SUCCEEDED    101        101        0        0       0       2
> Reducer 3 ......  llap  SUCCEEDED   1009       1009        0        0       0       1
> ----------------------------------------------------------------------------------------------
> VERTICES: 03/03 [==========================>>] 100% ELAPSED TIME: 12613.62 s
> ----------------------------------------------------------------------------------------------
> 20/12/16 01:37:36 [main]: WARN jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> {code}
> my scenario was:
> {code}
> set hive.exec.max.dynamic.partitions=2000;
> drop table if exists test_sales_2;
> create table test_sales_2 like tpcds_bin_partitioned_acid_orc_10000.store_sales;
> insert overwrite table test_sales_2
>   select * from tpcds_bin_partitioned_acid_orc_10000.store_sales
>   where ss_sold_date_sk > 2451868;
> {code}
> regarding affected row numbers:
> {code}
> select count(*) from tpcds_bin_partitioned_acid_orc_10000.store_sales
>   where ss_sold_date_sk > 2451868;
> +--------------+
> | _c0 |
> +--------------+
> | 12287871907 |
> +--------------+
> {code}
> I guess we should switch to long
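
For illustration, a small Java sketch of why an `int` cannot carry the row count shown above. The class and method names are hypothetical and not part of Hive; only the count 12287871907 comes from the query result in the description.

```java
// Hypothetical demo: narrowing the reported row count from long to int.
final class RowCountOverflowDemo {
  // Row count taken from the count(*) result in the issue description.
  static final long ROWS = 12287871907L;

  // A plain cast keeps only the low 32 bits, so large counts silently wrap.
  static int narrow(long rows) {
    return (int) rows;
  }

  // True when the value can be represented as an int without loss.
  static boolean fitsInInt(long rows) {
    return rows >= Integer.MIN_VALUE && rows <= Integer.MAX_VALUE;
  }

  public static void main(String[] args) {
    System.out.println(fitsInInt(ROWS)); // false
    System.out.println(narrow(ROWS));    // -597029981
  }
}
```

A `long` return type, as in `getLargeUpdateCount`, holds this count without truncation.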