[ 
https://issues.apache.org/jira/browse/HIVE-24545?focusedWorklogId=687338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687338
 ]

ASF GitHub Bot logged work on HIVE-24545:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Nov/21 14:18
            Start Date: 29/Nov/21 14:18
    Worklog Time Spent: 10m 
      Work Description: kgyrtkirk commented on a change in pull request #1789:
URL: https://github.com/apache/hive/pull/1789#discussion_r758402670



##########
File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
##########
@@ -587,6 +587,26 @@ public int getUpdateCount() throws SQLException {
     return (int) numModifiedRows;
   }
 
+  @Override
+  public long getLargeUpdateCount() throws SQLException {
+    checkConnection("getLargeUpdateCount");
+    /**
+     * Poll on the operation status, till the operation is complete. We want to ensure that since a
+     * client might end up using executeAsync and then call this to check if the query run is
+     * finished.
+     */
+    long numModifiedRows = -1L;
+    TGetOperationStatusResp resp = waitForOperationToComplete();
+    if (resp != null) {
+      numModifiedRows = resp.getNumModifiedRows();
+    }
+    if (numModifiedRows == -1L || numModifiedRows > Long.MAX_VALUE) {
+      LOG.warn("Invalid number of updated rows: {}", numModifiedRows);
+      return -1;

Review comment:
       I'm not sure if returning `-1` is the best way to signal this problem... especially in the old `getUpdateCount` method
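
One alternative to returning `-1`, sketched here purely as an illustration and not as what the pull request does, would be to fail loudly when the count cannot be represented as an `int`. The class name `UpdateCountSignal` and the method `toIntOrThrow` are hypothetical and do not exist in Hive.

```java
import java.sql.SQLException;

public class UpdateCountSignal {
    /**
     * Hypothetical helper (not part of HiveStatement): narrows a long row
     * count to int and throws instead of returning a sentinel on overflow.
     */
    public static int toIntOrThrow(long numModifiedRows) throws SQLException {
        if (numModifiedRows > Integer.MAX_VALUE) {
            throw new SQLException("Update count " + numModifiedRows
                + " does not fit in an int; call getLargeUpdateCount() instead");
        }
        return (int) numModifiedRows;
    }
}
```

Whether an exception is preferable to a sentinel depends on how existing callers of `getUpdateCount` treat `-1`, so this is only one option to weigh.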

##########
File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
##########
@@ -587,6 +587,26 @@ public int getUpdateCount() throws SQLException {
     return (int) numModifiedRows;
   }
 
+  @Override
+  public long getLargeUpdateCount() throws SQLException {
+    checkConnection("getLargeUpdateCount");
+    /**
+     * Poll on the operation status, till the operation is complete. We want to ensure that since a
+     * client might end up using executeAsync and then call this to check if the query run is
+     * finished.
+     */
+    long numModifiedRows = -1L;
+    TGetOperationStatusResp resp = waitForOperationToComplete();
+    if (resp != null) {
+      numModifiedRows = resp.getNumModifiedRows();
+    }
+    if (numModifiedRows == -1L || numModifiedRows > Long.MAX_VALUE) {

Review comment:
       Is `-2` valid?
   We could reuse the newly implemented method in the old `getUpdateCount` to reduce code duplication.
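
A minimal sketch of the suggested reuse, assuming the legacy method can simply delegate to the new one. `UpdateCountSketch` and its `reportedRows` field are hypothetical stand-ins for `HiveStatement` and the value read from `TGetOperationStatusResp.getNumModifiedRows()`; mapping every negative value to `-1` is an assumption, not something the pull request or this review settles.

```java
public class UpdateCountSketch {
    // Hypothetical stand-in for the row count reported by the server.
    private final long reportedRows;

    public UpdateCountSketch(long reportedRows) {
        this.reportedRows = reportedRows;
    }

    // Plays the role of the new getLargeUpdateCount(): the single place
    // where the raw count is validated. Assumption: negatives become -1.
    public long getLargeUpdateCount() {
        return reportedRows < 0 ? -1L : reportedRows;
    }

    // Legacy int accessor: delegates to the long variant so the
    // validation logic is not duplicated.
    public int getUpdateCount() {
        long count = getLargeUpdateCount();
        if (count > Integer.MAX_VALUE) {
            // Assumed policy: keep the -1 sentinel for counts that overflow int.
            return -1;
        }
        return (int) count;
    }
}
```

With this shape, only `getLargeUpdateCount` needs to decide which raw values are acceptable, and `getUpdateCount` only adds the `int` range check.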




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 687338)
    Time Spent: 50m  (was: 40m)

> jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> --------------------------------------------------------------------
>
>                 Key: HIVE-24545
>                 URL: https://issues.apache.org/jira/browse/HIVE-24545
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> I found this while running an INSERT OVERWRITE (IOW) on TPCDS 10TB:
> {code}
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........      llap     SUCCEEDED   4210       4210        0        0       0     362
> Reducer 2 ......      llap     SUCCEEDED    101        101        0        0       0       2
> Reducer 3 ......      llap     SUCCEEDED   1009       1009        0        0       0       1
> ----------------------------------------------------------------------------------------------
> VERTICES: 03/03  [==========================>>] 100%  ELAPSED TIME: 12613.62 s
> ----------------------------------------------------------------------------------------------
> 20/12/16 01:37:36 [main]: WARN jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> {code}
> my scenario was:
> {code}
> set hive.exec.max.dynamic.partitions=2000;
> drop table if exists test_sales_2;
> create table test_sales_2 like tpcds_bin_partitioned_acid_orc_10000.store_sales;
> insert overwrite table test_sales_2 select * from tpcds_bin_partitioned_acid_orc_10000.store_sales where ss_sold_date_sk > 2451868;
> {code}
> regarding affected row numbers:
> {code}
> select count(*) from tpcds_bin_partitioned_acid_orc_10000.store_sales where ss_sold_date_sk > 2451868;
> +--------------+
> |     _c0      |
> +--------------+
> | 12287871907  |
> +--------------+
> {code}
> I guess we should switch to long



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
