[jira] [Updated] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-28595:
--
Component/s: (was: Client)

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
> Issue Type: Bug
> Components: regionserver, Scanners
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> This was discovered in Apache Impala, using an HBase client and server built 
> from an HBase 2.2 based branch. It is not yet clear whether other branches 
> are also affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but at the same time the client gets an RPC "connection 
> closed" error and never processes the exception sent by the server. The 
> client then assumes it hit a network error and retries the RPC instead of 
> opening a new scanner. When the retry reaches the server, the server returns 
> an empty ScanResponse instead of an error, so the client closes its scanner 
> without reporting any error and the caller is left with partial results.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.
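
The sequence described above can be reproduced in a small toy model. The sketch below is not HBase code: `ToyRegionServer`, `ToyScanner`, `rpc`, `client_scan`, and both exception classes are hypothetical names invented for illustration, and the single-threaded "connection drop" is simulated by swapping the server's exception for a transport error. It only assumes the behaviour stated in the description: the failed call closes the scanner without deleting it, and a later call on that scanner returns an empty, final response.

```python
from dataclasses import dataclass, field


class ScannerResetError(Exception):
    """Server-side failure the client handles by opening a new scanner."""


class ConnectionClosedError(Exception):
    """Transport failure; the client cannot tell what the server did."""


@dataclass
class ToyScanner:
    rows: list
    pos: int = 0
    closed: bool = False


@dataclass
class ToyRegionServer:
    rows: list
    fail_at: int                      # row index at which the scan throws once
    scanners: dict = field(default_factory=dict)
    next_id: int = 0
    failed_once: bool = False

    def open_scanner(self, start):
        self.next_id += 1
        self.scanners[self.next_id] = ToyScanner(self.rows, pos=start)
        return self.next_id

    def scan(self, scanner_id, batch=2):
        s = self.scanners[scanner_id]
        if s.closed:
            # Closed but not deleted: the retry sees an empty, final response.
            return [], False
        if not self.failed_once and s.pos >= self.fail_at:
            self.failed_once = True
            s.closed = True
            raise ScannerResetError("scan failed on the server")
        out = s.rows[s.pos:s.pos + batch]
        s.pos += len(out)
        return out, s.pos < len(s.rows)


def rpc(server, scanner_id, drop_error):
    """One scan RPC; optionally lose the server's exception in transit."""
    try:
        return server.scan(scanner_id)
    except ScannerResetError:
        if drop_error:
            raise ConnectionClosedError("connection closed") from None
        raise


def client_scan(server, drop_error):
    results, scanner_id, more = [], server.open_scanner(0), True
    while more:
        try:
            batch, more = rpc(server, scanner_id, drop_error)
        except ConnectionClosedError:
            # Looks like a network error: retry the same RPC on the same scanner.
            batch, more = rpc(server, scanner_id, drop_error)
        except ScannerResetError:
            # Exception arrived intact: open a new scanner after the last row seen.
            scanner_id = server.open_scanner(len(results))
            continue
        results.extend(batch)
    return results
```

With six rows and `fail_at=2`, `client_scan(server, drop_error=False)` recovers through a new scanner and returns every row, while `drop_error=True` returns only the rows read before the failure and raises nothing, which is the silent partial result this issue reports.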



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HBASE-28595:
---
Labels: pull-request-available  (was: )

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
> Issue Type: Bug
> Components: Client, regionserver, Scanners
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Critical
> Labels: pull-request-available





[jira] [Updated] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28595:
-
Description: 
This was discovered in Apache Impala, using an HBase client and server built 
from an HBase 2.2 based branch. It is not yet clear whether other branches are 
also affected.

The issue happens if the server side of the scan throws an exception and closes 
the scanner, but at the same time the client gets an RPC "connection closed" 
error and never processes the exception sent by the server. The client then 
assumes it hit a network error and retries the RPC instead of opening a new 
scanner. When the retry reaches the server, the server returns an empty 
ScanResponse instead of an error, so the client closes its scanner without 
reporting any error and the caller is left with partial results.

A few pointers to critical parts:
region server:
1st call throws exception leading to closing (but not deleting) scanner:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
2nd call (retry of 1st) returns empty results:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]

client:
some exceptions are handled as non-retriable at RPC level and are only handled 
through opening a new scanner:
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
[https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]

This mechanism in the client only works if it gets the exception from the 
server. If there are connection issues during the RPC then the client won't 
really know the state of the server.

  was:
This was discovered in Apache Impala using HBase 2.2 based branch hbase client 
and server. It is not clear yet whether other branches are also affected.

The issue happens if the server side of the scan throws an exception and closes 
the scanner, but the client doesn't get the exact exception and treats it as a 
network error, which leads to retrying the RPC instead of opening a new 
scanner. In this case the server returns an empty ScanResponse instead of an 
error when the RPC is retried, leading to closing the scanner on the client 
side without returning any error.

A few pointers to critical parts:
region server:
1st call throws exception leading to closing (but not deleting) scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539
2nd call (retry of 1st) returns empty results:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403

client:
some exceptions are handled as non-retriable at RPC level and are only handled 
through opening a new scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367

This mechanism in the client only works if it gets the exception from the 
server. If there are connection issues during the RPC then the client won't 
really know the state of the server.


> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
> Issue Type: Bug
> Components: Client, regionserver, Scanners
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Critical

[jira] [Updated] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated HBASE-28595:

Summary: Losing exception from scan RPC can lead to partial results  (was: 
Loosing exception from scan RPC can lead to partial results)

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
> Issue Type: Bug
> Components: Client, regionserver, Scanners
> Reporter: Csaba Ringhofer
> Priority: Critical
>
> This was discovered in Apache Impala using HBase 2.2 based branch hbase 
> client and server. It is not clear yet whether other branches are also 
> affected.
> The issue happens if the server side of the scan throws an exception and 
> closes the scanner, but the client doesn't get the exact exception and 
> treats it as a network error, which leads to retrying the RPC instead of 
> opening a new scanner. In this case the server returns an empty ScanResponse 
> instead of an error when the RPC is retried, leading to closing the scanner 
> on the client side without returning any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539
> 2nd call (retry of 1st) returns empty results:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214
> https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.


