[
https://issues.apache.org/jira/browse/SPARK-52673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Khakhlyuk updated SPARK-52673:
-----------------------------------
Description:
Spark Connect Client has a set of retry policies that specify which errors
coming from the Server can be retried.
This change adds the capability for the Spark Connect Client to use
server-provided retry information according to the gRPC standard:
[https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto#L91].
The server can include a `RetryInfo` gRPC message containing a `retry_delay`
field in its error response. The Client will now use the `RetryInfo` message to
classify the error as retryable and will use `retry_delay` to calculate how
long to wait before the next attempt. This behavior is in line with the gRPC
standard for client-server communication.
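A minimal sketch of the classification and wait calculation described above. The issue does not specify how `retry_delay` is combined with the client's own backoff, so everything here is an assumption: the `RetryInfo` and `ServerError` classes are hypothetical stand-ins for the real gRPC types, and taking the larger of the two delays with an upper cap is only one plausible choice.

```python
from datetime import timedelta
from typing import Optional


class RetryInfo:
    """Hypothetical stand-in for the google.rpc.RetryInfo message."""

    def __init__(self, retry_delay: timedelta):
        self.retry_delay = retry_delay


class ServerError(Exception):
    """Hypothetical error type; a real client would read details from the gRPC status."""

    def __init__(self, message: str, retry_info: Optional[RetryInfo] = None):
        super().__init__(message)
        self.retry_info = retry_info


def is_retryable(error: ServerError) -> bool:
    # Assumption: the presence of RetryInfo alone marks the error as retryable.
    return error.retry_info is not None


def next_wait(
    error: ServerError,
    policy_backoff: timedelta,
    max_server_delay: timedelta = timedelta(minutes=10),
) -> timedelta:
    # Without server guidance, fall back to the client policy's own backoff.
    if error.retry_info is None:
        return policy_backoff
    # Assumption: cap the server-provided delay so a bad value cannot stall the client.
    server_delay = min(error.retry_info.retry_delay, max_server_delay)
    # Assumption: wait at least as long as the server asked for.
    return max(policy_backoff, server_delay)
```

For example, with a policy backoff of 1 second and a server-provided `retry_delay` of 5 seconds, `next_wait` in this sketch returns 5 seconds.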
The change is needed for two reasons:
1) If the Server is under heavy load or a task takes longer, it can tell the
client to wait longer using the `retry_delay` field.
2) If the Server needs to introduce a new retryable error, it can simply
include `RetryInfo` in the error response. The client will retry the error
automatically, with no changes needed to the client-side retry policies.
The changes should be introduced to both the Python and Scala clients.
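The second reason can be illustrated with a small retry loop. This is a sketch under stated assumptions, not the Spark Connect implementation: `ServerError`, `KNOWN_RETRYABLE_CODES`, and `call_with_retries` are hypothetical names, and the server's delay is modelled as a plain number of seconds rather than a `RetryInfo` message.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Hypothetical set of error codes the client-side policies already retry.
KNOWN_RETRYABLE_CODES = {"UNAVAILABLE"}


class ServerError(Exception):
    """Hypothetical error; retry_delay_s models the server's retry_delay in seconds."""

    def __init__(self, code: str, retry_delay_s: Optional[float] = None):
        super().__init__(code)
        self.code = code
        self.retry_delay_s = retry_delay_s


def call_with_retries(
    rpc: Callable[[], T],
    max_attempts: int = 3,
    default_delay_s: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except ServerError as err:
            has_retry_info = err.retry_delay_s is not None
            # An error is retried if a policy knows it OR the server attached retry info.
            retryable = err.code in KNOWN_RETRYABLE_CODES or has_retry_info
            if not retryable or attempt == max_attempts:
                raise
            # Prefer the server-provided delay when present.
            sleep(err.retry_delay_s if has_retry_info else default_delay_s)
    raise RuntimeError("unreachable: the loop always returns or raises")
```

In this sketch, an error code that appears in no client-side policy is still retried as long as the server attaches a delay, which is the behavior the second reason asks for.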
was:
Spark Connect Client has a set of retry policies that specify which errors
coming from the Server can be retried.
This change adds the capability for the Spark Connect Client to use
server-provided retry information. The server can include `RetryInfo` gRPC
message containing `retry_delay` field in its error response. The Client will
now use `RetryInfo` message to classify the error as retriable and will use
`retry_delay` to calculate the next time to wait. This behavior is in line with
the gRPC standard for client-server communication.
The change is needed for two reasons:
1) If the Server is under heavy load or a task takes more time, it can tell the
client to wait longer using the `retry_delay` field.
2) If the Server needs to introduce a new retryable error, it can simply
include `RetryInfo` in the error message. The error message will be retried
automatically by the client. No changes to the client-side retry policies are
needed to retry the new error.
The changes should be introduced both to the Python and Scala clients.
> [CONNECT][CLIENT] Add grpc RetryInfo handling to Spark Connect retry policies
> -----------------------------------------------------------------------------
>
> Key: SPARK-52673
> URL: https://issues.apache.org/jira/browse/SPARK-52673
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 4.1.0
> Reporter: Alex Khakhlyuk
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 336h
> Remaining Estimate: 336h