[
https://issues.apache.org/jira/browse/SPARK-44833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun closed SPARK-44833.
---------------------------------
> Spark Connect reattach when initial ExecutePlan didn't reach server doing too
> eager Reattach
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-44833
> URL: https://issues.apache.org/jira/browse/SPARK-44833
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Juliusz Sompolski
> Assignee: Juliusz Sompolski
> Priority: Major
> Fix For: 3.5.1, 4.0.0
>
>
> In
> {code:java}
> case ex: StatusRuntimeException
>     if Option(StatusProto.fromThrowable(ex))
>       .exists(_.getMessage.contains("INVALID_HANDLE.OPERATION_NOT_FOUND")) =>
>   if (lastReturnedResponseId.isDefined) {
>     throw new IllegalStateException(
>       "OPERATION_NOT_FOUND on the server but responses were already received from it.",
>       ex)
>   }
>   // Try a new ExecutePlan, and throw upstream for retry.
> ->  iter = rawBlockingStub.executePlan(initialRequest)
> ->  throw new GrpcRetryHandler.RetryException
> {code}
> we call executePlan and then throw RetryException so that the exception is
> handled by the retry logic upstream.
> Control then reaches
> {code:java}
> retry {
>   if (firstTry) {
>     // on first try, we use the existing iter.
>     firstTry = false
>   } else {
>     // on retry, the iter is borked, so we need a new one
> ->  iter = rawBlockingStub.reattachExecute(createReattachExecuteRequest())
>   }
> {code}
> and, because firstTry is already false, it immediately calls reattachExecute,
> even though a new executePlan iterator was just created.
> This causes no failure: the reattach succeeds and attaches to the query, and
> the original executePlan gets detached. But the redundant reattach could be
> avoided.
> The same issue is also present in the Python client's reattach.py.
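
The control flow described above can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the actual Spark Connect client code and not necessarily the fix that was merged for this issue: FakeStub, run_query, the exception classes and the skip_reattach_after_new_execute flag are all hypothetical names, and the first ExecutePlan is simply assumed to be lost before it reaches the server.

```python
class Unavailable(Exception):
    """Stand-in for a transient gRPC error (request never reached the server)."""

class OperationNotFound(Exception):
    """Stand-in for INVALID_HANDLE.OPERATION_NOT_FOUND."""

class RetryException(Exception):
    """Stand-in for GrpcRetryHandler.RetryException."""


class FakeStub:
    """Hypothetical stub: records RPC names instead of contacting a server."""

    def __init__(self):
        self.calls = []
        self.operation_known = False  # has the server seen an ExecutePlan yet?

    def execute_plan(self):
        self.calls.append("ExecutePlan")
        if len(self.calls) == 1:
            return self._failing(Unavailable())  # assumption: first request is lost
        self.operation_known = True
        return iter(["r1", "r2"])

    def reattach_execute(self):
        self.calls.append("ReattachExecute")
        if not self.operation_known:
            return self._failing(OperationNotFound())
        return iter(["r1", "r2"])

    @staticmethod
    def _failing(exc):
        def gen():
            raise exc
            yield  # unreachable; only makes this function a generator
        return gen()


def run_query(stub, skip_reattach_after_new_execute):
    it = stub.execute_plan()
    first_try = True
    fresh_iter = False  # hypothetical flag: iter was just replaced by ExecutePlan
    results = []
    while True:  # plays the role of the retry { ... } block
        try:
            if first_try:
                first_try = False  # on first try, use the existing iterator
            elif fresh_iter and skip_reattach_after_new_execute:
                fresh_iter = False  # iterator is brand new, nothing to reattach to
            else:
                it = stub.reattach_execute()
            try:
                for response in it:
                    results.append(response)
                return results
            except OperationNotFound:
                if results:
                    raise RuntimeError("responses were already received")
                # Try a new ExecutePlan, and throw upstream for retry.
                it = stub.execute_plan()
                fresh_iter = True
                raise RetryException()
        except (Unavailable, RetryException):
            continue


eager = FakeStub()
run_query(eager, skip_reattach_after_new_execute=False)
print(eager.calls)

lazy = FakeStub()
run_query(lazy, skip_reattach_after_new_execute=True)
print(lazy.calls)
```

With the flag off, the recorded calls end with a ReattachExecute issued right after the second ExecutePlan, which is the eager reattach this issue describes. With the flag on, the second ExecutePlan iterator is consumed directly and that extra call disappears. Both runs return the same responses.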