uros-b commented on code in PR #56497:
URL: https://github.com/apache/spark/pull/56497#discussion_r3430520073


##########
python/pyspark/sql/connect/client/reattach.py:
##########
@@ -310,6 +339,19 @@ def _call_iter(self, iter_fun: Callable) -> Any:
                 )
                 self._iterator = None
                 raise RetryException() from e
+            elif (
+                e.code() == grpc.StatusCode.PERMISSION_DENIED
+                and not self._permission_denied_retried
+            ):
+                # Treat the first mid-stream PERMISSION_DENIED as a token-expiry signal; reattach
+                # once so the metadata provider can produce a fresh credential.
+                logger.debug(
+                    f"PERMISSION_DENIED on stream for operation {self._operation_id}; "
+                    f"allowing one reattach with refreshed metadata."
+                )
+                self._permission_denied_retried = True
+                self._iterator = None
+                raise RetryException() from e

Review Comment:
   Drive-by note: this PERMISSION_DENIED branch raises RetryException, which
retries with no backoff and never consumes retry budget (retries.py:233,258).
Combined with the latch reset on the OPERATION_NOT_FOUND path (line 330), a
token that can't be refreshed to a valid one spins forever with no forward
progress: ExecutePlan → PERMISSION_DENIED → ReattachExecute →
OPERATION_NOT_FOUND → fresh ExecutePlan (resets latch) → PERMISSION_DENIED → …,
hammering token refresh plus 2 RPCs per lap.
   
   The mock tests only terminate because the scripted ops run out. This is also
a regression: today this case fails fast. I'd suggest a hard per-iterator cap
(a counter that is never reset, rather than a boolean that is reset on every
response/OPERATION_NOT_FOUND) and/or real backoff. cc @HyukjinKwon
@zhengruifeng, who have more context to investigate this further.
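
   To make the suggestion concrete, here is a rough standalone sketch of a
per-iterator cap with exponential backoff. It is not the actual reattach.py or
retries.py code: `PermissionDeniedBudget`, `next_delay`, `run_laps`, and the
default limits are all hypothetical names and values chosen for illustration.
The loop only simulates the ExecutePlan/ReattachExecute cycle described above
for a token that never becomes valid.

```python
from typing import Callable, Optional


class PermissionDeniedBudget:
    """Hypothetical per-iterator cap on PERMISSION_DENIED reattaches."""

    def __init__(
        self, max_attempts: int = 3, base_delay: float = 0.25, max_delay: float = 1.0
    ) -> None:
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._max_delay = max_delay
        # A counter, not a boolean latch: nothing in this class resets it.
        self._attempts = 0

    def next_delay(self) -> Optional[float]:
        """Return seconds to wait before the next reattach, or None if exhausted."""
        if self._attempts >= self._max_attempts:
            return None
        # Exponential backoff, capped at max_delay.
        delay = min(self._max_delay, self._base_delay * (2 ** self._attempts))
        self._attempts += 1
        return delay


def run_laps(
    budget: PermissionDeniedBudget, sleep: Callable[[float], object] = lambda s: None
) -> int:
    """Simulate a token that is never accepted; return laps before giving up."""
    laps = 0
    while True:
        # In this simulation every ExecutePlan is answered with PERMISSION_DENIED.
        delay = budget.next_delay()
        if delay is None:
            # Budget exhausted: a real client would surface the original error here.
            return laps
        sleep(delay)
        # ReattachExecute -> OPERATION_NOT_FOUND -> fresh ExecutePlan.
        # Deliberately no reset of the budget on this path.
        laps += 1
```

   Because the counter lives for the whole iterator and is not touched on the
OPERATION_NOT_FOUND path, the loop is bounded by `max_attempts` regardless of
how the server responds. Jitter is omitted to keep the sketch deterministic; a
real implementation would probably want it.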



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

