cdkrot opened a new pull request, #43314:
URL: https://github.com/apache/spark/pull/43314

   ### What changes were proposed in this pull request?
   
   1. Make the add_artifact request idempotent, i.e. subsequent requests 
succeed if the same content is provided. This makes retrying safer.
   2. Fix the existing error handling mechanism:
   
   Before this change, the error looked like this:
   ```
   >>> spark.addArtifact("tmp.py", pyfile=True)
   >>> spark.addArtifact("tmp.py", pyfile=True) # fails
   2023-10-09 15:55:30,352 82873 DEBUG __iter__ Will retry call after 60014.279746934524 ms sleep (error: <_InactiveRpcError of RPC that terminated with:
           status = StatusCode.UNKNOWN
           details = ""
           debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"", grpc_status:2, created_time:"2023-10-09T15:55:30.351541+02:00"}"
   >)
   (this is also getting retried)
   ```
   
   Now it looks like this:
   
   ```
   >>> spark.addArtifact("abc.sh", file=True)
   >>> spark.addArtifact("abc.sh", file=True) # passes
   >>> # update file's content
   >>> spark.addArtifact("abc.sh", file=True) # now fails
   Traceback (most recent call last):
   [...]
   grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
           status = StatusCode.ALREADY_EXISTS
           details = "Duplicate Artifact: files/abc.sh. Artifacts cannot be overwritten."
           debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Duplicate Artifact: files/abc.sh. Artifacts cannot be overwritten.", grpc_status:6, created_time:"2023-10-10T01:25:38.231317+02:00"}"
   >
   
   ```
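
The behaviour in the session above (same name and same content succeeds, same name with changed content fails) can be illustrated with a toy in-memory store. This is only a sketch of the idea, not the server-side code touched by this PR: `ArtifactStore`, `DuplicateArtifactError`, and the use of a SHA-256 digest to compare content are all assumptions made for the example.

```python
import hashlib


class DuplicateArtifactError(Exception):
    """Hypothetical error: a name was reused with different content."""


class ArtifactStore:
    """Toy in-memory store; not Spark's actual implementation."""

    def __init__(self):
        # Maps artifact name -> SHA-256 hex digest of the stored content.
        self._digests = {}

    def add_artifact(self, name: str, content: bytes) -> bool:
        """Return True if newly stored, False if an identical copy existed."""
        digest = hashlib.sha256(content).hexdigest()
        existing = self._digests.get(name)
        if existing is None:
            self._digests[name] = digest
            return True
        if existing == digest:
            # Same name and same bytes: the repeat is a no-op success.
            return False
        # Same name but different bytes: refuse to overwrite.
        raise DuplicateArtifactError(
            f"Duplicate Artifact: {name}. Artifacts cannot be overwritten."
        )


store = ArtifactStore()
store.add_artifact("files/abc.sh", b"echo hi")  # stored
store.add_artifact("files/abc.sh", b"echo hi")  # identical repeat, succeeds
```

Calling `store.add_artifact("files/abc.sh", b"echo bye")` afterwards would raise `DuplicateArtifactError`, mirroring the third call in the session above.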
   
   ### Why are the changes needed?
   
   Makes retrying more robust and adds a user-friendly error (see above).
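
To illustrate why a definite status helps retrying, here is a minimal sketch of a retry loop that retries only transient failures and lets a definitive "already exists" error propagate at once. It is not the Spark Connect client's retry policy: `TransientError`, `AlreadyExistsError`, and `call_with_retry` are hypothetical names introduced for the example.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a dropped connection)."""


class AlreadyExistsError(Exception):
    """Hypothetical stand-in for a definitive ALREADY_EXISTS response."""


def call_with_retry(request, max_attempts=3, delay_s=0.0):
    """Call `request` until it succeeds, retrying only transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last transient error
            time.sleep(delay_s)
        # AlreadyExistsError (or any other exception) is not caught here,
        # so it propagates immediately: repeating the request cannot help.
```

With an idempotent request, a retry after a lost response is harmless: if the first attempt actually reached the server, the repeat with the same content simply succeeds.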
   
   ### Does this PR introduce _any_ user-facing change?
   
   Mostly internal improvements
   
   
   ### How was this patch tested?
   Unit tests and testing against a server.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No

