cdkrot opened a new pull request, #43314:
URL: https://github.com/apache/spark/pull/43314
### What changes were proposed in this pull request?
1. Make the add_artifact request idempotent, i.e. subsequent requests will
succeed if the same content is provided. This makes retrying safer.
2. Fix the existing error handling mechanism:
Before the update, the error looked like this:
```
>>> spark.addArtifact("tmp.py", pyfile=True)
>>> spark.addArtifact("tmp.py", pyfile=True) # fails
2023-10-09 15:55:30,352 82873 DEBUG __iter__ Will retry call after
60014.279746934524 ms sleep (error: <_InactiveRpcError of RPC that terminated
with:
status = StatusCode.UNKNOWN
details = ""
debug_error_string = "UNKNOWN:Error received from peer
{grpc_message:"", grpc_status:2,
created_time:"2023-10-09T15:55:30.351541+02:00"}"
>)
(this is also getting retried)
```
Now it looks like this:
```
>>> spark.addArtifact("abc.sh", file=True)
>>> spark.addArtifact("abc.sh", file=True) # passes
>>> # update file's content
>>> spark.addArtifact("abc.sh", file=True) # now fails
Traceback (most recent call last):
[...]
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated
with:
status = StatusCode.ALREADY_EXISTS
details = "Duplicate Artifact: files/abc.sh. Artifacts cannot be
overwritten."
debug_error_string = "UNKNOWN:Error received from peer
{grpc_message:"Duplicate Artifact: files/abc.sh. Artifacts cannot be
overwritten.", grpc_status:6, created_time:"2023-10-10T01:25:38.231317+02:00"}"
>
```
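The idempotency rule from item 1 can be illustrated with a small in-memory sketch. This is not the Spark Connect server code: `ArtifactStore`, `DuplicateArtifactError`, and the choice of SHA-256 for comparing content are assumptions made purely for illustration.

```python
import hashlib


class DuplicateArtifactError(Exception):
    """Hypothetical error: a name was reused with different content."""


class ArtifactStore:
    """Toy in-memory store, only meant to show the idempotency rule."""

    def __init__(self):
        self._digests = {}  # artifact name -> SHA-256 hex digest of its content

    def add_artifact(self, name: str, content: bytes) -> bool:
        """Return True if stored now, False if an identical copy already existed."""
        digest = hashlib.sha256(content).hexdigest()
        existing = self._digests.get(name)
        if existing is None:
            # First upload under this name.
            self._digests[name] = digest
            return True
        if existing == digest:
            # Same name, same content: succeed without changing anything.
            return False
        # Same name, different content: reject instead of overwriting.
        raise DuplicateArtifactError(
            f"Duplicate Artifact: {name}. Artifacts cannot be overwritten."
        )
```

With this rule, a retried upload of unchanged content is harmless, while an upload of changed content under an existing name surfaces a descriptive error.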
### Why are the changes needed?
Makes retrying more robust and adds a user-friendly error (see above).
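As a rough illustration of why a specific status code helps retrying, here is a minimal sketch of a retry loop that only retries transient failures. The `Status`, `RpcError`, `RETRYABLE`, and `call_with_retry` names are hypothetical stand-ins, not the Spark Connect client's retry policy; the numeric values follow the standard gRPC status code numbering.

```python
import enum


class Status(enum.Enum):
    # Stand-ins for gRPC status codes, using the standard gRPC numbering.
    UNKNOWN = 2
    ALREADY_EXISTS = 6
    UNAVAILABLE = 14


class RpcError(Exception):
    """Hypothetical stand-in for an RPC failure carrying a status code."""

    def __init__(self, status, details=""):
        super().__init__(details)
        self.status = status


# Assumption for this sketch: only UNAVAILABLE is treated as transient.
RETRYABLE = {Status.UNAVAILABLE}


def call_with_retry(fn, max_attempts=3):
    """Call fn, retrying only when the error status is in RETRYABLE."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RpcError as err:
            # Non-transient errors, or the last attempt, propagate to the caller.
            if err.status not in RETRYABLE or attempt == max_attempts:
                raise
```

Under these assumptions, an `ALREADY_EXISTS` failure is raised to the user on the first attempt with its message intact, rather than being retried like an opaque error.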
### Does this PR introduce _any_ user-facing change?
Mostly internal improvements
### How was this patch tested?
Unit testing, testing against server
### Was this patch authored or co-authored using generative AI tooling?
No