cdkrot opened a new pull request, #42949:
URL: https://github.com/apache/spark/pull/42949
### What changes were proposed in this pull request?
Add error logging to `addArtifact` (see the example under "How was this patch tested?").
The logging code is moved into a separate file to avoid a circular dependency.
### Why are the changes needed?
Currently, when `addArtifact` is called with a file that doesn't
exist, the user gets a cryptic error:
```
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Exception iterating requests!"
    debug_error_string = "None"
>
```
This is nearly impossible to debug without digging deep into the internals.
It happens because `addArtifact` is implemented as a client-side streaming RPC, and
the actual error occurs while gRPC consumes the iterator that generates the requests.
Unfortunately, gRPC doesn't print any debug information that would help the user
understand the problem.
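
The failure mode and the fix can be illustrated with a minimal sketch. It does not use gRPC or the actual PySpark code: `create_requests`, `consume`, and the logger name are hypothetical stand-ins, and `consume` only imitates how a streaming runtime might swallow the original exception. The point is that logging inside the request generator preserves the root cause that the consumer discards.

```python
import logging
from typing import Iterator, List

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("artifact_sketch")


def create_requests(paths: List[str]) -> Iterator[bytes]:
    """Yield one payload per file, logging any failure before re-raising."""
    try:
        for path in paths:
            with open(path, "rb") as f:
                yield f.read()
    except Exception as e:
        # Without this log line, whoever drains the iterator only learns
        # that iteration failed, not why.
        logger.error("Failed to execute addArtifact: %s", e)
        raise


def consume(requests: Iterator[bytes]) -> int:
    """Stand-in for a streaming runtime: drains the iterator, hides the cause."""
    try:
        return sum(len(chunk) for chunk in requests)
    except Exception:
        # Imitates the opaque error reported to the caller.
        raise RuntimeError("Exception iterating requests!") from None
```

Calling `consume(create_requests(["XYZ"]))` with a missing file still raises the opaque error, but the log record now carries the underlying `[Errno 2] No such file or directory` message.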
### Does this PR introduce _any_ user-facing change?
Additional logging, which is opt-in the same way as before via the
`SPARK_CONNECT_LOG_LEVEL` environment variable.
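
As a rough sketch of what opt-in logging driven by an environment variable can look like (this is not the code from the PR): the helper names `get_log_level` and `configure_logger` and the logger name are assumptions, and the format string is only inferred from the sample output in the test section.

```python
import logging
import os
from typing import Optional


def get_log_level() -> Optional[int]:
    """Return the level named by SPARK_CONNECT_LOG_LEVEL, or None if unset."""
    name = os.environ.get("SPARK_CONNECT_LOG_LEVEL")
    if not name:
        return None
    level = logging.getLevelName(name.upper())
    # For unknown names getLevelName returns a string such as "Level FOO".
    return level if isinstance(level, int) else None


def configure_logger(name: str = "connect_sketch") -> logging.Logger:
    """Enable the logger only when the environment variable asks for it."""
    logger = logging.getLogger(name)
    level = get_log_level()
    if level is None:
        # Logging stays off unless the user opts in.
        logger.disabled = True
        return logger
    logger.disabled = False
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler()
        # Assumed format: timestamp, process id, level, function name, message.
        handler.setFormatter(
            logging.Formatter(
                "%(asctime)s %(process)d %(levelname)s %(funcName)s %(message)s"
            )
        )
        logger.addHandler(handler)
    return logger
```

With `SPARK_CONNECT_LOG_LEVEL=error` set before the session starts, a logger configured this way would emit error records; with the variable unset it stays silent.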
### How was this patch tested?
```
>>> s.addArtifact("XYZ", file=True)
2023-09-15 17:06:40,078 11789 ERROR _create_requests Failed to execute addArtifact: [Errno 2] No such file or directory: '/Users/alice.sayutina/apache_spark/python/XYZ'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/alice.sayutina/apache_spark/python/pyspark/sql/connect/session.py", line 743, in addArtifacts
    self._client.add_artifacts(*path, pyfile=pyfile, archive=archive, file=file)
[....]
  File "/Users/alice.sayutina/oss-venv/lib/python3.11/site-packages/grpc/_channel.py", line 910, in _end_unary_response_blocking
    raise _InactiveRpcError(state) # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Exception iterating requests!"
    debug_error_string = "None"
>
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]