Ma77Ball opened a new issue, #5718: URL: https://github.com/apache/texera/issues/5718
### What happened?

Python UDFs that read a dataset file go through `DatasetFileDocument.get_presigned_url` and `read_file`, both of which call `requests.get(...)` with no `timeout`. Python's `requests` defaults to no timeout, so if the file-service or object store accepts the TCP connection but then stalls (hung load balancer, half-open socket, network black-hole), the call blocks forever and the worker thread hangs with no recovery. There is also no retry for transient transport blips.

### How to reproduce?

1. Configure a UDF that reads a dataset file via `DatasetFileDocument`.
2. Point the presign endpoint (or the returned presigned URL host) at an endpoint that accepts the connection then never responds (e.g. a firewalled/black-holed host).
3. Run the workflow.

Expected: the read fails after a bounded wait.

Actual: `requests.get` blocks indefinitely and the worker hangs.

### Version/Branch

main (commit 13b584ce0)
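
One possible shape for a fix, as a minimal sketch rather than the actual Texera change: route the GETs through a `requests.Session` that has a retry policy mounted, and always pass an explicit `(connect, read)` timeout. The helper names `build_session` and `fetch`, the timeout constants, and the retry counts are hypothetical and not taken from the codebase; `DatasetFileDocument` itself is not modelled here.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Hypothetical defaults for illustration; real values would need tuning.
CONNECT_TIMEOUT_S = 5.0
READ_TIMEOUT_S = 30.0


def build_session(total_retries: int = 3, backoff_factor: float = 0.5) -> requests.Session:
    """Return a Session that retries transient transport errors on GET."""
    retry = Retry(
        total=total_retries,
        connect=total_retries,
        read=total_retries,
        backoff_factor=backoff_factor,       # sleep grows between attempts
        status_forcelist=(502, 503, 504),    # also retry on gateway-style errors
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch(
    session: requests.Session,
    url: str,
    timeout: tuple = (CONNECT_TIMEOUT_S, READ_TIMEOUT_S),
) -> requests.Response:
    """GET `url` with a bounded wait instead of blocking forever.

    A stalled peer surfaces as a requests.exceptions.RequestException
    subclass (for example Timeout or ConnectionError) once the timeout
    and retries are exhausted.
    """
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response
```

Note that the read timeout in `requests` bounds the wait between bytes received, not the whole download, and that retries multiply the worst-case wait: with `total_retries=3` and a 30 s read timeout, a black-holed host could still hold the worker for roughly two minutes plus backoff before the exception is raised.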
