Hi Folks,

Manas here, from the Data Platform team at CRED <https://cred.club/>.

We have been running Spark Connect 4.0.0 in production and are facing the
following issue.

A gRPC connection failure occurs when executing a PySpark DataFrame action
that involves a complex, dynamically generated schema transformation. The
Spark Connect server terminates the connection, and the client fails with an
"Encountered end-of-stream mid-frame" error.
The issue is reproducible in a minimal Docker environment and persists even
after eliminating network factors and increasing gRPC message size limits,
which suggests a problem in Spark Connect itself rather than in our
configuration.
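
For readers who do not have the ticket open, here is a rough sketch of the kind of workload described above. It is illustrative only and is not the exact reproduction attached to the JIRA issue: the helper names, the column and nesting counts, and the sc://localhost:15002 endpoint are all assumptions, and it may or may not trigger the failure on a given setup.

```python
from typing import List


def build_nested_exprs(num_columns: int, depth: int) -> List[str]:
    """Generate SQL expressions that wrap a base column in `depth` struct levels.

    Hypothetical helper: it only stands in for a "dynamically generated
    schema transformation"; the real transformation is in the JIRA ticket.
    """
    exprs = []
    for i in range(num_columns):
        expr = f"id + {i}"
        for level in range(depth):
            # named_struct is a built-in Spark SQL function.
            expr = f"named_struct('level_{level}', {expr})"
        exprs.append(f"{expr} AS col_{i}")
    return exprs


def run_against_connect(
    remote_url: str = "sc://localhost:15002",  # assumed local Connect endpoint
    num_columns: int = 200,                    # assumed size, tune as needed
    depth: int = 10,                           # assumed nesting, tune as needed
) -> int:
    """Run the generated projection through a Spark Connect session."""
    # Imported lazily so the expression builder works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote(remote_url).getOrCreate()
    df = spark.range(10).selectExpr(*build_nested_exprs(num_columns, depth))
    # collect() is the action that sends the full plan to the server over gRPC.
    return len(df.collect())
```

Calling run_against_connect() against a running Spark Connect server executes the action; increasing num_columns and depth grows the plan sent to the server.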

JIRA Issue - https://issues.apache.org/jira/browse/SPARK-52965

Would it be possible for someone to look into this?
