Xi Lyu created SPARK-54194:
------------------------------
Summary: Spark Connect Proto Plan Compression
Key: SPARK-54194
URL: https://issues.apache.org/jira/browse/SPARK-54194
Project: Spark
Issue Type: Improvement
Components: Connect
Affects Versions: 4.1.0
Environment: Currently, we enforce gRPC message limits on both the
client and the server. These limits are largely meant to protect the server
from potential OOMs by rejecting abnormally large messages. However, there are
several cases where genuine messages exceed the limit (e.g. ES-1378440,
ES-1381295, ES-1422075) and cause execution failures. While the client-side
limits can be tuned manually in DB Connect, there is limited visibility/support
on Serverless Notebooks/Jobs.
To improve Spark Connect stability, this PR compresses proto plans so that
oversized messages from the client to the server fit within the limit. The
compression applies to ExecutePlan and AnalyzePlan, the only two methods that
might hit the message limit. The message limit in the server-to-client
direction is a separate issue and is out of scope for this PR. For
details, see ODD.
Reporter: Xi Lyu
Fix For: 4.1.0
Currently, Spark Connect enforces gRPC message limits on both the client and the
server. These limits are largely meant to protect the server from potential
OOMs by rejecting abnormally large messages. However, there are several cases
where genuine messages exceed the limit and cause execution failures.
To improve Spark Connect stability, we can compress unresolved proto plans to
mitigate the issue of oversized messages from the client to the server. The
compression applies to ExecutePlan and AnalyzePlan, the only two methods that
might hit the message limit. The message limit in the server-to-client
direction is a separate issue and is out of scope here (that one is
already fixed in https://github.com/apache/spark/pull/52271).
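
As a rough illustration of the idea (not the actual Spark Connect
implementation), the sketch below compresses a serialized plan only when it
exceeds a size threshold, and bounds the decompressed size on the receiving
side so that the OOM protection is preserved. The function names, the
"none"/"zlib" encoding tags, the threshold values, and the choice of zlib are
all assumptions made for this example.

```python
import zlib
from typing import Tuple

# Hypothetical values for illustration only; not Spark configuration defaults.
COMPRESSION_THRESHOLD_BYTES = 1024
MAX_DECOMPRESSED_BYTES = 64 * 1024 * 1024


def encode_plan(plan_bytes: bytes,
                threshold: int = COMPRESSION_THRESHOLD_BYTES) -> Tuple[str, bytes]:
    """Client side: return (encoding, payload) for a serialized plan."""
    if len(plan_bytes) < threshold:
        # Small plans are sent unchanged to avoid needless CPU work.
        return ("none", plan_bytes)
    compressed = zlib.compress(plan_bytes, 6)
    if len(compressed) >= len(plan_bytes):
        # Incompressible input: fall back to the raw bytes.
        return ("none", plan_bytes)
    return ("zlib", compressed)


def decode_plan(encoding: str, payload: bytes,
                max_decompressed: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Server side: recover the serialized plan, bounding the output size."""
    if encoding == "none":
        return payload
    if encoding != "zlib":
        raise ValueError("unsupported encoding: " + encoding)
    decompressor = zlib.decompressobj()
    # Ask for at most one byte more than the limit, so an oversized plan is
    # detected without inflating it fully in memory.
    out = decompressor.decompress(payload, max_decompressed + 1)
    if len(out) > max_decompressed:
        raise ValueError("decompressed plan exceeds the size limit")
    if not decompressor.eof:
        raise ValueError("compressed plan is truncated")
    return out
```

Bounding the decompressed size matters because the message limit now applies
to the compressed bytes only; without that check, a small compressed message
could still expand into a very large plan on the server.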