[
https://issues.apache.org/jira/browse/SPARK-51667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-51667:
-----------------------------------
Labels: pull-request-available (was: )
> [TWS + Python] Disable Nagle's algorithm between Python worker and State
> Server
> -------------------------------------------------------------------------------
>
> Key: SPARK-51667
> URL: https://issues.apache.org/jira/browse/SPARK-51667
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 4.0.0, 4.1.0
> Reporter: Jungtaek Lim
> Priority: Major
> Labels: pull-request-available
>
> While testing TWS + Python, we found a case where the socket
> communication for state interaction was delayed by more than 40 ms for
> certain state operations, e.g. ListState.put(), ListState.get(),
> ListState.appendList(), etc.
> The root cause turned out to be the combination of Nagle's algorithm and
> delayed ACK. The sequence is as follows:
> # Python worker sends the proto message to JVM, and flushes the socket.
> # Additionally, Python worker sends the follow-up data to JVM, and flushes
> the socket.
> # JVM reads the proto message, and realizes there is follow-up data.
> # JVM reads the follow-up data.
> # JVM processes the request, and sends the response back to Python worker.
> Due to delayed ACK, no ACK is sent back from the JVM to the Python worker
> even after step 3. The JVM side is waiting for some outgoing data or for
> multiple ACKs to send together, but the JVM is not going to send any data
> during that phase.
> Due to Nagle's algorithm, the message from step 2 is not sent to the JVM,
> because the message from step 1 has not been ACKed yet.
> This deadlock situation is resolved only when the delayed ACK timer expires,
> which takes 40 ms (the minimum duration) on Linux. After the timeout, the ACK
> is sent back from the JVM to the Python worker, and Nagle's algorithm finally
> allows the message from step 2 to be sent to the JVM.
> See the articles below for a more general explanation:
> * [https://engineering.avast.io/40-millisecond-bug/]
> ** Start reading from the section on Nagle's algorithm
> * [https://brooker.co.za/blog/2024/05/09/nagle.html]
> Nagle's algorithm reduces the number of small packets, which, as the article
> above states, can help keep routers from being overloaded. We connect to
> "localhost" here, so that benefit does not apply.
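
For illustration only, here is a minimal Python sketch of the kind of change the issue title describes: setting TCP_NODELAY on a loopback socket so that the two small writes from steps 1 and 2 are not held back by Nagle's algorithm. The helper names (connect_without_nagle, demo) and the byte payloads are hypothetical stand-ins, not the actual Spark code, and the listening socket only plays the role of the state server.

```python
import socket


def connect_without_nagle(host: str, port: int) -> socket.socket:
    """Open a TCP connection and disable Nagle's algorithm on it."""
    sock = socket.create_connection((host, port))
    # TCP_NODELAY=1 asks the kernel to send small segments immediately
    # instead of waiting for the previous segment to be ACKed.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def demo() -> bytes:
    """Send two small writes back to back and return what the peer read."""
    # Hypothetical stand-in for the state server: a loopback listener
    # on an ephemeral port chosen by the OS.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    client = connect_without_nagle("127.0.0.1", port)
    conn, _ = server.accept()
    try:
        expected = b"header" + b"payload"
        client.sendall(b"header")   # step 1: the proto message
        client.sendall(b"payload")  # step 2: the follow-up data
        received = b""
        while len(received) < len(expected):
            chunk = conn.recv(1024)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        client.close()
        conn.close()
        server.close()
```

Calling demo() returns the concatenated bytes of both writes. The sketch shows only where the socket option is set; it does not measure the 40 ms delay, which depends on the OS and its delayed ACK behavior.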