otterc commented on issue #22173: [SPARK-24355] Spark external shuffle server improvement to better handle block fetch requests. URL: https://github.com/apache/spark/pull/22173#issuecomment-578347705 @Victsm @tgravescs I removed the `await` and tested with our internal stress testing framework. I started seeing SASL requests timing out. In this test, I observed more than 2 minutes delay between channel registration and when the first bytes are read from the channel. ``` 2020-01-24 22:53:34,019 DEBUG org.spark_project.io.netty.handler.logging.LoggingHandler: [id: 0xd475f5ff, L:/10.150.16.27:7337 - R:/10.150.16.44:11388] REGISTERED 2020-01-24 22:53:34,019 DEBUG org.spark_project.io.netty.handler.logging.LoggingHandler: [id: 0xd475f5ff, L:/10.150.16.27:7337 - R:/10.150.16.44:11388] ACTIVE 2020-01-24 22:55:05,207 DEBUG org.spark_project.io.netty.handler.logging.LoggingHandler: [id: 0xd475f5ff, L:/10.150.16.27:7337 - R:/10.150.16.44:11388] READ: 48B 2020-01-24 22:55:05,207 DEBUG org.spark_project.io.netty.handler.logging.LoggingHandler: [id: 0xd475f5ff, L:/10.150.16.27:7337 - R:/10.150.16.44:11388] WRITE: org.apache.spark.network.protocol.MessageWithHeader@27e59ee9 2020-01-24 22:55:05,207 DEBUG org.spark_project.io.netty.handler.logging.LoggingHandler: [id: 0xd475f5ff, L:/10.150.16.27:7337 - R:/10.150.16.44:11388] FLUSH 2020-01-24 22:55:05,207 INFO org.apache.spark.network.server.OutgoingChannelHandler: OUTPUT request 5929104419960968526 channel d475f5ff request_rec 1579906505207 transport_rec 1579906505207 flush 1579906505207 receive-transport 0 transport-flush 0 total 0 ``` Since there is a delay in reading the channel, I suspect this is because the hardcoding in netty code `SingleThreadEventExecutor.runAllTask()` that checks time only after 64 tasks. `WriteAndFlush` tasks are bulky tasks. With `await` there will be just 1 `WriteAndFlushTask` per channel in the IO thread's pending queue and the rest of the tasks will be smaller tasks. However, without `await` there are more `WriteAndFlush` tasks per channel in the IO thread's queue. Since it processes 64 tasks and then checks time, this time increases with more `WriteAndFlush` tasks. ``` / Check timeout every 64 tasks because nanoTime() is relatively expensive. // XXX: Hard-coded value - will make it configurable if it is really a problem. if ((runTasks & 0x3F) == 0) { lastExecutionTime = ScheduledFutureTask.nanoTime(); if (lastExecutionTime >= deadline) { break; } } ``` I can test this theory by lowering this number in a fork of netty and building spark against it. However, for now we can't remove `await()`.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
