[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-575006840

The pull request has been sent. I'm not sure whether this case requires a separate Jira ticket, so I reused the original SPARK-27868.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-574898994

@vanzin will do.

@srowen, to be honest, I don't see any flaws in the previous implementation (except the poor documentation, which is already resolved), while making assumptions without context will only lead to bigger problems. There is basically no issue that needs to be addressed by adjusting the default value. So let it be `-1`, as it was before, unless there is a strong opinion on why it should be changed.
[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-574883825

First of all, sorry: "the default JVM value" was just poor wording on my side. I blindly repeated it from the initial post of this thread. What I meant is the **OS default**. Here's what it looks like: `org.apache.spark.network.server.TransportServer#init` creates a `NioServerSocketChannel`. The channel uses `NioServerSocketChannelConfig`, which is a subclass of `io.netty.channel.socket.DefaultServerSocketChannelConfig`. The latter [contains two member fields](https://github.com/netty/netty/blob/0cde4d9cb4d19ddc0ecafc5be7c5f7c781a1f6e9/transport/src/main/java/io/netty/channel/socket/DefaultServerSocketChannelConfig.java#L43):

```java
protected final ServerSocket javaSocket;
private volatile int backlog = NetUtil.SOMAXCONN;
```

The first one is in fact a `ServerSocket`, which defines a default `backlog` of `50`. But this doesn't matter: even though that value is defined, it is never actually used. The real backlog comes from the above-mentioned `io.netty.channel.socket.DefaultServerSocketChannelConfig#backlog`, as you can see [here](https://github.com/netty/netty/blob/6c05d16967df5cadcebcc498223eec3f25bacf89/transport/src/main/java/io/netty/channel/socket/nio/NioServerSocketChannel.java#L132):

```java
protected void doBind(SocketAddress localAddress) throws Exception {
    if (PlatformDependent.javaVersion() >= 7) {
        javaChannel().bind(localAddress, config.getBacklog());
    } else {
        javaChannel().socket().bind(localAddress, config.getBacklog());
    }
}
```

And the value of `NetUtil.SOMAXCONN` is [OS dependent](https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/NetUtil.java#L253). For `EpollServerSocketChannel` the logic is similar.
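To illustrate why the `ServerSocket` default of `50` never comes into play: it only applies when `bind()` is called without an explicit backlog, and Netty always passes one explicitly. A minimal stdlib-only sketch (the port `0` and backlog `4096` here are arbitrary illustration values, not anything Spark or Netty uses):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BacklogBindDemo {
    public static void main(String[] args) throws IOException {
        // The single-argument constructor binds immediately and falls back
        // to ServerSocket's hardcoded backlog default of 50 -- but only
        // because no explicit backlog was supplied.
        try (ServerSocket implicitBacklog = new ServerSocket(0)) {
            System.out.println("implicit default backlog, port " + implicitBacklog.getLocalPort());
        }

        // Netty never relies on that default: NioServerSocketChannel#doBind
        // always passes config.getBacklog() (i.e. NetUtil.SOMAXCONN) to bind(),
        // analogous to this explicit two-step bind.
        try (ServerSocket explicitBacklog = new ServerSocket()) {
            explicitBacklog.bind(new InetSocketAddress("127.0.0.1", 0), 4096);
            System.out.println("explicit backlog 4096, port " + explicitBacklog.getLocalPort());
        }
    }
}
```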
So the default value is in fact neither `50` (I assume it hasn't been since Spark's migration to Netty 4.x) nor even hardcoded (except on Windows, where it is `200`). On Linux the value is taken from `/proc/sys/net/core/somaxconn`. If that file doesn't exist, it is resolved from the kernel attributes `kern.ipc.somaxconn` and `kern.ipc.soacceptqueue`. Only if none of those sources is defined is the hardcoded value of `128` used. Not only is that default twice as large as `64` (more than three times as large on Windows), but we also forced `/proc/sys/net/core/somaxconn` to `8192` a long time ago...
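The resolution order described above can be sketched with a small helper. This is a simplified stand-in for Netty's `NetUtil.SOMAXCONN` logic, not its actual code: the `resolve` method and its parameters are hypothetical, and the sysctl fallback step is omitted; only the "read the file, else use the hardcoded per-OS default" path is shown:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SomaxconnResolver {
    // Mirrors Netty's hardcoded fallbacks: 200 on Windows, 128 elsewhere.
    static int fallback(boolean isWindows) {
        return isWindows ? 200 : 128;
    }

    // Hypothetical helper: prefer the integer in the given file
    // (e.g. /proc/sys/net/core/somaxconn on Linux), otherwise fall back
    // to the hardcoded per-OS default.
    static int resolve(Path somaxconnFile, boolean isWindows) {
        try {
            String first = Files.readAllLines(somaxconnFile).get(0).trim();
            return Integer.parseInt(first);
        } catch (IOException | NumberFormatException | IndexOutOfBoundsException e) {
            return fallback(isWindows);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a host where the admin set somaxconn to 8192.
        Path tmp = Files.createTempFile("somaxconn", null);
        Files.write(tmp, "8192\n".getBytes());
        System.out.println(resolve(tmp, false));                       // 8192
        System.out.println(resolve(Paths.get("/nonexistent"), false)); // 128
        System.out.println(resolve(Paths.get("/nonexistent"), true));  // 200
    }
}
```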
[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-573605559

This change, despite being

> such a small change that it shouldn't affect anybody

in fact broke our production configuration and led to almost a month of instability in our Spark workflows while we were investigating the root cause (we had a totally different suspect, and it took some time to realise that it wasn't the only contributing factor). In our setup we had the default JVM value overridden and `spark.shuffle.io.backLog` left untouched. Of course the change broke our setup, because Spark's new default of `64` took precedence, and it is much less than we actually need.

While we learned our own lesson from this story, I'd like to emphasise that one should not omit a change from the release notes just because it is believed to be insignificant, especially when it changes the names or values of default properties. The [release notes for Spark 2.4.4](https://spark.apache.org/releases/spark-release-2-4-4.html) do not contain even a hint of this change.