[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.

2020-01-15 Thread GitBox
xCASx commented on issue #24732: [SPARK-27868][core] Better default value and 
documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-575006840
 
 
   The pull request has been sent. I'm not sure whether this case requires a 
separate Jira ticket, so I reused the original SPARK-27868.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.

2020-01-15 Thread GitBox
xCASx commented on issue #24732: [SPARK-27868][core] Better default value and 
documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-574898994
 
 
   @vanzin will do.
   
   @srowen to be honest, I don't see any flaws in the previous implementation 
(except the poor documentation, which is already resolved). Meanwhile, making 
assumptions without context will only lead to bigger problems. There is 
basically no issue that needs to be addressed by adjusting the default value. 
So, let it stay `-1` as it was before, unless there is a strong opinion on why 
it should be changed.





[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.

2020-01-15 Thread GitBox
xCASx commented on issue #24732: [SPARK-27868][core] Better default value and 
documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-574883825
 
 
   First of all, sorry, "the default JVM value" is just poor wording on my 
part. I blindly repeated it from the initial post of this thread. What I meant 
is the **OS default**.
   
   Here's what it looks like:
   
   `org.apache.spark.network.server.TransportServer#init` creates a 
`NioServerSocketChannel`. The channel uses `NioServerSocketChannelConfig`, 
which is a subclass of 
`io.netty.channel.socket.DefaultServerSocketChannelConfig`.
   The latter [contains two member 
fields](https://github.com/netty/netty/blob/0cde4d9cb4d19ddc0ecafc5be7c5f7c781a1f6e9/transport/src/main/java/io/netty/channel/socket/DefaultServerSocketChannelConfig.java#L43):
   
   ```java
   protected final ServerSocket javaSocket;
   private volatile int backlog = NetUtil.SOMAXCONN;
   ```
   
   The first one is in fact a `ServerSocket`, which defines a default 
`backlog` value of `50`. But this doesn't matter, because even though that 
value is defined, it is not actually used.
   The real backlog value comes from the above-mentioned 
`io.netty.channel.socket.DefaultServerSocketChannelConfig#backlog`, as you can 
see 
[here](https://github.com/netty/netty/blob/6c05d16967df5cadcebcc498223eec3f25bacf89/transport/src/main/java/io/netty/channel/socket/nio/NioServerSocketChannel.java#L132):
   
   ```java
    protected void doBind(SocketAddress localAddress) throws Exception {
        if (PlatformDependent.javaVersion() >= 7) {
            javaChannel().bind(localAddress, config.getBacklog());
        } else {
            javaChannel().socket().bind(localAddress, config.getBacklog());
        }
    }
   ```
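   To illustrate the difference: `java.net.ServerSocket` falls back to its 
internal default of `50` only when `bind` is called *without* a backlog, a 
path Netty never takes because `doBind` always forwards `config.getBacklog()`. 
A minimal sketch of the explicit-backlog path (class and method names are mine, 
and the `8192` is purely illustrative):
   
   ```java
   import java.net.InetSocketAddress;
   import java.net.ServerSocket;
   
   public class BacklogBindDemo {
       // Bind the way Netty's doBind does: with an explicit backlog argument.
       // Port 0 asks the OS for any free loopback port; returns that port.
       static int bindAndGetPort(int backlog) throws Exception {
           try (ServerSocket socket = new ServerSocket()) {
               socket.bind(new InetSocketAddress("127.0.0.1", 0), backlog);
               return socket.getLocalPort();
           }
       }
   
       public static void main(String[] args) throws Exception {
           // Had we called bind(address) with no backlog argument instead,
           // ServerSocket would silently substitute 50.
           System.out.println("bound on port " + bindAndGetPort(8192));
       }
   }
   ```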
   
   And the value of `NetUtil.SOMAXCONN` is [OS 
dependent](https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/NetUtil.java#L253).
   
   For `EpollServerSocketChannel` the logic is similar.
   
   So, the default value is in fact neither `50` (I assume it hasn't been 
since Spark's migration to Netty 4.x) nor even hardcoded (except on Windows, 
where it is `200`). On Linux the value is taken from 
`/proc/sys/net/core/somaxconn`. If the file doesn't exist, it is resolved from 
the kernel attributes `kern.ipc.somaxconn` and `kern.ipc.soacceptqueue`. Only 
if none of these sources is defined is the hardcoded value of `128` used.
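   The resolution order above can be sketched roughly as follows (the real 
logic lives in Netty's `NetUtil`; the method and parameter names here are my 
own, and the BSD/macOS sysctl step is omitted for brevity):
   
   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   
   public class SomaxconnSketch {
       // Hedged re-creation of the default-backlog resolution: prefer the
       // /proc file, otherwise fall back to the hardcoded platform default.
       static int resolveBacklogDefault(Path procSomaxconn, boolean isWindows) {
           int fallback = isWindows ? 200 : 128;  // hardcoded last resort
           try {
               // On Linux, /proc/sys/net/core/somaxconn holds the value.
               return Integer.parseInt(
                       Files.readAllLines(procSomaxconn).get(0).trim());
           } catch (IOException | NumberFormatException
                   | IndexOutOfBoundsException e) {
               // Netty would next consult the kern.ipc.somaxconn /
               // kern.ipc.soacceptqueue sysctls; skipped in this sketch.
               return fallback;
           }
       }
   
       public static void main(String[] args) {
           Path p = Paths.get("/proc/sys/net/core/somaxconn");
           System.out.println("effective default backlog: "
                   + resolveBacklogDefault(p, false));
       }
   }
   ```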
   
   Not only is that default twice as large as `64` (more than three times on 
Windows), but we also set `/proc/sys/net/core/somaxconn` to `8192` a long time 
ago...





[GitHub] [spark] xCASx commented on issue #24732: [SPARK-27868][core] Better default value and documentation for socket server backlog.

2020-01-13 Thread GitBox
xCASx commented on issue #24732: [SPARK-27868][core] Better default value and 
documentation for socket server backlog.
URL: https://github.com/apache/spark/pull/24732#issuecomment-573605559
 
 
   This change, despite being
   
   > such a small change that it shouldn't affect anybody
   
   in fact broke our production configuration and led to almost a month of 
instability in our Spark workflows while we were investigating the root cause 
(we had a totally different suspect, and it took some time to realise that it 
wasn't the only contributing factor).
   
   In our setup we had the OS default overridden and 
`spark.shuffle.io.backLog` left untouched. Of course the change broke that 
setup, because Spark's new default of `64` took precedence, and it is much 
lower than we actually need.
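   For anyone in a similar situation, one way to make such a setup robust is 
to pin the property explicitly rather than relying on either default. A 
hypothetical `spark-defaults.conf` fragment (the `8192` reflects our 
environment, not a general recommendation):
   
   ```
   # Pin the shuffle server backlog explicitly, so that neither the OS default
   # nor a changed Spark default can silently take over.
   spark.shuffle.io.backLog  8192
   ```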
   
   While we learned our own lesson from this story, I'd like to emphasise that 
one should not omit a change from the release notes just because it is 
believed to be insignificant, especially when the names or values of default 
properties change.
   The [release notes for Spark 
2.4.4](https://spark.apache.org/releases/spark-release-2-4-4.html) do not even 
hint at this change.

