No, reuseAddress doesn't allow you to have two processes, old and new,
listening on the same port. No socket option allows that.

TL;DR: This can happen when you have a connection to a server which gets
killed hard and comes back up immediately.

So here's what happens.

When a server shuts down normally, it triggers an active close on all the
open TCP connections it has. Each close starts a three-way message exchange
with the remote peer (FIN, FIN+ACK, ACK), at the end of which the socket is
closed and the kernel keeps it in the TIME_WAIT state in the background for
a few minutes (the duration depends on the OS; the maximum tends to be
around 4 minutes). This is needed so that older, reordered packets can
still reach the machine, just in case. Typically, if the server restarts
within that period and tries to bind to the same port again, the kernel is
smart enough not to complain about the existing socket in TIME_WAIT: it
knows the last sequence number the previous process used for its final
message, and since sequence numbers always increase, it can reject any
messages with an earlier sequence number now that a new process has taken
the port.
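
To make the normal case concrete, here's a minimal Java sketch (the class
name and port 8984 are arbitrary placeholders): the server side performs
the active close, the kernel does the FIN exchange for it, and the
connection then lingers in TIME_WAIT, which a tool like netstat will
typically show.

    import java.net.ServerSocket;
    import java.net.Socket;

    public class CleanClose {
        public static void main(String[] args) throws Exception {
            ServerSocket listener = new ServerSocket(8984);   // server binds the port
            Socket client = new Socket("localhost", 8984);    // remote end connects
            Socket accepted = listener.accept();

            // Normal shutdown: the server closes its end first (the active
            // close), so the kernel performs the FIN exchange and leaves this
            // connection in TIME_WAIT for a few minutes.
            accepted.close();
            client.close();
            listener.close();
            // Running "netstat -an | grep 8984" at this point would typically
            // show a TIME_WAIT entry for the connection just closed.
        }
    }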

The trouble is with abnormal shutdown. There's no time for a proper
goodbye, so the kernel marks the socket to respond to remote packets with a
rude RST (reset). Since there has been no goodbye with the remote end, it
also doesn't know the last sequence number to use as a cut-off if a new
process binds to the same port. Hence, by default, it denies binding to
that port for the TIME_WAIT period, to avoid the off chance that a stray
packet gets picked up by the new process and utterly confuses it. By
setting reuseAddress, you are essentially waiving that protection. Note
that the possibility of such confusion is unbelievably minuscule in the
first place (both the source and destination host:port would have to be the
same, and the client port is generally randomly allocated). If the port we
are talking about is only used locally, it's almost impossible -- you have
bigger problems if a TCP packet is lost or delayed within the same machine!
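
At the socket-API level, reuseAddress is just SO_REUSEADDR set before the
bind. Here's a minimal Java sketch of the restart path (the class name and
port 8984 are placeholders again; exact bind behaviour varies a little by
OS):

    import java.net.InetSocketAddress;
    import java.net.ServerSocket;

    public class RestartBind {
        public static void main(String[] args) throws Exception {
            // After an abnormal shutdown, old connections on this port may
            // still be sitting in TIME_WAIT, and a plain bind can then fail
            // with "java.net.BindException: Address already in use".
            ServerSocket listener = new ServerSocket();    // created unbound
            listener.setReuseAddress(true);                // must be set before bind
            listener.bind(new InetSocketAddress(8984));    // rebinds despite TIME_WAIT
            // ... accept connections as usual ...
            listener.close();
        }
    }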

As to Shawn's point about Solr's stop port: you would essentially need to
be actively trying to shut down the server via the stop port, or be within
a few minutes of such an attempt, when the server is killed. The server
simply being killed, without any active connection to it, is not going to
cause this issue.

Hi Ram,



It appears the problem is that the old solr/jetty process is actually still
running when the new solr/jetty process is started.   That’s the problem
that needs fixing.



This is not a rare problem in systems with worker threads dedicated to
different tasks.   These threads need to wake up in response to the
shutdown signal/command, as well as to their normal inputs.
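
As a generic illustration of that pattern (a minimal Java sketch, not Solr
or Jetty code; all the names here are made up for the example), the worker
polls its queue with a timeout and also honours interruption, so a shutdown
request is noticed even when no normal work arrives:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class Worker implements Runnable {
        private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
        private volatile boolean shutdown = false;

        public void submit(Runnable task) { tasks.add(task); }

        public void requestShutdown() { shutdown = true; }

        @Override
        public void run() {
            while (!shutdown) {
                try {
                    // Poll with a timeout rather than blocking forever, so the
                    // shutdown flag is re-checked even when the queue is idle.
                    Runnable task = tasks.poll(1, TimeUnit.SECONDS);
                    if (task != null) {
                        task.run();
                    }
                } catch (InterruptedException e) {
                    // Treat interruption as a shutdown request too.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }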



It’s a bug I’ve created and fixed a couple of times over the years … :-)  I
wouldn’t know where to start with Solr.  But, as I say, re-using the port
is a band-aid.  I’ve yet to see a case where it is the best solution.



best,

Charlie



*From:* Ramkumar R. Aiyengar [mailto:andyetitmo...@gmail.com]
*Sent:* Saturday, February 28, 2015 8:15 PM
*To:* dev@lucene.apache.org
*Subject:* Re: reuseAddress default in Solr jetty.xml



Hey Charles, see my explanation above on why this is needed. If Solr has to
be killed, it would generally be restarted immediately. Having to kill it
shouldn't normally happen, except when something is misconfigured or there
is a bug, but when it does happen, not allowing the rebind makes the impact
worse.

In any case, it turns out that reuseAddress is already true by default for
the connectors we use, so that really isn't the issue. The more specific
issue is that the stop port doesn't set it: the main port binds just fine
on a restart, but the stop port fails to bind -- and there's currently no
way in Jetty to configure that.
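
For concreteness, here's a rough Java sketch of a stop-port style listener
that never enables SO_REUSEADDR. This is only an assumption about the
general shape of such code, not Jetty's actual implementation, and the
class name and port 7983 are made up for the example:

    import java.net.InetAddress;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class StopPortListener {
        public static void main(String[] args) throws Exception {
            ServerSocket stopPort = new ServerSocket();
            // Note: no stopPort.setReuseAddress(true) before the bind -- that
            // is the missing knob. If a connection from the previous process
            // is still in TIME_WAIT on this port, the bind below can fail
            // with "java.net.BindException: Address already in use".
            stopPort.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 7983));
            try (Socket s = stopPort.accept()) {
                // A real listener would read and verify the stop key here and
                // then trigger a graceful shutdown of the server.
            }
            stopPort.close();
        }
    }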

Based on my question on the jetty mailing list, I have now created an issue
for them:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=461133



On Fri, Feb 27, 2015 at 3:03 PM, Reitzel, Charles <
charles.reit...@tiaa-cref.org> wrote:

Disclaimer: I’m not a Solr committer.  But, as a developer, I’ve never seen
a good case for reusing the listening port.  Better to find and fix the
root cause of the zombie state (or sometimes just a slow shutdown) and
release the port.



*From:* Mark Miller [mailto:markrmil...@gmail.com]
*Sent:* Thursday, February 26, 2015 5:28 PM
*To:* dev@lucene.apache.org
*Subject:* Re: reuseAddress default in Solr jetty.xml



+1

- Mark



On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <
andyetitmo...@gmail.com> wrote:

The jetty.xml we currently ship doesn't set reuseAddress=true by default.
If you are having a bad GC day, with things going OOM and Solr not even
being able to shut down cleanly (or the oom_solr.sh script killing it),
whatever external service-management mechanism you have is probably going
to try to respawn it and fail with the default config, because the ports
will be in TIME_WAIT. I guess there's the usual disclaimer about
reuseAddress letting stray packets reach the restarted server, but it
sounds like at least the default should be true.
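
For reference, here is the same knob expressed with Jetty's embedded Java
API rather than jetty.xml -- a minimal sketch assuming Jetty 9's
ServerConnector and its setReuseAddress setter (8983 is just the usual
example port; as noted further up the thread, this is reportedly already
the default for the HTTP connector):

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.ServerConnector;

    public class ConnectorReuse {
        public static void main(String[] args) throws Exception {
            Server server = new Server();
            ServerConnector http = new ServerConnector(server);
            http.setPort(8983);
            // Ask for SO_REUSEADDR on the listening socket so a quick restart
            // can rebind even while old sockets are still in TIME_WAIT.
            http.setReuseAddress(true);
            server.addConnector(http);
            server.start();
            server.join();
        }
    }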

I can raise a JIRA, but just wanted to check if anyone has any opinions
either way..








-- 

Not sent from my iPhone or my Blackberry or anyone else's


