I agree, SIGKILL is typically the last resort.
On 3 Mar 2015 00:49, "Reitzel, Charles" <[email protected]>
wrote:

>  My bad.  Too long away from sockets since cleaning up those shutdown
> handlers.  Your point is well taken, on the server side the risks of
> consuming a stray echo packet are fairly low (but non-zero, if you’ve ever
> spent any quality time with tcpdump/wireshark).
>
>
>
> Still, in a production setting, SIGKILL (aka “kill -9”) should be a last
> resort after more reasonable methods (e.g. SIGINT, SIGTERM) have failed.
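
The escalation described here can be sketched roughly as follows. This is only an illustration, not how Solr's scripts do it: `stop_gracefully` and the grace period are made-up names and values, and the child process is a stand-in for a real server.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 10.0) -> str:
    """Ask a child process to exit, and force-kill it only if that fails."""
    proc.terminate()  # SIGTERM on POSIX: the process may run its shutdown handlers
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: cannot be caught, so no cleanup runs
        proc.wait()
        return "killed"


if __name__ == "__main__":
    # Hypothetical stand-in for a long-running server process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child, grace_seconds=5.0))
```

The point of the grace period is that the hard kill only happens after the polite request has demonstrably failed.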
>
>
>
> *From:* Ramkumar R. Aiyengar [mailto:[email protected]]
> *Sent:* Monday, March 02, 2015 7:00 PM
> *To:* [email protected]
> *Subject:* RE: reuseAddress default in Solr jetty.xml
>
>
>
> No, reuseAddress doesn't allow you to have two processes, old and new,
> listen to the same port. There's no option which allows you to do that.
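
A small way to check the first claim on your own machine, assuming Linux semantics for SO_REUSEADDR (Windows treats the option differently, so the result may not hold there). The function name is made up for this sketch.

```python
import socket


def second_listener_is_refused() -> bool:
    """Return True if a second listener cannot bind the port of a live listener."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for sock in (first, second):
            # Both sockets opt in to address reuse before binding.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError:
            return True  # typically EADDRINUSE: the port has a live listener
        return False
    finally:
        first.close()
        second.close()
```

Even with the option set on both sockets, the second bind is refused while the first socket is still listening.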
>
> TL;DR: This can happen when you have a connection to a server which gets
> killed hard and comes back up immediately.
>
> So here's what happens.
>
> When a server normally shuts down, it triggers an active close on all open
> TCP connections it has. That starts a three-way message exchange with the remote
> recipient (FIN, FIN+ACK, ACK) at the end of which the socket is closed and
> the kernel puts it in a TIME_WAIT state for a few minutes in the background
> (depends on the OS, maximum tends to be 4 mins). This is needed to allow
> for reordered older packets to reach the machine just in case. Now
> typically if the server restarts within that period and tries to bind again
> to the same port, the kernel is smart enough to not complain that there is
> an existing socket in TIME_WAIT, because it knows the last sequence number
> it used for the final message in the previous process, and since sequence
> numbers are always increasing, it can reject any messages before that
> sequence number as a new process has now taken the port.
>
> Trouble is with abnormal shutdown. There's no time for a proper goodbye,
> so the kernel marks the socket to respond to remote packets with a rude RST
> (reset). Since there has been no goodbye with the remote end, it also
> doesn't know the last sequence number to delineate if a new process binds
> to the same port. Hence by default it denies binding to the port again for
> the TIME_WAIT period to avoid the off chance a stray packet gets picked up
> by the new process and utterly confuses it. By setting reuseAddress, you
> are essentially waiving this protection. Note that this possibility of
> confusion is unbelievably minuscule in the first place (both the source and
> destination host:port should be the same and the client port is generally
> randomly allocated). If the port we are talking of is a local port, it's
> almost impossible -- you have bigger problems if a TCP packet is lost or
> delayed within the same machine!
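
For reference, a rough sketch of what setting reuseAddress amounts to at the socket level: the SO_REUSEADDR option set before bind. The helper names are invented for illustration, and the "restart" is simulated inside one process. Whether the rebind fails without the option depends on the OS and on the state the old connection was left in, so only the reuse case is something to rely on here.

```python
import socket


def open_listener(port: int, reuse_address: bool) -> socket.socket:
    """Create a TCP listener on localhost, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_address:
            # Must be set before bind() to have any effect.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def restart_succeeds(reuse_address: bool) -> bool:
    """Simulate a server restart right after it closed a client connection."""
    server = open_listener(0, reuse_address)  # port 0: kernel picks a free port
    port = server.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = server.accept()
    conn.close()    # server side closes first, so it usually ends in TIME_WAIT
    client.close()
    server.close()
    try:
        open_listener(port, reuse_address).close()  # the "restarted" server
        return True
    except OSError:
        return False  # typically EADDRINUSE while an old socket lingers
```

Calling `restart_succeeds(True)` exercises the case where both the old and the new listener opted in to reuse, which is the situation a fixed server config would produce across restarts.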
>
> As to Shawn's point, for Solr's stop port, you essentially need to be
> trying to actively shut down the server using the stop port, or be within a
> few minutes of such an attempt while the server is killed. Just the server
> being killed without any active connection to it is not going to cause this
> issue.
>
> Hi Ram,
>
>
>
> It appears the problem is that the old solr/jetty process is actually
> still running when the new solr/jetty process is started.   That’s the
> problem that needs fixing.
>
>
>
> This is not a rare problem in systems with worker threads dedicated to
> different tasks.   These threads need to wake up in response to the
> shutdown signal/command, as well as the normal inputs.
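
One common shape for such a worker, as a minimal sketch only: the class and attribute names are hypothetical and this says nothing about how Solr or Jetty structure their threads. The worker waits on its input with a timeout, so it notices a stop request even when no work arrives.

```python
import queue
import threading


class Worker(threading.Thread):
    """Hypothetical worker that handles queued tasks until asked to stop."""

    def __init__(self) -> None:
        super().__init__()
        self.tasks: queue.Queue = queue.Queue()
        self.stop_requested = threading.Event()
        self.processed: list = []

    def run(self) -> None:
        while not self.stop_requested.is_set():
            try:
                # A bounded wait so the loop re-checks the stop flag regularly.
                task = self.tasks.get(timeout=0.1)
            except queue.Empty:
                continue
            self.processed.append(task)
            self.tasks.task_done()

    def shutdown(self, timeout: float = 2.0) -> bool:
        """Request a stop and report whether the thread actually exited."""
        self.stop_requested.set()
        self.join(timeout)
        return not self.is_alive()
```

If `shutdown()` returns False, the thread is still running, and that is the condition that would leave the old process alive and holding its port.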
>
>
>
> It’s a bug I’ve created and fixed a couple times over the years … :-)    I
> wouldn’t know where to start with Solr.  But, as I say, re-using the port
> is a band-aid.  I’ve yet to see a case where it is the best solution.
>
>
>
> best,
>
> Charlie
>
>
>
> *From:* Ramkumar R. Aiyengar [mailto:[email protected]]
> *Sent:* Saturday, February 28, 2015 8:15 PM
> *To:* [email protected]
> *Subject:* Re: reuseAddress default in Solr jetty.xml
>
>
>
> Hey Charles, see my explanation above on why this is needed. If Solr has
> to be killed, it would generally be immediately restarted. This would
> normally not be the case, except when things are potentially misconfigured or
> if there is a bug, but not doing so makes the impact worse.
>
> In any case, it turns out that reuseAddress is true by default for the
> connectors we use, so that really isn't the issue. The issue more
> specifically is that the stop port doesn't do it, so the actual port by
> itself starts just fine on a restart, but the stop port fails to bind --
> and there's no way currently in Jetty to configure that.
>
> Based on my question in the jetty mailing list, I have now created an
> issue for them:
>
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=461133
>
>
>
> On Fri, Feb 27, 2015 at 3:03 PM, Reitzel, Charles <
> [email protected]> wrote:
>
> Disclaimer: I’m not a Solr committer.  But, as a developer, I’ve never
> seen a good case for reusing the listening port.   Better to find and fix
> the root cause on the zombie state (or just slow shutdown, sometimes) and
> release the port.
>
>
>
> *From:* Mark Miller [mailto:[email protected]]
> *Sent:* Thursday, February 26, 2015 5:28 PM
> *To:* [email protected]
> *Subject:* Re: reuseAddress default in Solr jetty.xml
>
>
>
> +1
>
> - Mark
>
>
>
> On Thu, Feb 26, 2015 at 1:54 PM Ramkumar R. Aiyengar <
> [email protected]> wrote:
>
> The jetty.xml we currently ship by default doesn't set reuseAddress=true.
> If you are having a bad GC day with things going OOM and resulting in Solr
> not even being able to shut down cleanly (or the oom_solr.sh script killing
> it), whatever external service management mechanism you have is probably
> going to try to respawn it and fail with the default config because the ports
> will be in TIME_WAIT. I guess there's the usual disclaimer with
> reuseAddress causing stray packets to reach the restarted server, but
> it sounds like at least the default should be true.
>
> I can raise a JIRA, but just wanted to check if anyone has any opinions
> either way.
>
>
>
>
> *************************************************************************
> This e-mail may contain confidential or privileged information.
> If you are not the intended recipient, please notify the sender
> immediately and then delete it.
>
> TIAA-CREF
> *************************************************************************
>
>
>
>
> --
>
> Not sent from my iPhone or my Blackberry or anyone else's
>
>
