Forgot to reply to this. It looks like the socket linger setting was
causing this. I didn't track it down completely, since disabling linger
solved the issue. My guess is that each socket hung on close for
2 seconds, which somehow caused the server to kill other
connections (maybe a lock is held?).
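
For anyone who hits the same thing, here is a minimal sketch of what
toggling linger looks like at the socket level. It is a generic POSIX
illustration in Python, not the ZooKeeper client or server code. The
helper names set_linger and get_linger are made up for the example, and
the 2-second value only mirrors the delay I guessed at above.

```python
import socket
import struct

# Assumption: POSIX layout of struct linger, two C ints (l_onoff, l_linger).
# Windows uses a different layout, so this sketch is not portable there.
LINGER_FORMAT = "ii"


def set_linger(sock, enabled, seconds):
    """Enable or disable SO_LINGER on a socket (hypothetical helper)."""
    value = struct.pack(LINGER_FORMAT, 1 if enabled else 0, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return (enabled, seconds) as currently set on the socket."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize(LINGER_FORMAT)
    )
    onoff, seconds = struct.unpack(LINGER_FORMAT, raw)
    return bool(onoff), seconds


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_linger(sock, True, 2)   # close() may now block for up to 2 seconds
    print(get_linger(sock))
    set_linger(sock, False, 0)  # close() returns without waiting
    print(get_linger(sock))
    sock.close()
```

With linger enabled and a nonzero timeout, close() on a blocking socket
can wait up to that many seconds while unsent data is still pending,
which would fit a peer that was powered off and can no longer
acknowledge anything.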
Jared
On Jul 2, 2011, at 10:55 AM, Ted Dunning <[email protected]> wrote:
Has anybody understood this scenario yet?
On Thu, Jun 30, 2011 at 8:15 AM, Jared Cantwell <[email protected]> wrote:
Hello again,
I am seeing a strange issue that I'm hoping someone can give me insight
into. For simple testing, I have a standalone server set up. I connected
to this server from 2 nodes, one of which is the node hosting the
standalone server. After opening a small number of connections from each
node (3 or 4 clients/node), I powered off the node not hosting the
standalone server. As expected, the logs show the server expiring all
sessions for connections to that node. The problem comes 10 seconds
later, when the server also decides to expire all local connections. As
a result, the clients on the node that is still alive (and hosting the
standalone server) all try reconnecting, but their connections are
denied for having expired, over and over again.
I am working on getting some consolidated logs, so I'll reply to this
when I have them. I was wondering if anyone knows of an issue or has any
initial thoughts?
Some things I am going to try:
1. Start a 3 node quorum and connect clients from a 4th node. Then kill
the 4th node and see if other connections are killed too. If this
works OK then it would point to an issue with the standalone server
mode.
2. Connect 3 nodes to my standalone server. Power off one node and see
if connections to the other node are killed. This will determine if
it's killing all other connections, or just local connections for
some strange reason.
~Jared