[ 
https://issues.apache.org/jira/browse/DERBY-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jørgen Løland updated DERBY-3527:
---------------------------------

    Attachment: derby-3527-1a.stat
                derby-3527-1a.diff

Patch 1a checks whether the network connection is up by sending a ping 
message from the slave to the master. The ping has to be sent from a separate 
thread because the TCP timeout for sending a message is 2 minutes and is not 
configurable. The patch also adds a message reader thread on the master that 
currently accepts two kinds of messages: ping and ack. This receiver thread 
should later be extended to accept other messages from the slave as well, 
such as "slave is stopped due to a system shutdown".
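
For reviewers who want the idea without opening the diff, here is a minimal 
sketch of sending the ping from a separate thread and waiting for a bounded 
time. It is not the code in the attached patch: PingCheck, isConnectionUp and 
the pingRoundTrip callable are hypothetical names, and the callable stands in 
for "write a ping to the socket and read the reply".

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of the Derby patch. */
final class PingCheck {

    /**
     * Runs pingRoundTrip on a worker thread and waits at most timeoutMillis.
     * Returns true only if the round trip finished in time and reported
     * success. A send that blocks on a dead link is simply abandoned.
     */
    static boolean isConnectionUp(Callable<Boolean> pingRoundTrip,
                                  long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "ping-sender");
            t.setDaemon(true); // a stuck send must not keep the JVM alive
            return t;
        });
        Future<Boolean> result = worker.submit(pingRoundTrip);
        try {
            return Boolean.TRUE.equals(
                result.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return false; // no reply within the deadline
        } catch (ExecutionException e) {
            return false; // the send or the read threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            result.cancel(true);  // no-op if the task already completed
            worker.shutdownNow(); // interrupt the worker if still blocked
        }
    }
}
```

The caller decides the deadline, so the check returns after timeoutMillis at 
the latest instead of waiting for the TCP send timeout.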

I have tested the patch by hand on two computers by pulling the network 
cable. The patch fixes the reported problem: failover after the cable is 
pulled is accepted only with the patch applied. The replication test suite 
passed; I am currently running the other tests.

Requesting review.

> The slave will not notice that a network cable is unplugged and will 
> therefore reject failover/stopSlave commands
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3527
>                 URL: https://issues.apache.org/jira/browse/DERBY-3527
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0, 10.5.0.0
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: derby-3527-1a.diff, derby-3527-1a.stat
>
>
> If a network cable between the master and slave is unplugged (or a switch 
> crashes etc), ObjectInputStream#readObject will not get an exception. Neither 
> the socket nor the input stream can be queried for information on whether or 
> not the connection is working. AFAIK, the only way to find out if the network 
> is down is to send a message.
> The slave commands stopSlave and failover are rejected if the network 
> connection is working. To be absolutely sure that the connection is working, 
> we need to ping the master when these commands are requested.
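
The rule in the description above can be sketched as a small gate in front of 
the two commands. This is an illustration only: SlaveCommandGate and 
acceptsFailover are hypothetical names, and masterReachable stands in for 
whatever ping check the slave performs.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the accept/reject rule; not Derby code. */
final class SlaveCommandGate {

    // Assumed to perform a fresh ping each time it is asked.
    private final BooleanSupplier masterReachable;

    SlaveCommandGate(BooleanSupplier masterReachable) {
        this.masterReachable = masterReachable;
    }

    /**
     * failover and stopSlave are rejected while the master is reachable,
     * so the command is accepted only when the ping check fails.
     */
    boolean acceptsFailover() {
        return !masterReachable.getAsBoolean();
    }
}
```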

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
