For slave recovery to work, the slave's configuration is expected not to change.
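
One way to catch this before a rollout is to diff the old and new slave flags and stop if anything differs. The sketch below is only an illustration: `changed_flags` and the two flag dictionaries are hypothetical, not part of Mesos, and the values are taken from the ports and recovery mode mentioned in the quoted message.

```python
def changed_flags(old, new):
    """Return {flag: (old_value, new_value)} for every flag that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical flag sets: the configuration before and after the upgrade.
old_flags = {"port": "5050", "recover": "reconnect"}
new_flags = {"port": "5051", "recover": "reconnect"}

diff = changed_flags(old_flags, new_flags)
for flag, (before, after) in diff.items():
    print(f"flag '{flag}' changed: {before} -> {after}")
```

An empty result means the flags are identical and the slave can be restarted with recovery as usual.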
On Thu, Jul 2, 2015 at 2:10 PM, Philippe Laflamme wrote:
> Hi,
>
> I'm trying to roll out an upgrade from 0.20.0 to 0.21.0 with slaves
> configured with checkpointing and with "reconnect" recovery.
>
> I was investigating why the slaves would successfully re-register with the
> master and recover, but would subsequently be asked to shut down ("health
> check timeout").
>
> It turns out that our slaves had been unintentionally configured to use
> port 5050 in the previous configuration. We decided to fix that during the
> upgrade and have them use the default 5051 port.
>
> This change seems to make the health checks fail and eventually kills the
> slave due to inactivity.
>
> I've confirmed that leaving the port as it was in the previous
> configuration lets the slave re-register successfully, and it is not asked
> to shut down later on.
>
> Is this a known issue? I haven't been able to find a JIRA ticket for this.
> Maybe it's the expected behaviour? Should I create a ticket?
>
> Thanks,
> Philippe
>