We have set up a three-node cluster with LINSTOR in which all three nodes are primary.

One of our tests is to reboot one machine. After the machine comes back up, it
won't reconnect. Its state remains "Outdated" while it tries to connect to
the other hosts (which are both still primary).

It does not matter which drbdadm command we execute (drbdadm down, drbdadm up,
drbdadm disconnect, drbdadm connect --discard-my-data): the state does not change.

The only workaround that works is demoting one of the other two primaries to
secondary. After this, the rebooted host connects and starts syncing. But in a
real-world scenario it is not practicable to take a resource down on one of
the two survivors.

What's the right way to recover after a node failure in a triple-primary setup?


LINSTOR version: linstor 1.0.1; GIT-hash: d8c9a43d4eab20749132147ad61a2ee821645be2
DRBD version: 9.0.18-1
We have done a lot of testing on this release and do not want to upgrade to a
newer version.



Satellite node log:
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Handshake to peer 1 successful: Agreed network protocol version 115
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Peer authenticated using 20 bytes HMAC
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Starting ack_recv thread (from drbd_r_voting [30419])
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Rejecting concurrent remote state change 3425499901 because of state change 697879097
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: sock was shut down by peer
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( Connecting -> BrokenPipe )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: ack_receiver terminated
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Terminating ack_recv thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Restarting sender thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Connection closed
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( BrokenPipe -> Unconnected )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Restarting receiver thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( Unconnected -> Connecting )
Dec 21 15:02:59 com30-dev kernel: drbd redo sto34-dev: Handshake to peer 1 successful: Agreed network protocol version 115
Dec 21 15:02:59 com30-dev kernel: drbd redo sto34-dev: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Dec 21 15:02:59 com30-dev kernel: drbd redo sto34-dev: Peer authenticated using 20 bytes HMAC
Dec 21 15:02:59 com30-dev kernel: drbd redo sto34-dev: Starting ack_recv thread (from drbd_r_redo [30413])
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Handshake to peer 0 successful: Agreed network protocol version 115
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Peer authenticated using 20 bytes HMAC
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Starting ack_recv thread (from drbd_r_voting [30417])
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Rejecting concurrent remote state change 1767491989 because of state change 697879097
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: sock was shut down by peer
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: conn( Connecting -> BrokenPipe )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: ack_receiver terminated
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Terminating ack_recv thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Restarting sender thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Connection closed
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: conn( BrokenPipe -> Unconnected )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: Restarting receiver thread
Dec 21 15:02:59 com30-dev kernel: drbd voting sto31-dev: conn( Unconnected -> Connecting )
Dec 21 15:02:59 com30-dev kernel: drbd redo sto31-dev: Handshake to peer 0 successful: Agreed network protocol version 115
Dec 21 15:02:59 com30-dev kernel: drbd redo sto31-dev: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Dec 21 15:02:59 com30-dev kernel: drbd redo sto31-dev: Peer authenticated using 20 bytes HMAC
Dec 21 15:02:59 com30-dev kernel: drbd redo sto31-dev: Starting ack_recv thread (from drbd_r_redo [30411])
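
To make the pattern in the log above easier to see (for resource "voting", each
connection attempt is rejected and the connection cycles Connecting ->
BrokenPipe -> Unconnected -> Connecting), here is a small Python sketch that
groups the connection transitions and rejected state changes per resource and
peer. It is only an illustration: the names unwrap() and summarize() are
hypothetical, and the regular expressions are assumptions based solely on the
kernel lines quoted in this mail, not on any documented DRBD log format.

```python
import re
from collections import defaultdict

# Assumption: every log entry starts with a syslog-style timestamp ("Dec 21 15:02:59").
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
# Assumption: connection-level lines look like "kernel: drbd <resource> <peer>: <message>".
LINE_RE = re.compile(r"kernel: drbd (\S+) (\S+): (.*)$")
CONN_RE = re.compile(r"conn\( (\S+) -> (\S+) \)")
REJECT_RE = re.compile(
    r"Rejecting concurrent remote state change (\d+) because of state change (\d+)"
)


def unwrap(log_text):
    """Join mail-wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def summarize(log_text):
    """Collect conn() transitions and rejected state changes per (resource, peer)."""
    summary = defaultdict(lambda: {"transitions": [], "rejected": []})
    for entry in unwrap(log_text):
        match = LINE_RE.search(entry)
        if not match:
            continue
        resource, peer, message = match.groups()
        conn = CONN_RE.search(message)
        if conn:
            summary[(resource, peer)]["transitions"].append(conn.groups())
        rejected = REJECT_RE.search(message)
        if rejected:
            remote, local = rejected.groups()
            summary[(resource, peer)]["rejected"].append((int(remote), int(local)))
    return dict(summary)


# A few entries copied from the log in this mail, still wrapped as the mailer left them.
SAMPLE = """\
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: Rejecting concurrent
remote state change 3425499901 because of state change 697879097
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( Connecting ->
BrokenPipe )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( BrokenPipe ->
Unconnected )
Dec 21 15:02:59 com30-dev kernel: drbd voting sto34-dev: conn( Unconnected ->
Connecting )
"""

if __name__ == "__main__":
    for (resource, peer), info in summarize(SAMPLE).items():
        print(resource, peer, info["transitions"], info["rejected"])
```

Run against the full log, this would show the reject-and-reconnect cycle for
"voting" towards both sto34-dev and sto31-dev, while "redo" logs no conn()
transitions in this excerpt.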

If you need any more information, please tell me.

_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
[email protected]
https://lists.linbit.com/mailman/listinfo/drbd-user
