On Sun, Feb 11, 2018 at 02:43:27AM -0500, Digimer wrote:
> On 2018-02-11 01:42 AM, Digimer wrote:
> 
>     Hi all,
> 
>       I've set up a 3-node cluster (config below). Basically, nodes 1 & 2
>     are protocol C and have resource-and-stonith fencing. Node 1 -> 3 and
>     2 -> 3 are protocol A and fencing is 'dont-care' (it's not part of the
>     cluster and would only ever be promoted manually).
> 
>       When I crash node 2 via 'echo c > /proc/sysrq-trigger', pacemaker
>     detects the fault and so does DRBD. DRBD invokes the fence-handler as
>     expected and all is good. However, I want to test breaking just DRBD,
>     so on node 2 I used
>     'iptables -I INPUT -p tcp -m tcp --dport 7788:7790 -j DROP'
>     to interrupt DRBD traffic. When this is done, the fence handler is
>     not invoked.

The iptables command may need to be changed to also drop --sport,
and, for good measure, to add the same rules to the OUTPUT chain.
DRBD connections are (or can be) established in both directions;
you blocked only one direction.

Maybe do it more like this:

# create a dedicated "drbd" chain
# (the first two lines just clear out any previous instance)
iptables -F drbd 2>/dev/null
iptables -X drbd 2>/dev/null
iptables -N drbd
# divert all DRBD traffic, both ports and both directions, through it
iptables -I INPUT  -p tcp --dport 7788:7790 -j drbd
iptables -I INPUT  -p tcp --sport 7788:7790 -j drbd
iptables -I OUTPUT -p tcp --dport 7788:7790 -j drbd
iptables -I OUTPUT -p tcp --sport 7788:7790 -j drbd

Then toggle:
break: iptables -I drbd -j DROP
 heal: iptables -F drbd

(beware of typos, I just typed this directly into the email)
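
To verify that the block really cuts both directions, watch the
connection state on both nodes, e.g. (resource name taken from your
logs below, adjust to yours):

drbdadm cstate srv01-c7_0

With both directions blocked, both peers should fall back to
Connecting and stay there until you "heal".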

>       Issue the iptables command on node 2. Journald output:
> 
>     ====
>     -- Logs begin at Sat 2018-02-10 17:51:59 GMT. --
>     Feb 11 06:20:18 m3-a02n01.alteeve.com crmd[2817]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: PingAck did not arrive in time.
>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0: susp-io( no -> fencing)

DRBD suspends I/O here due to the fencing policy, as configured.
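
For reference, the relevant pieces of such a resource config look
something like this (DRBD 9 style; the handler paths are the usual
pacemaker integration scripts shipped with drbd-utils, adjust to your
installation):

resource srv01-c7_0 {
    net {
        fencing resource-and-stonith;
    }
    handlers {
        fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
        unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
    }
    # volumes, nodes, connections, etc. omitted
}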

>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02dr01.alteeve.com: Preparing remote state change 1400759070 (primary_nodes=1, weak_nodes=FFFFFFFFFFFFFFFA)
>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02dr01.alteeve.com: Committing remote state change 1400759070
>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( DUnknown -> Outdated )

But state changes are relayed through all connected nodes,
and node02 confirms that it now knows it is Outdated.
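
A convenient way to watch such (relayed) state changes live is
drbdsetup's event stream, e.g. (resource name from your logs):

drbdsetup events2 srv01-c7_0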

>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0: new current UUID: 769A55B47EB143CD weak: FFFFFFFFFFFFFFFA
>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0: susp-io( fencing -> no)

Which means we may resume I/O after bumping the data generation UUID,
and don't have to call out to any additional handler ("helper") scripts.

>     Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: conn( Unconnected -> Connecting )
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: Handshake to peer 1 successful: Agreed network protocol version 112

And since you only blocked one direction,
we can establish a new connection anyway, in the other direction.

We then do a micro (in this case even: empty) resync:

>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( Outdated -> Inconsistent ) repl( WFBitMapS -> SyncSource )
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: Began resync as SyncSource (will sync 0 KB [0 bits set]).
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: updated UUIDs 769A55B47EB143CD:0000000000000000:4CF0E17ADD9D1E0E:4161585F99D3837C
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( Inconsistent -> UpToDate ) repl( SyncSource -> Established )
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm unfence-peer
>     Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm unfence-peer exit code 0 (0x0)

And call out to "unfence" just in case.

>     The cluster still thinks all is well, too.

Pacemaker "status" shows DRBD in Master or Slave role,
but cannot show and "disconnected" aspect of DRBD anyways.
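
To see the "disconnected" part, ask DRBD itself; and when the fence
handler *does* fire, the crm-fence-peer script leaves a location
constraint behind in the CIB (id prefix as used by those scripts):

drbdadm status srv01-c7_0
cibadmin --query | grep drbd-fence-by-handler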

>     To verify, I can't connect to node 2:
> 
>     ====
>     [root@m3-a02n01 ~]# telnet m3-a02n02.sn 7788

But node 2 could (and did) still connect to you ;-)
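
The complementary check would have been from node 2 (node 1 hostname
assumed to follow the same .sn naming):

[root@m3-a02n02 ~]# telnet m3-a02n01.sn 7788

Your INPUT rule matches destination ports 7788:7790 only; node 2's
outgoing SYN leaves via OUTPUT, and the replies come back with 7788
as *source* port, so that direction still works.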

> Note: I downed the dr node (node 3) and repeated the test. This time,
> the fence-handler was invoked. So I assume that DRBD did route through
> the third node. Impressive!

Yes, "sort of".

> So, is the Protocol C between 1 <-> 2 maintained when there is an
> intermediary node that is Protocol A?

"cluster wide state changes" need to propagate via all available
connections, and need to be relayed.

Data is NOT (yet) relayed.
One of those items listed in the todo book volume two :-)

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
