On 04/17/2018 03:43 AM, Roland Kammerer wrote:
On Mon, Apr 16, 2018 at 12:44:22PM -0400, dehacked wrote:
Greetings,

I have a small cluster used for OpenStack (Newton on CentOS 7 nodes). I have
2 main storage nodes, 1 OpenStack controller node and 5 'diskless'
hypervisors. It's configured with the hypervisors as satellite nodes and the
3 remaining servers as management nodes with the management volume, though
only the 2 storage nodes actually hold the rest of the user data.

I'm finding that drbdmanage hangs frequently trying to communicate with the
service. Even 'drbdmanage ping' will time out. Examining the service process,
I see it apparently busy connecting to another host which is itself hung.

Any ideas what's wrong or what troubleshooting steps I should be taking here?

Usually this is a sign that at least one of them is busy and tries to do
the same thing (e.g., create a resource, delete a resource,...) over and
over again. Usually that stops after a fail-count is reached. But if it
even takes longer than the TCP timeout we set, a node might not even be
able to report back that it failed doing something. And then this loops.
There have been fixes in that regard and the latest version has a
configurable TCP timeout.

Enable debugging, check if you detect such a "busy loop" in the syslogs.
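
One rough way to spot such a "busy loop" is to count how often the same message repeats in the log. The following is a minimal sketch, not part of drbdmanage: it assumes the traditional syslog line prefix (timestamp, then hostname), and the function name and threshold are made up for illustration.

```python
import re
from collections import Counter

# Assumed traditional syslog prefix: "Apr 16 12:44:22 host " (timestamp + hostname).
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+")
# PIDs such as "[1234]" change between restarts, so drop them before counting.
PID = re.compile(r"\[\d+\]")


def find_repeats(lines, threshold=5):
    """Return (message, count) pairs for messages seen at least `threshold` times.

    Hypothetical helper: a message that repeats many times may point to an
    operation that is being retried over and over.
    """
    counts = Counter()
    for line in lines:
        msg = PREFIX.sub("", line.strip())
        msg = PID.sub("", msg)
        if msg:
            counts[msg] += 1
    return [(m, c) for m, c in counts.most_common() if c >= threshold]
```

On a CentOS 7 node this could be fed the system log, e.g. find_repeats(open("/var/log/messages"), threshold=20), assuming the default rsyslog setup writes there. The threshold is a guess and needs tuning to how chatty the debug output is.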

Thanks for the suggestion. This did help track it down.

The issue ended up being LVM related - all nodes were checking all block devices for LVM labels and were getting hung up on all the DRBD devices that were being created and in some cases not properly configured. Pruning the LVM filters fixed it all up.
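
For anyone hitting the same thing: the exact filter is not reproduced here, so the following is only a sketch of how an LVM filter list (the filter / global_filter setting in the devices section of lvm.conf) decides which devices get scanned. The first pattern that matches a device wins, and a device that matches nothing is accepted. The filter shown is a hypothetical example that rejects all DRBD devices, and Python's re module only approximates LVM's own regex syntax.

```python
import re


def lvm_filter_accepts(device, patterns):
    """Emulate LVM filter evaluation for one device path.

    Each pattern is "a" (accept) or "r" (reject), a delimiter character,
    a regex, and the same delimiter again, e.g. "r|^/dev/drbd.*|".
    The first pattern whose regex matches decides the outcome; a device
    that matches no pattern is accepted.
    """
    for item in patterns:
        action, delim = item[0], item[1]
        regex = item[2:item.rindex(delim)]
        if re.search(regex, device):
            return action == "a"
    return True


# Hypothetical filter: never scan DRBD devices, scan everything else.
FILTER = ["r|^/dev/drbd.*|", "a|.*|"]

for dev in ("/dev/sda1", "/dev/drbd100"):
    print(dev, "scanned" if lvm_filter_accepts(dev, FILTER) else "skipped")
```

On the diskless hypervisors, where LVM has nothing to find on DRBD devices anyway, a reject rule of this kind keeps vgscan from opening /dev/drbdXX at all.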

Maybe these are possible bugs to be fixed? A node with no storage still runs vgscan, and opening /dev/drbdXX with no connections and diskless still waits for what I think is the autopromote timeout.

Although I only updated drbdmanage so far, so if some of these are already fixed, disregard.


Thanks

drbdmanage version 0.99.14
kernel driver version 9.0.9
drbd-utils version 9.1.1
all built from source tarballs

Every single one of them is outdated. At least try the latest drbdmanage.

Regards, rck
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
