Hi Lars, thanks for the answer.

On 18.06.2018 at 17:10, Lars Ellenberg wrote:
On Wed, Jun 13, 2018 at 01:03:53PM +0200, Artur Kaszuba wrote:
I know about the 3-node solution and I have used it for some time (since ~9.0.8),
but I had problems with stability and decided to switch to a
stacked configuration, hoping it would be more stable. As a last
resort I will downgrade to drbd8, where I never had any problems with
stability, but I would like to stay with 9 and after some time switch back
to a 3-node config.

By stability I mean this situation:
- latest version of drbd9 (9.0.14)
- kernel 4.13.0-43-generic on Ubuntu 16.04
- high disk usage/IO on drbd devices
- 3-node configuration
- random system crash on the "drbdadm disconnect/connect" command
When I disable one node everything works without problems and
disconnect/connect works perfectly. Before 9.0.14 I did not have such crashes,
but I had others, which are fixed now.

And you cannot be bothered to report "such crashes"
in a way that makes it possible to understand and fix those?

"random system crash" is not good enough :-/


Yep, I know it is not enough to find the reason for these crashes, and that is why I did not report them separately. I only asked why the stacking solution does not work in my case :).

Sorry, but I cannot write much more; this is happening in a production environment and I cannot run tests there.
I can add:
- simple tests meant to reproduce this situation, but without high disk usage, do not cause crashes
- the problems started after the upgrade from drbd 9.0.12 to 9.0.14 and drbd-utils 9.3.0-1ppa1~xenial1 to 9.4.0-1ppa1~xenial1; before this we did not have such crashes
- we have ~15 drbd resources in this environment, with high IO in a random pattern (databases, indexers, git, file servers, kvm etc.)

Unfortunately I cannot wait for the next fix,
I need a stable environment.

"I want it all, and I want it now" :-)

For the benefit of those that can afford to wait for the next fix,
maybe you should still report the crashes in a way that we can work with.


Sorry if I worded it badly. English is not my native language and I did not want to sound rude.
I was only describing this situation:
- the system worked without crashes for months
- the system is the core production environment in the company
- the drbd upgrade causes random crashes (3-node configuration for drbd9)
- we cannot manage or create drbd resources because the system could crash on any drbdadm connect/disconnect command (which already happened in the middle of the day when we were trying to reconnect the backup server :/)

This situation does not allow me to wait for the next fix; I need to find another solution or workaround.

I prefer to use a stacking configuration, even though it is deprecated in
DRBD9.  I decided to write this post because the stacked configuration is
still described in the documentation and should work? Unfortunately, for
now it is not possible to create such a configuration, or I missed
something :/

I know there are DRBD 9 users using "stacked" configurations out there.


Hmm, maybe they created their resources some time ago and drbd keeps working for already created resources. What I found is a problem with the initial synchronization to the backup server:
- the source server pair is up and one of them is primary
- the backup server tries to synchronize data (for the first time)
- the primary server tries to enter the Source state for the stacked device; at this moment it ends with an error:

[1636671.252028] drbd system-test-U/0 drbd113 z1: helper command: /sbin/drbdadm before-resync-source
[1636671.255933] drbd system-test-U/0 drbd113: before-resync-source handler returned 1, dropping connection.
[1636671.255942] drbd system-test-U z1: conn( Connected -> Disconnecting ) peer( Secondary -> Unknown )

- the same error (same exit code) occurs when I execute drbdadm before-resync-source directly:
'system-test-U' is a stacked resource, and not available in normal mode.
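
To make this kind of failure easier to spot across many resources, the kernel messages quoted above can be scanned for handlers that returned a non-zero code. This is only a rough sketch: the regular expression is derived from the single log excerpt in this mail and the message wording may differ between DRBD versions, and the names HANDLER_FAILED and find_failed_handlers are made up for this example.

```python
import re

# Pattern modelled on the one kernel message quoted in this mail, e.g.
#   drbd system-test-U/0 drbd113: before-resync-source handler returned 1, dropping connection.
# Assumption: other DRBD versions may word this message differently.
HANDLER_FAILED = re.compile(
    r"drbd (?P<resource>\S+?)(?:/\d+)? \S+: "
    r"(?P<handler>[\w-]+) handler returned (?P<code>\d+), dropping connection"
)


def find_failed_handlers(lines):
    """Return (resource, handler, exit_code) for each failed handler line."""
    failures = []
    for line in lines:
        match = HANDLER_FAILED.search(line)
        if match:
            failures.append(
                (match["resource"], match["handler"], int(match["code"]))
            )
    return failures


if __name__ == "__main__":
    import sys

    # Read kernel log text from standard input, one message per line.
    for resource, handler, code in find_failed_handlers(sys.stdin):
        print(f"{resource}: handler {handler} exited with {code}")
```

The script reads kernel log text on standard input, so it can be fed saved dmesg output from the primary node.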

I think it could be a problem in drbd-utils or in the drbd module:
- drbdadm before-resync-source should detect the type of the resource (without requiring the --stacking switch)
- or the drbd module should execute the drbdadm before-resync-source command with the "--stacking" switch when it starts the before-resync-source handler for stacked resources

Of course, it could be caused by my misconfiguration (test config in the initial mail), but I cannot find an error there :(

Maybe you missed to upgrade your drbd-utils?
Current drbd-utils version would be 9.4.0


I am already using the latest versions of drbd-utils and the drbd-dkms module:
ii drbd-dkms 9.0.14-1ppa1~xenial1 all RAID 1 over TCP/IP for Linux module source
ii drbd-utils 9.4.0-1ppa1~xenial1 amd64 RAID 1 over TCP/IP for Linux (user utilities)

If someone could help me understand this situation I would be really grateful.

--
Artur Kaszuba

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
