On 2018-08-14 08:36 AM, Philipp Reisner wrote:
> Hi,
>
> This is an upgrade every drbd-9 user should follow. It has two important
> fixes in the areas of
>
>  * handling IO errors reported by the backing device
>
>    Handling of IO errors on the backend was completely broken since
>    drbd-9.0, including the recovery options like replacing a failed
>    disk. Even worse, when the disk was replaced it was possible that
>    DRBD would read from the new disk before the full sync finished.
>    All fixed now, but very embarrassing.
>
>  * correctly handle UUIDs in case of live-migration
>
>    That was the root cause for various strange behavior, e.g. a node
>    considering some other as not up-to-date while the peer considers
>    itself as up-to-date.
>
> The goody of the release is that the submit code path was optimized
> a bit, and that gives up to 30% increase (depending on CPU model and
> performance of the backing device) in IOPS.
>
> A lot of effort was spent on writing more tests for the drbd9 test suite
> ( https://github.com/LINBIT/drbd9-tests ). DRBD-8.4 had its own, which
> was more complete at its time, but now it is overdue to have a testing
> coverage at least as good for the drbd-9 code base.
>
> Apart from work on the test suite we will continue to put effort into
> optimizing the IO submit code path. Very fast NVMe devices keep the
> pressure on us to be able to fully utilize them when used as backing
> device for DRBD.
>
> Note: We will update the PPA on Thursday (Aug 16). Sorry for the delay
> (vacations and a bank holiday are the reasons).
>
> 9.0.15 (api:genl2/proto:86-114/transport:14)
> --------
>  * fix tracking of changes (on a secondary) against the lost disk of a
>    primary and also fix re-attaching in case the disk is replaced (has
>    new meta-data)
>  * fix live migrate of VMs on DRBD when migrated to/from diskless
>    nodes; before that fix, a race condition could lead to one of the
>    nodes seeing the other one as consistent only
>  * fix an IO deadlock in DRBD when the activity log on a secondary runs
>    full; in the real world, this was very seldom triggered but can be
>    easily reproduced with a workload that touches one block every 4M
>    and writes them all in a burst
>  * fix hanging demote after IO error followed by attaching the disk again
>    and the corresponding resync
>  * fix DRBD dropping connection after an IO error on the secondary node
>  * new module parameter to disable support for older protocol versions,
>    and in case you configured peers that are not expected to connect it
>    might have positive effects because then this node does not need to
>    assume that such a peer is ancient
>  * improve details when online changing devices from diskless to with
>    disk and vice versa (including peers freeing bitmap slots)
>  * remove no longer relevant compat tests
>  * expose openers via debugfs; that helps to answer the questions "why
>    does DRBD not demote to secondary?" and "why does it tell me 'Device
>    is held open by someone'?"
>  * optimize IO submit code path; this can improve IOPS up to 30% on a
>    system with fast backend storage; lowers CPU load caused by DRBD on
>    every workload
>  * compat for v4.18 kernel
>
> http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.15-1.tar.gz
> https://github.com/LINBIT/drbd-9.0/releases/tag/drbd-9.0.15
>
> best regards,
>  Phil
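
For anyone trying to picture the activity-log reproducer mentioned in the changelog ("touches one block every 4M and writes them all in a burst"), here is a rough sketch of that access pattern. It is only my reading of the description, not LINBIT's test case. The 4 KiB block size and the helper names burst_offsets and write_burst are assumptions of mine, and plain synchronous pwrite calls may well not be aggressive enough to trigger the deadlock on an affected version; a real reproducer would probably keep many writes in flight at once. Only point it at a scratch device or file, since it overwrites data.

```python
import os

STRIDE = 4 * 1024 * 1024  # "every 4M" from the changelog entry
BLOCK = 4096              # assumed block size; the announcement does not state one


def burst_offsets(device_size, stride=STRIDE, block=BLOCK):
    """Return one offset per stride, keeping each block inside the device."""
    return list(range(0, device_size - block + 1, stride))


def write_burst(path, device_size, stride=STRIDE, block=BLOCK):
    """Write one block at every stride offset back to back, then flush.

    Returns the number of blocks written. This overwrites data at `path`.
    """
    payload = b"\xa5" * block
    offsets = burst_offsets(device_size, stride, block)
    fd = os.open(path, os.O_WRONLY)
    try:
        for off in offsets:
            os.pwrite(fd, payload, off)  # positional write, no seek needed
        os.fsync(fd)                     # push the whole burst down to the device
    finally:
        os.close(fd)
    return len(offsets)
```

On a 16 MiB scratch file this touches four blocks, at offsets 0, 4 MiB, 8 MiB and 12 MiB, so each write lands 4 MiB apart from the previous one.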

Congrats on the release!

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
