Re: [DRBD-user] ZFS storage backend failed

2018-02-21 Thread Julien Escario
On 21/02/2018 at 04:07, Igor Cicimov wrote:
> 
> 
> On Tue, Feb 20, 2018 at 9:55 PM, Julien Escario wrote:
> 
> On 10/02/2018 at 04:39, Igor Cicimov wrote:
> > Did you tell it to?
> > https://docs.linbit.com/doc/users-guide-84/s-configure-io-error-behavior/
> 
> 
> 
> Sorry for the late answer: I moved on to performance tests with a ZFS RAID1
> backend. I'll retry the backend failure a little later.
> 
> But... as far as I understand, the 'detach' behavior should be the default,
> no?
> 
> 
> I think the default is/was for DRBD to "pass on" the error to the higher
> layer, which should decide itself how to handle it.

Perhaps, but I noticed a strange parameter in the zpool attributes:
drbdpool  failmode   wait   default

failmode = wait? That's something that could lead to the DRBD stack not being
informed of the zpool failure.
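
For reference, checking and changing that property would look roughly like
this (a sketch; pool name taken from the output above, values as described in
zpool(8), where 'continue' returns EIO to new writes instead of blocking):

  # show the current failmode for the pool
  zpool get failmode drbdpool
  # return EIO to new writes on failure instead of hanging them
  zpool set failmode=continue drbdpool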

So, two more things to test as soon as I have finished the horribly long list
of performance parameters.

Best regards,
Julien Escario
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] ZFS storage backend failed

2018-02-20 Thread Igor Cicimov
On Tue, Feb 20, 2018 at 9:55 PM, Julien Escario wrote:

> On 10/02/2018 at 04:39, Igor Cicimov wrote:
> > Did you tell it to?
> > https://docs.linbit.com/doc/users-guide-84/s-configure-io-error-behavior/
>
> Sorry for the late answer: I moved on to performance tests with a ZFS RAID1
> backend. I'll retry the backend failure a little later.
>
> But... as far as I understand, the 'detach' behavior should be the default,
> no?
>

I think the default is/was for DRBD to "pass on" the error to the higher
layer, which should decide itself how to handle it.
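
To get the detach behaviour instead, the resource would need something along
these lines in its disk section (a minimal sketch based on the users' guide
page linked above; resource name taken from the status output later in the
thread):

  resource vm-101-disk-1 {
    disk {
      on-io-error detach;   # drop the failing backing device and go diskless,
                            # serving I/O through the peer instead of passing
                            # the error up to the VM
    }
    # ... existing net/volume sections unchanged ...
  }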


> My thought is that DRBD wasn't notified of, or didn't detect, the blocked
> I/Os on the backend. Perhaps it's a specific behavior of ZFS.
>
> More tests to come.
>
> Best regards,
> Julien Escario
> ___
> drbd-user mailing list
> drbd-user@lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] ZFS storage backend failed

2018-02-20 Thread Julien Escario
On 10/02/2018 at 04:39, Igor Cicimov wrote:
> Did you tell it
> to? https://docs.linbit.com/doc/users-guide-84/s-configure-io-error-behavior/

Sorry for the late answer: I moved on to performance tests with a ZFS RAID1
backend. I'll retry the backend failure a little later.

But... as far as I understand, the 'detach' behavior should be the default, no?

My thought is that DRBD wasn't notified of, or didn't detect, the blocked I/Os
on the backend. Perhaps it's a specific behavior of ZFS.

More tests to come.

Best regards,
Julien Escario
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] ZFS storage backend failed

2018-02-09 Thread Igor Cicimov
On 10 Feb 2018 5:02 am, "Julien Escario" wrote:

Hello,
I'm doing a lab with a zpool as the storage backend for DRBD (storing VM
images with Proxmox).

Right now, it's pretty good once tuned and I've been able to achieve 500 MB/s
write speed, with just one curiosity about concurrent writes from both
hypervisors in the cluster, but that's not the point here.

To complete the resiliency tests, I simply unplugged a disk from a node. My
thought was that DRBD would just detect the ZFS failure and detach the
resources from the failed device.


Did you tell it to?
https://docs.linbit.com/doc/users-guide-84/s-configure-io-error-behavior/
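
A quick way to see what is currently in effect (sketch; resource name taken
from the status output below, DRBD 9 style commands):

  # dump the configuration as DRBD parsed it
  drbdadm dump vm-101-disk-1
  # or query the options of the running resource
  drbdsetup show vm-101-disk-1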


But... nothing. I/O just hangs on VMs running on the 'failed' node.

My zpool status:

NAME      STATE    READ WRITE CKSUM
drbdpool  UNAVAIL     0     0     0  insufficient replicas
  sda     UNAVAIL     0     0     0

but drbdadm shows this for the locally hosted VM (on the failed node):
vm-101-disk-1 role:Primary
  disk:UpToDate
  hyper-test-02 role:Secondary
    peer-disk:UpToDate

and the remote VM (on the 'sane' node, from the failed node's point of view):
vm-104-disk-1 role:Secondary
  disk:Consistent
  hyper-test-02 connection:NetworkFailure


So it seems that DRBD didn't detect the I/O failure.

Is there a way to force automatic failover in this case? I probably missed a
detection mechanism.
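
As a manual workaround while investigating, forcibly detaching the backing
device should switch the resource to diskless and let it keep serving I/O
from the peer (sketch; resource name from the status output above):

  drbdadm detach vm-101-disk-1
  # if the normal detach itself blocks on the dead backend, the forced
  # detach documented for drbdsetup (detach --force) may be needed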

Best regards,
Julien Escario

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user