Hi Mike,

On 22.07.19 at 16:48, Mike Christie wrote:
> On 07/22/2019 06:00 AM, Marc Schöchlin wrote:
>>> With older kernels no timeout would be set for each command by default,
>>> so if you were not running that tool then you would not see the nbd
>>> disconnect+io_errors+xfs issue. You would just see slow IOs.
>>>
>>> With newer kernels, like 4.15, nbd.ko always sets a per command timeout
>>> even if you do not set it via a nbd ioctl/netlink command. By default
>>> the timeout is 30 seconds. After the timeout period then the kernel does
>>> that disconnect+IO_errors error handling which causes xfs to get errors.
>>>
>> Did I understand you correctly: setting an unlimited timeout should prevent
>> crashes on kernel 4.15?
> It looks like with newer kernels there is no way to turn it off.
>
> You can set it really high. There is no max check and so it depends on
> various calculations and what some C types can hold and how your kernel
> is compiled. You should be able to set the timer to an hour.
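
For reference, my understanding of how such a timeout tool works is roughly
the following untested C sketch (the binary name, device path and timeout
value are just placeholders; as far as I know NBD_SET_TIMEOUT takes the
value in seconds):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nbd.h>

int main(int argc, char **argv)
{
    /* usage: ./nbd_timeout <nbd device> <timeout in seconds> */
    if (argc != 3) {
        fprintf(stderr, "usage: %s <nbd device> <timeout seconds>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* NBD_SET_TIMEOUT from <linux/nbd.h>; e.g. 3600 for a one hour timeout */
    if (ioctl(fd, NBD_SET_TIMEOUT, (unsigned long)atoi(argv[2])) < 0) {
        perror("ioctl NBD_SET_TIMEOUT");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}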

Okay, I already experimented with high timeouts (e.g. 600 seconds). As far as
I remember, this led to a pretty unusable system when I put high amounts of IO
on the EC volume.
This system also runs a krbd volume which saturates the system with ~30-60%
iowait - that volume never had a problem.

A commenter on https://tracker.ceph.com/issues/40822#change-141205 suggested
that I reduce the rbd cache.
What do you think about that?
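
If it helps, the settings I would try to lower are something like the
following ceph.conf fragment (the values are only illustrative, not a
recommendation):

[client]
rbd cache = true
# e.g. 16 MiB instead of the 32 MiB default
rbd cache size = 16777216
rbd cache max dirty = 8388608
rbd cache target dirty = 4194304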

>
>> For testing purposes I set the timeout to unlimited ("nbd_set_timeout
>> /dev/nbd0 0", on an already mounted device).
>> I re-executed the problem procedure and discovered that the compression
>> procedure does not crash at the same file, but crashes 30 seconds later
>> with the same crash behavior.
>>
> 0 will cause the default timeout of 30 secs to be used.

Okay, then the usage description in
https://github.com/OnApp/nbd-kernel_mod/blob/master/nbd_set_timeout.c does not
seem to be correct :-)
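
If I read drivers/block/nbd.c in recent kernels correctly, the timeout setup
looks roughly like this (paraphrased from memory, details may differ per
kernel version):

/* paraphrased sketch of the NBD timeout handling in recent kernels */
nbd->tag_set.timeout = timeout * HZ;
if (timeout)
        blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
else
        blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); /* 0 -> 30s default */

which would match what you describe: passing 0 means the 30 second default,
not "no timeout".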

Regards
Marc
