Right, I actually ended up deadlocking rbd-nbd; that's why I switched
over to rbd-replay.
The flow was (image spec elided, device assumed to be /dev/nbd0):

rbd-nbd map &

unmap()
{
    rbd-nbd unmap /dev/nbd0
}

# wait until the nbd device is actually up
while true; do
    lsblk --noempty /dev/nbd0
    r=$?
    [ $r -eq 32 ] && continue    # 32: device not found (yet)
    [ $r -eq 0 ] && break        # 0: device is there and non-empty
done

dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync
What I did was to ctrl+c the process right as I started it. Maybe
adding the following just before the dd would be enough.
Sadly I have to reboot the whole VM afterwards :)
deadlock()
{
    sleep 0.1    # give the map/dd a brief head start
    exit 1       # then bail out, standing in for the ctrl+c
}
deadlock &
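
Put together (image spec still elided, device assumed to be /dev/nbd0),
the idea is roughly:

rbd-nbd map &                # map in the background
# ... wait loop from above until /dev/nbd0 shows up ...
deadlock &                   # exits ~0.1s later, in place of the ctrl+c
dd if=/dev/random of=/dev/nbd0 bs=4096 count=1 oflag=sync
unmap                        # rbd-nbd unmap, as defined above
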
On Wed, Dec 21, 2022 at 10:22 PM Sam Perman <[email protected]> wrote:
>
> Thanks, i'll take a look at that. For reference, the deadlock we are seeing
> looks similar to the one described at the bottom of this issue:
> https://tracker.ceph.com/issues/52088
>
> thanks
> sam
>
> On Wed, Dec 21, 2022 at 4:04 PM Josef Johansson <[email protected]> wrote:
>>
>> Hi,
>>
>> I made some progress with my testing on a similar issue. Maybe the test will
>> be easy to adapt to your case.
>>
>> https://tracker.ceph.com/issues/57396
>>
>> What I can say though is that I don't see the deadlock problem in my testing.
>>
>> Cheers
>> -Josef
>>
>> On Wed, 21 Dec 2022 at 22:00, Sam Perman <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> I'm trying to chase down a deadlock we occasionally see on the client side
>>> when using rbd-nbd and have a question about a lingering process we are
>>> seeing.
>>>
>>> I have a simple test script that will execute the following in order (a rough
>>> sketch of the commands follows the list):
>>>
>>> * use rbd to create a new image
>>> * use rbd-nbd to map the image locally
>>> * mkfs a file system
>>> * mount the image locally
>>> * use dd to write some dummy data
>>> * unmount the device
>>> * use rbd-nbd to unmap the image
>>> * use rbd to remove the image
>>>
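>>> Roughly, the command sequence is (pool/image name, size, and filesystem below
>>> are just placeholders):
>>>
>>> rbd create --size 1G rbd/nbdtest
>>> rbd-nbd map rbd/nbdtest                  # prints the nbd device, e.g. /dev/nbd0
>>> mkfs.ext4 /dev/nbd0
>>> mkdir -p /mnt/nbdtest && mount /dev/nbd0 /mnt/nbdtest
>>> dd if=/dev/zero of=/mnt/nbdtest/dummy bs=1M count=16 oflag=sync
>>> umount /mnt/nbdtest
>>> rbd-nbd unmap /dev/nbd0
>>> rbd rm rbd/nbdtest
>>>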
>>> After this is all done, there is a lingering process that I'm curious
>>> about.
>>>
>>> The process is called "[kworker/u9:0-knbd0-recv]" (in state "I") and is a
>>> child of "[kthreadd]" (in state "S").
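>>>
>>> (I'm seeing it with something along the lines of the command below; the exact
>>> kworker name will of course differ between runs.)
>>>
>>> ps -e -o pid,ppid,stat,comm | grep -E 'knbd|kthreadd'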
>>>
>>> Is this normal? I don't see any specific problems with it, but I'm
>>> eventually going to ramp up this test to use a lot of concurrency to see if
>>> I can reproduce the deadlock we are seeing, and I want to make sure I'm
>>> starting clean.
>>>
>>> Thanks for any insight you have!
>>> sam
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]