Hi,

I’m working on setting up a test without the bitmap. To keep things simple for
myself: would it help if I just use “mdadm --grow --bitmap=none” to disable it,
or is that futile?
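
For reference, a minimal sketch of what such a test could look like from
userspace. It only covers the mdadm route asked about above; whether that is
equivalent to bypassing the bitmap inside the kernel is exactly the open
question. The device name /dev/md0 and the helper names (run, bitmap_off,
bitmap_on, show_state) are assumptions for illustration, not taken from the
affected machine. By default the helpers only print the commands.

```shell
#!/bin/sh
# Sketch: toggle the md write-intent bitmap for an A/B test.
# MD_DEV is a hypothetical device name; adjust it to the real array.
MD_DEV="${MD_DEV:-/dev/md0}"
# DRY_RUN=1 (the default) prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bitmap_off() {
    # Remove the write-intent bitmap from the running array.
    run mdadm --grow --bitmap=none "$MD_DEV"
}

bitmap_on() {
    # Put an internal bitmap back once the test is finished.
    run mdadm --grow --bitmap=internal "$MD_DEV"
}

show_state() {
    # An active bitmap shows up as a "bitmap:" line under the array.
    run cat /proc/mdstat
}
```

Sourcing the file and calling bitmap_off with DRY_RUN=0 (as root) would apply
the change; show_state before and after should confirm whether the "bitmap:"
line is gone, and bitmap_on restores it afterwards.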

Christian

> On 23. Oct 2024, at 03:13, Yu Kuai <yuku...@huaweicloud.com> wrote:
> 
> Hi,
> 
> On 2024/10/22 23:02, Christian Theune wrote:
>> Hi,
>> I had to put this issue aside, and since Yu indicated he was busy I haven’t
>> followed up yet.
>> @Yu: I don’t have new insights, but I have a basically identical machine
>> to which I will soon start adding new data with a similar structure.
>> I couldn’t directly reproduce the issue there, likely because the network
>> is a bit slower: it’s connected from a remote site and has only 1G instead
>> of 10G, due to the long distances.
>> Let me know if you’re interested in following up here and I’ll try to make 
>> room on my side to get you more input as needed.
> 
> Yes, sorry that I was totally busy with other things. :(
> 
> BTW, what is the result after bypassing the bitmap (disabling the bitmap
> by kernel hacking)?
> 
> Thanks,
> Kuai
> 
>> Christian
>>> On 15. Aug 2024, at 13:14, Yu Kuai <yuku...@huaweicloud.com> wrote:
>>> 
>>> Hi,
>>> 
>>> On 2024/08/15 18:03, Christian Theune wrote:
>>>> Hi,
>>>> small insight: even given my dataset that can reliably trigger this (after 
>>>> around 1.5 hours of rsyncing) it does not trigger on a specific set of 
>>>> files. I’ve deleted the data and started the rsync on a fresh directory 
>>>> (not a fresh filesystem, I can’t delete that as it carries important data) 
>>>> but it doesn’t always get stuck on the same files, even though rsync 
>>>> processes them in a repeatable order.
>>>> I’m wondering how to generate more insights from that. Maybe keeping a 
>>>> blktrace log might help?
>>>> It sounds like the specific pattern relies on XFS doing a specific thing 
>>>> there …
>>>> Wild idea: maybe running the xfstests suite on an in-memory raid 6 setup
>>>> could reproduce this?
>>>> I’m guessing that the xfs people do not regularly run their test suite on 
>>>> a layered setup like mine with encryption and software raid?
>>> 
>>> That sounds great.
>>>>  Christian
>>>>> On 15. Aug 2024, at 08:19, Christian Theune <c...@flyingcircus.io> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>>> On 14. Aug 2024, at 10:53, Christian Theune <c...@flyingcircus.io> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>>> On 12. Aug 2024, at 20:37, John Stoffel <j...@stoffel.org> wrote:
>>>>>>> 
>>>>>>> I'd probably just do the RAID6 tests first, get them out of the way.
>>>>>> 
>>>>>> Alright, those are running right now - I’ll let you know what happens.
>>>>> 
>>>>> I’m not making progress here. I can’t reproduce this on an in-memory
>>>>> loopback raid 6. However, I can’t fully reproduce the rsync run either.
>>>>> For me the hangup only triggered after around 1.5 hours of progress on
>>>>> the NVMe. I can only create around 20 GiB worth of raid 6 volume on
>>>>> this machine. I’ve tried running rsync until it exhausts the space,
>>>>> deleting the content and running rsync again, but I feel like this isn’t
>>>>> sufficient to trigger the issue. :(
>>>>> 
>>>>> I’m trying to find out whether any specific pattern in the files around
>>>>> the time it locks up might be relevant here, so that I can run the rsync
>>>>> over just that portion.
>>>>> 
>>>>> On the plus side, I have a script now that can create the various 
>>>>> loopback settings quickly, so I can try out things as needed. Not that 
>>>>> valuable without a reproducer, yet, though.
>>>>> 
>>>>> @Yu: you mentioned that you might be able to provide me a kernel that 
>>>>> produces more error logging to diagnose this? Any chance we could try 
>>>>> that route?
>>> 
>>> Yes, however, I still need some time to sort out the internal process of
>>> raid5. I'm quite busy with some other work stuff, and I'm familiar with
>>> raid1/10, but not so much with raid5. :(
>>>
>>> The main idea is to figure out why IOs are not dispatched to the
>>> underlying disks.
>>> 
>>> Thanks,
>>> Kuai
>>> 
>>>>> 
>>>>> Christian
>>>>> 
>>>>> -- 
>>>>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>>>>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>>>>> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
>>>>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian 
>>>>> Zagrodnick
>>>> Kind regards,
>>>> Christian Theune
>> Kind regards,
>> Christian Theune


Kind regards,
Christian Theune

-- 
Christian Theune · c...@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick

