rvalles wrote:

>On Thu, Jun 01, 2006 at 01:59:20AM +0200, rvalles wrote:
>  
>
>>>ftp://ftp.namesys.com/pub/reiser4-for-2.6/2.6.16/reiser4-for-2.6.16-3.patch.gz
>>> contains the most recent reiser4 code which is considered stable inside 
>>>Namesys.
>>>Please try it. Any feedback is welcome.
>>>      
>>>
>
>  
>
>>Finally, the "fsync/mmap >1minute writing to disk while halting all IO"
>>I reported a few weeks ago seems to be gone. Are you aware of that? What
>>caused it in the first place?
>>
>>01:53:17 up 1 day, 23:59, 11 users,  load average: 0.37, 0.14, 0.21
>>
>>This allows me to use a kernel newer than 2.6.12.6 for the first time.
>>Now, let's hope it stays ok and the bug doesn't show itself in just a
>>few hours.
>>    
>>
>03:57:54 up 7 days,  2:03, 11 users,  load average: 1.66, 1.64, 1.91
>
>Ok, it's been a week already. While something has improved, the problem
>seems to still be there; it triggers much less often, and behaves
>differently.
>  
>
Well, we are getting close to this being at the top of the queue, so
over the next week can you collect data for us that says: I see such and
such delays with the new code?  I hope that we can then try to fix it. 
We need to write a patch that makes the generic write code pass to
reiser4 more than 4k at a time, change our read code to submit bios more
than 4k at a time, test the generic read code to see that it does not
let device congestion cause it to request from us 4k at a time, and then
we (I hope) can turn to these pauses and fsync optimization.  I think
the pauses and the fsync optimization involve related code.  I think we
need to let users control whether fsync is lazy or aggressive, and (this
is speculative) we need to refine our throttling of atom growth so that
when atoms are forced to flush they do so smoothly.  I need to
understand though, does it ever happen without fsync being waited on in
the new code?  If the problem is purely that the process doing fsync
waits too long, then that is much easier than if either it happens
without fsync or fsync causes every process waiting on IO to hang. 
Also, if it is only that fsync causes every process waiting on IO to
hang, IO scheduler tweaks might help.  On the other hand, if pauses
happen every 600 seconds, then we have a deeper issue.
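
One possible way to collect the delay data requested above is a small
probe that appends to a file, calls fsync, and records how long the
call blocked.  This is only a sketch and not a Namesys tool: the
function names (time_fsync, collect, slow_samples), the 4096-byte write
size, and the 1-second threshold are all assumptions chosen for
illustration.  It measures only the wait seen by the process doing the
fsync, not whether other processes stall.

```python
import os
import time


def time_fsync(path, size=4096):
    """Append `size` bytes to `path`, fsync, and return the seconds fsync blocked."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, b"x" * size)
        start = time.monotonic()
        os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)


def collect(path, samples=10, interval=1.0, size=4096):
    """Return a list of (wall_clock_timestamp, fsync_delay_seconds) pairs."""
    rows = []
    for _ in range(samples):
        rows.append((time.time(), time_fsync(path, size)))
        time.sleep(interval)
    return rows


def slow_samples(rows, threshold=1.0):
    """Keep only the samples whose fsync delay exceeded `threshold` seconds."""
    return [row for row in rows if row[1] > threshold]
```

Running collect() against a file on the reiser4 partition for a long
stretch and then looking at the timestamps returned by slow_samples()
would show whether long waits cluster around a fixed period or only
follow particular activity.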

>For now, I've only managed to trigger it using mutt: the moment I "send"
>mail, it happens about 90% of the time. I can now, though, edit files with
>vim without going crazy.
>
>My brother, who uses the computer (I only ssh to it) hasn't noticed any
>problem, therefore if the problem isn't fixed, it is well hidden. Also,
>when I trigger it, it doesn't seem to affect whatever I/O is being done
>in parallel with the task that caused it, which makes me think it triggers
>far more often than I notice.
>
>Haven't tried -4, should I? I think I've heard it only fixes build-as-module
>problems, but I really don't know.
>
>Thanks,
>Roc Vallès Domènech
>
>  
>
