Hello,

On 02/09/10 17:28, Tina Friedrich wrote:
> Hi Andreas,
>
> thanks for your answer.
>
>>> Causing most grief at the moment is that we sometimes see delays
>>> writing files. From the writing client's end, it simply looks as if I/O
>>> stops for a while (we've seen 'pauses' of anything up to 10 seconds).
>>> This appears to be independent of which client does the writing and of the
>>> software doing the writing. We investigated this a bit using strace and
>>> dd; the 'slow' calls appear to always be either open, write, or close
>>> calls. Usually, these take well below 0.001s; in around 0.5% or 1% of
>>> cases, they take up to multiple seconds. It does not seem to be
>>> associated with any specific OST, OSS, client or anything; there is
>>> nothing in any log files or any exceptional load on MDS or OSS or any of
>>> the clients.
>>
>> This is most likely associated with delays in committing the journal on the 
>> MDT or OST, which can happen if the journal fills completely.  Having larger 
>> journals can help, if you have enough RAM to keep them all in memory and not 
>> overflow.  Alternately, if you make the journals small it will limit the 
>> latency, at the cost of reducing overall performance.  A third alternative 
>> might be to use SSDs for the journal devices.
>
> Just to double check - that would be the file system journal, I assume?
>
> That makes a lot of sense; is there a way to verify that this is the
> issue we're having?
>
> Journal size appears to be 400M - if we were to try increasing it, how
> would we determine what best to set it to?

That was meant to be 'if we were to try increasing or decreasing it' - 
sounds to us as if decreasing might be the better option (as in, if this 
is the journal flushing, having less journal to flush would probably be 
better - or is that the wrong idea?)
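
To narrow down when the stalls happen, the slow calls could be timed
outside of strace and logged with wall-clock timestamps, so they can be
lined up against whatever the servers were doing at that moment. Below is
a minimal sketch of such a probe in Python. The function name, the probe
file names and the 0.1 s threshold are made up for illustration; it only
shows that a call was slow, not why, so it cannot by itself confirm that
the journal is the cause.

```python
import os
import time


def time_write_calls(directory, iterations=200, block_size=1 << 20, threshold=0.1):
    """Time open, write and close separately for small probe files.

    Returns a list of (wall_clock_time, iteration, call_name, seconds)
    for every call that took at least `threshold` seconds.
    All names and defaults here are illustrative assumptions.
    """
    slow = []
    block = b"\0" * block_size
    for i in range(iterations):
        path = os.path.join(directory, f"latency_probe_{i}.dat")
        timings = {}

        start = time.perf_counter()
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        timings["open"] = time.perf_counter() - start

        start = time.perf_counter()
        os.write(fd, block)  # single write call; a short write is not retried
        timings["write"] = time.perf_counter() - start

        start = time.perf_counter()
        os.close(fd)
        timings["close"] = time.perf_counter() - start

        os.unlink(path)  # clean up the probe file (not timed)

        for call, elapsed in timings.items():
            if elapsed >= threshold:
                # time.time() gives a timestamp comparable with server logs
                slow.append((time.time(), i, call, elapsed))
    return slow
```

Run against a directory on the Lustre mount, the returned list gives the
time of each slow open, write or close, which could then be compared with
activity on the MDS and OSSes around the same moment.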


>>> The other issue is that we frequently see delays when trying to read a
>>> file. It sometimes takes more than 60s for a file to be visible on a
>>> machine after the initial write on a different machine has completed
>>> (both machines being Lustre clients). Again, there is nothing in the
>>> logs, nor exceptional load on any of the machines.
>>
>> This is probably just a manifestation of the first problem.  The issue 
>> likely isn't in the read, but a delay in flushing the data from the cache of 
>> the writing client.  There were fixes made in 1.8 to increase the IO 
>> priority for clients writing data under a lock that other clients are 
>> waiting on.
>
> We kind of suspected them to be related, yes.
>
> Tina
>
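
The visibility delay could be measured in a similar way: start a small
poller on the reading client, then write the file on the other client, and
record how long it takes for the file to show up. A minimal sketch follows,
assuming a simple existence check is a fair stand-in for "visible"; the
function name and the default timeout and polling interval are
illustrative only.

```python
import os
import time


def wait_until_visible(path, timeout=120.0, interval=0.1):
    """Poll until `path` exists; return the seconds waited, or None on timeout.

    Uses a monotonic clock so the result is unaffected by clock adjustments.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

The measured wait includes the time the writer itself takes, so the two
ends would need to be started in a known order (or have synchronised
clocks) for the numbers to be comparable between runs.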


-- 
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442