On Sun, Sep 14, 2014 at 12:30 PM, Steven Hartland <[email protected]>
wrote:

>
> ----- Original Message ----- From: "Matthew Ahrens" <[email protected]>
> To: "illumos-zfs" <[email protected]>; "Steven Hartland" <
> [email protected]>
> Cc: "developer" <[email protected]>
> Sent: Sunday, September 14, 2014 6:31 PM
> Subject: Re: [zfs] ZFS Write Throttle Dirty Data Limit Interation with
> Free Memory
>
>
>
>  On Sun, Sep 14, 2014 at 10:00 AM, Steven Hartland via illumos-zfs <
>> [email protected]> wrote:
>>
>>  We've been investigating a problem with stalls on FreeBSD when
>>> using ZFS, and one of the current theories, which is producing some
>>> promising results, centres on the new IO scheduler, specifically
>>> around the fact that the dirty data limit is a static limit.
>>>
>>> The stalls occur when memory is close to the low water mark around
>>> where paging will be triggered. At this time if there is a burst of
>>> write IO, such as a copy from a remote location, ZFS can rapidly
>>> allocate memory until the dirty data limit is hit.
>>>
>>> This rapid memory consumption exacerbates the low memory situation
>>> resulting in increased swapping and more stalls to the point where
>>> the machine can essentially become unusable for a good period
>>> of time.
>>>
>>> I will say it's not clear whether this only affects FreeBSD, due to
>>> the variations in how the VM interacts with ZFS.
>>>
>>> Karl, one of the FreeBSD community members who has been suffering
>>> from this issue in his production environments, has been experimenting
>>> with recalculating zfs_dirty_data_max at the start of
>>> dmu_tx_assign(..) to take into account free memory.
>>>
>>> While this has produced good results in his environment, eliminating
>>> the stalls entirely while keeping IO usage high, it's not clear whether
>>> varying zfs_dirty_data_max could have undesired side effects.
>>>
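A minimal userspace sketch of the recalculation idea described above. All names, the 25% fraction, and the floor/ceiling values are hypothetical illustrations, not the actual patch or the real kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bounds; the real tunables differ. */
static const uint64_t dirty_max_floor = 64ULL << 20;    /* never below 64 MB */
static const uint64_t dirty_max_ceiling = 4ULL << 30;   /* static upper bound */

/*
 * Recompute the dirty data limit from currently free memory, as might
 * be done at the start of dmu_tx_assign(), so a burst of writes cannot
 * dirty more than a fraction of what the system can spare.
 */
static uint64_t
recalc_dirty_data_max(uint64_t free_mem)
{
	uint64_t limit = free_mem / 4;		/* e.g. 25% of free memory */

	if (limit < dirty_max_floor)
		limit = dirty_max_floor;
	if (limit > dirty_max_ceiling)
		limit = dirty_max_ceiling;
	return (limit);
}
```

The floor keeps throughput reasonable when memory is scarce, while the ceiling preserves the existing static cap; only the middle range tracks free memory.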
>>> Given both Adam and Matt read these lists I thought it would be an
>>> ideal place to raise this issue and get expert feedback on this
>>> problem and potential ways of addressing it.
>>>
>>> So the questions:
>>> 1. Is this a FreeBSD-only issue, or could other implementations
>>>   suffer from a similar memory starvation situation due to rapid
>>>   consumption until dirty data max is hit?
>>> 2. Should dirty max or its consumers be made memory-availability
>>>   aware, to ensure that swapping due to IO bursts is avoided?
>>>
>>>
>> This is probably the wrong solution.
>>
>> Are you sure that this only happens when writing, and not when reading?
>> All arc buffer allocation (including for writing) should go through
>> arc_get_data_buf(), which will evict from the ARC to make room for the new
>> buffer if necessary, based on arc_evict_needed().
>>
>
> The load is a mixture of reads and writes, with the trigger in this test
> being a large amount of writes over Samba by a backup process, so that
> doesn't rule out reads as a trigger.
>
> We've been investigating ARC allocation quite a bit, and ARC does indeed
> get pushed back. Adjusting ARC's target for free memory has helped, but
> any significant adjustment there has been demonstrated to cause other
> issues, such as ARC being pushed back to min for a considerable amount of
> time, if not indefinitely, as the VM never sees any pressure and hence
> doesn't scan INACT entries.
>
> With regards to buffers being allocated by arc_get_data_buf(), I can't see
> a path by which ARC will prevent a new buffer being allocated, even when
> arc_evict_needed() returns true.
>

It won't, but it will evict an existing buffer, thus freeing up memory for
the new one.
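A toy model of that behaviour: allocation is never refused, but when eviction is needed an existing buffer is freed first, so the total ARC size stays roughly flat. Everything here is illustrative, not the real arc.c logic:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t arc_size;		/* current ARC size */
static uint64_t arc_c = 1024;		/* ARC target size */

/* Would this allocation push us past the target? */
static int
arc_evict_needed(uint64_t newsize)
{
	return (arc_size + newsize > arc_c);
}

/* Allocate a buffer of `size`; evict old data first if over target. */
static void
arc_get_data_buf(uint64_t size)
{
	if (arc_evict_needed(size)) {
		/* Evict just enough to make room (never fails here). */
		uint64_t need = arc_size + size - arc_c;
		arc_size -= (need < arc_size) ? need : arc_size;
	}
	arc_size += size;	/* the allocation itself always succeeds */
}
```

In this model the allocation itself always goes ahead; what eviction buys you is that the memory comes out of the ARC rather than out of the rest of the system.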


>
> If that's the case, can't we hit min ARC yet still claim new buffers? If
> so, we can suddenly demand up to 10% of the system memory, all of which
> may require the VM to page before it can provide said memory.
>
>
Sure, the ARC can grow up to the minimum size without restriction.  Is your
ARC below the minimum size?
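A sketch of the rule being stated here: while the ARC is below its minimum size, eviction is never triggered, so new buffers grow the ARC freely up to that minimum. Names and values are illustrative:

```c
#include <assert.h>
#include <stdint.h>

static const uint64_t arc_c_min = 256;	/* illustrative minimum size */

/*
 * Eviction decision: below arc_c_min, never evict; otherwise evict
 * once the ARC has reached its current target.
 */
static int
arc_evict_needed_model(uint64_t arc_size, uint64_t target)
{
	if (arc_size < arc_c_min)
		return (0);		/* grow freely below the minimum */
	return (arc_size >= target);
}
```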

--matt
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
