On Sun, Sep 14, 2014 at 12:30 PM, Steven Hartland <[email protected]> wrote:
> ----- Original Message -----
> From: "Matthew Ahrens" <[email protected]>
> To: "illumos-zfs" <[email protected]>; "Steven Hartland" <[email protected]>
> Cc: "developer" <[email protected]>
> Sent: Sunday, September 14, 2014 6:31 PM
> Subject: Re: [zfs] ZFS Write Throttle Dirty Data Limit Interation with Free Memory
>
>> On Sun, Sep 14, 2014 at 10:00 AM, Steven Hartland via illumos-zfs <[email protected]> wrote:
>>
>>> We've been investigating a problem with stalls on FreeBSD when
>>> using ZFS, and one of the current theories, which is producing some
>>> promising results, lies within the new IO scheduler, specifically
>>> around the dirty data limit being a static limit.
>>>
>>> The stalls occur when memory is close to the low water mark, around
>>> where paging will be triggered. At this time, if there is a burst of
>>> write IO, such as a copy from a remote location, ZFS can rapidly
>>> allocate memory until the dirty data limit is hit.
>>>
>>> This rapid memory consumption exacerbates the low memory situation,
>>> resulting in increased swapping and more stalls, to the point where
>>> the machine can essentially become unusable for a good period
>>> of time.
>>>
>>> I will say it's not clear whether this only affects FreeBSD, due to
>>> the variations in how the VM interacts with ZFS, or not.
>>>
>>> Karl, one of the FreeBSD community members who has been suffering
>>> from this issue in his production environments, has been playing
>>> with recalculating zfs_dirty_data_max at the start of
>>> dmu_tx_assign(..) to take into account free memory.
>>>
>>> While this has produced good results in his environment, eliminating
>>> the stalls totally while keeping IO usage high, it's not clear if the
>>> variation of zfs_dirty_data_max could have undesired side effects.
>>>
>>> Given both Adam and Matt read these lists, I thought it would be an
>>> ideal place to raise this issue and get expert feedback on this
>>> problem and potential ways of addressing it.
>>>
>>> So the questions:
>>> 1. Is this a FreeBSD-only issue, or could other implementations
>>> suffer from a similar memory starvation situation due to rapid
>>> consumption until dirty data max is hit?
>>> 2. Should dirty max or its consumers be made memory-availability
>>> aware to ensure that swapping due to IO bursts is avoided?
>>
>> This is probably the wrong solution.
>>
>> Are you sure that this only happens when writing, and not when reading?
>> All arc buffer allocation (including for writing) should go through
>> arc_get_data_buf(), which will evict from the ARC to make room for the new
>> buffer if necessary, based on arc_evict_needed().
>
> The load is a mixture of reads and writes, with the trigger in this test
> being a large amount of writes over samba by a backup process, so that
> doesn't mean that reads aren't ever a trigger for this.
>
> We've been investigating ARC allocation quite a bit, and ARC does indeed
> get pushed back. Adjusting ARC's target for free has helped, but any
> significant adjustment on that has been demonstrated to cause other
> issues, such as ARC being pushed back to min for a considerable amount of
> time, if not indefinitely, as the VM never sees any pressure and hence
> doesn't scan INACT entries.
>
> With regards to buffers being allocated by arc_get_data_buf(), I can't see
> a path by which ARC will prevent a new buffer being allocated even when
> arc_evict_needed().

It won't, but it will evict an existing buffer, thus freeing up memory for
the new one.

> If that's the case, can't we hit min ARC and yet still claim new buffers?
> If so, we can suddenly demand up to 10% of the system memory, all of which
> may require the VM to page before it can provide said memory.

Sure, the ARC can grow up to the minimum size without restriction. Is your
ARC below the minimum size?

--matt
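
For readers following the thread, the idea described in the quoted message (recalculating zfs_dirty_data_max at the start of dmu_tx_assign() to take free memory into account) can be pictured with a toy model. This is only a sketch under stated assumptions, not Karl's patch and not ZFS code: the function name effective_dirty_data_max, the low_water_bytes and floor_bytes parameters, and the clamping formula are all hypothetical, chosen to show the shape of a free-memory-aware limit.

```python
# Toy model only: names, parameters and formula are assumptions for
# illustration, not taken from ZFS or from the patch discussed above.

MiB = 1 << 20
GiB = 1 << 30


def effective_dirty_data_max(static_max, free_bytes, low_water_bytes, floor_bytes):
    """Return a dirty data limit that shrinks as free memory shrinks.

    static_max      -- the fixed limit (stands in for zfs_dirty_data_max)
    free_bytes      -- memory currently free
    low_water_bytes -- hypothetical level at which paging would start
    floor_bytes     -- hypothetical minimum so writes can always progress
    """
    # Memory that can be dirtied before free memory reaches the low water mark.
    headroom = max(free_bytes - low_water_bytes, 0)
    # Never exceed the static limit, never drop below the floor.
    return min(static_max, max(headroom, floor_bytes))


if __name__ == "__main__":
    static_max = 4 * GiB
    low_water = 512 * MiB
    floor = 64 * MiB
    for free in (16 * GiB, 1 * GiB, 256 * MiB):
        limit = effective_dirty_data_max(static_max, free, low_water, floor)
        print(f"free={free // MiB} MiB -> dirty limit={limit // MiB} MiB")
```

With plenty of free memory the static limit applies unchanged; as free memory approaches the assumed low water mark, the limit contracts so that a write burst cannot by itself push the system into paging. Whether varying the limit this way has side effects on the write throttle is exactly the open question raised in the thread.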
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
