"Stephen C. Tweedie" wrote:

> Hi,
>
> On Tue, 12 Oct 1999 03:14:03 +0400, Hans Reiser <[EMAIL PROTECTED]> said:
>
> >> Hans, you didn't mention a journal call that happens on sync, or
> >> sync_old_buffers...
>
> > I see two issues: how to respond to memory pressure, and how to sync.
> > I'll let you articulate our sync needs.
>
> There are actually two separate memory pressure concerns.  The first is
> how to clear out some dirty, pinned buffers when we need to free up some
> memory, and try_to_free_buffers/bdflush are the main mechanisms involved
> right now.
>
> With journaling, however, we have a new problem.  We can have large
> amounts of dirty data pinned in memory, but we cannot actually write
> that data to disk without first allocating more memory.

Trivia: I don't think this is a feature of journaling, but rather a feature of a
particular implementation of journaling.  Chris will correct me if I err, but
Chris's journaling doesn't have this property.

Let us define a buffer's state as FLUSHTIME_NON_EXPANDING if flushing it
requires no additional memory, and FLUSHTIME_EXPANDING otherwise.

I see the following separate issues:

How to drive a kernel subsystem to flush some memory.  I advocate that the VM
system push, and that the subsystems give it calls for doing the pushing.

How to ensure that there are at least largest_reservation buffers of
FLUSHTIME_NON_EXPANDING memory at all times, where largest_reservation is the
sum of the maximum amounts that each kernel subsystem says it might need.  There
would be a reserve() and an unreserve() for the kernel subsystems to call.
I hypothesize that if largest_reservation is larger than necessary, then so long
as it is not completely obscene, performance will not suffer (and might gain),
and the code will be simpler and faster for reserving the maximum that could
possibly be needed rather than tracking the amount actually needed.

The interface for syncing commits.
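
To make the first two issues concrete, here is a minimal user-space sketch in C.
It is an illustration under stated assumptions, not code from reiserfs, ext3, or
the kernel.  Only reserve(), unreserve(), largest_reservation, and the two
FLUSHTIME_ states come from the text above; struct reserve_pool, the fixed pool
size of 1024 buffers, mark_expanding(), mark_non_expanding(),
register_flusher(), vm_push(), and example_flush() are hypothetical names
invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Buffer classification as defined in the message above. */
enum flushtime_state {
    FLUSHTIME_NON_EXPANDING, /* flushing it requires no additional memory */
    FLUSHTIME_EXPANDING      /* flushing it must allocate memory first */
};

/* Hypothetical accounting record; not an existing kernel structure. */
struct reserve_pool {
    size_t total_buffers;       /* all buffers in this toy pool (assumed) */
    size_t expanding_buffers;   /* buffers currently FLUSHTIME_EXPANDING */
    size_t largest_reservation; /* sum of every subsystem's declared maximum */
};

struct reserve_pool pool = { 1024, 0, 0 };

static size_t non_expanding_buffers(void)
{
    return pool.total_buffers - pool.expanding_buffers;
}

/* A subsystem declares the most it might ever need.  Returns 0 on success,
 * -1 if the pool could not guarantee that much non-expanding memory. */
int reserve(size_t max_needed)
{
    if (max_needed > non_expanding_buffers() - pool.largest_reservation)
        return -1;
    pool.largest_reservation += max_needed;
    return 0;
}

void unreserve(size_t max_needed)
{
    assert(max_needed <= pool.largest_reservation);
    pool.largest_reservation -= max_needed;
}

/* Refuse to let n more buffers become FLUSHTIME_EXPANDING if that would
 * leave fewer than largest_reservation non-expanding buffers. */
int mark_expanding(size_t n)
{
    size_t avail = non_expanding_buffers();

    if (n > avail || avail - n < pool.largest_reservation)
        return -1;
    pool.expanding_buffers += n;
    return 0;
}

void mark_non_expanding(size_t n)
{
    assert(n <= pool.expanding_buffers);
    pool.expanding_buffers -= n;
}

/* "The VM pushes": each subsystem hands the VM a call for doing the pushing.
 * The callback returns how many buffers it made non-expanding. */
typedef size_t (*flush_fn)(size_t nr_wanted);

#define MAX_FLUSHERS 8
static flush_fn flushers[MAX_FLUSHERS];
static size_t nr_flushers;

int register_flusher(flush_fn fn)
{
    if (nr_flushers == MAX_FLUSHERS)
        return -1;
    flushers[nr_flushers++] = fn;
    return 0;
}

/* Ask the registered subsystems, in order, until enough has been flushed. */
size_t vm_push(size_t nr_wanted)
{
    size_t i, done = 0;

    for (i = 0; i < nr_flushers && done < nr_wanted; i++)
        done += flushers[i](nr_wanted - done);
    return done;
}

/* Hypothetical callback standing in for a filesystem's flush routine. */
size_t example_flush(size_t nr_wanted)
{
    size_t n = nr_wanted < pool.expanding_buffers
             ? nr_wanted : pool.expanding_buffers;

    mark_non_expanding(n);
    return n;
}
```

In this sketch a subsystem calls reserve() once with its worst case, and
mark_expanding() fails rather than eat into the reserve, which is the point at
which a caller would invoke vm_push().  Tracking only the declared maximum keeps
the check to a single comparison, which is the simplicity argued for above.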



>  I'm not sure
> about the reiserfs case, but in ext3 I certainly need to allocate
> buffers to describe control blocks in the journal, for example.
>
> This introduces a second memory pressure requirement: we must always
> restrict the amount of unrecoverable dirty pinned memory so that when we
> want to reclaim that memory, we have enough unpinned pages left to
> complete the commit operation.
>
> This came up in discussions with the XFS people [hence the linux-fsdevel
> cross post]: it matters to many filesystems.  In XFS it is a
> substantially more significant problem, because they are performing
> delayed allocation of written data and so they potentially need a lot
> more space in core for metadata updates before the data can be flushed
> to disk.
>
> This is much less of a problem for ext3 and will also probably not
> matter too much for reiserfs until you decide to move to lazy block
> allocation.

We will indeed move to flushtime block allocation.

>  However, a common mechanism for dealing with this would
> definitely let all three filesystems survive just that bit better under
> really serious memory pressure.
>
> --Stephen

For reiserfs, it would simplify our balancing code (fix_nodes() in particular)
and improve our performance if we could efficiently reserve.  Roma, think about
this.

Hans

--
Get Linux (http://www.kernel.org) plus ReiserFS
 (http://devlinux.org/namesys).  If you sell an OS or
internet appliance, buy a port of ReiserFS!  If you
need customizations and industrial grade support, we sell them.

