On Mon, 1 Nov 1999 [EMAIL PROTECTED] wrote:

> XFS on Irix caches file data in buffers, but not in the regular buffer cache.
> They are cached off the vnode and organized by logical file offset rather
> than by disk block number. The memory in these buffers comes from the page
> subsystem, the page tag being the vnode and file offset. These buffers do
> not have to have a physical disk block associated with them: XFS allows you
> to reserve blocks on the disk for a file without picking which blocks. At
> some point when the data needs to be written (memory pressure, sync
> activity, etc.), the filesystem is asked to allocate physical blocks for the
> data; these are associated with the buffers and they get written out.
> Delaying the allocation allows us to collect together multiple small writes
> into one big allocation request. It also means that we can bypass allocation
> altogether if the file is truncated before it is flushed to disk.

the new 2.3 pagecache should enable this almost out of the box. Apart from
memory pressure issues, the missing bit is to split up fs->get_block()
into a 'soft' and a 'real' allocation branch. This means that whenever the
pagecache creates a new dirty page, it calls the 'soft' get_block()
variant, which is very fast and just bumps up some counters within XFS (so
we do not get asynchronous out-of-space conditions). Then whenever
ll_rw_block() (or bdflush) sees a !buffer_mapped() but buffer_allocated()
block, it will call the 'real' low-level handler to do the actual
allocation.
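
a rough userspace sketch of the soft/real split described above. This is
not kernel code: the toy_* names, the struct layouts and the sequential
block allocator are all assumptions made for illustration, and the
'allocated' flag merely stands in for the buffer_allocated() state
mentioned in the text.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
struct toy_fs {
    long free_blocks;     /* blocks not yet physically allocated */
    long reserved_blocks; /* blocks promised to dirty pages, location unknown */
    long next_block;      /* trivial allocator: hand out blocks sequentially */
};

struct toy_buffer {
    bool allocated; /* space is reserved (stands in for buffer_allocated()) */
    bool mapped;    /* a physical block is assigned (buffer_mapped()) */
    long blocknr;   /* valid only when mapped is true */
};

/* 'soft' branch: only bump counters, and report a full disk right away. */
int toy_get_block_soft(struct toy_fs *fs, struct toy_buffer *bh)
{
    if (bh->allocated || bh->mapped)
        return 0;                     /* nothing left to reserve */
    if (fs->reserved_blocks >= fs->free_blocks)
        return -ENOSPC;               /* synchronous out-of-space error */
    fs->reserved_blocks++;
    bh->allocated = true;
    return 0;
}

/* 'real' branch: pick a physical block for a reserved, unmapped buffer. */
int toy_get_block_real(struct toy_fs *fs, struct toy_buffer *bh)
{
    if (bh->mapped)
        return 0;
    if (!bh->allocated)
        return -EINVAL;               /* no reservation was ever made */
    fs->reserved_blocks--;            /* the reservation becomes a real block */
    fs->free_blocks--;
    bh->blocknr = fs->next_block++;
    bh->mapped = true;
    return 0;
}

/* Stand-in for the check ll_rw_block() or bdflush would do at write-out. */
int toy_writeout(struct toy_fs *fs, struct toy_buffer *bh)
{
    if (!bh->mapped && bh->allocated)
        return toy_get_block_real(fs, bh);
    return bh->mapped ? 0 : -EINVAL;
}
```

in this model a write that would overcommit the disk fails in
toy_get_block_soft(), i.e. at the time the page is dirtied, so the later
toy_writeout() call never runs out of space for a buffer that was reserved.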

i kept this in mind all along when doing the pagecache changes, and i
intend to do this for ext2fs. Splitting up get_block() is easy to do without
breaking filesystems: the last 'create' parameter can be set to '2' to mean
'lazy create'.
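
one possible shape for such a three-valued 'create' argument, sketched as
plain userspace C. The toy_* names, the TOY_HOLE/TOY_RESERVED markers and
the fixed-size block map are assumptions for illustration; only the idea
that 0 means lookup, 1 means create and 2 means 'lazy create' comes from
the text above.

```c
#include <errno.h>

/* Values of the last argument; 2 is the proposed 'lazy create'. */
enum { TOY_LOOKUP = 0, TOY_CREATE = 1, TOY_LAZY_CREATE = 2 };

#define TOY_NBLOCKS  8
#define TOY_HOLE     (-1L) /* no block and no reservation */
#define TOY_RESERVED (-2L) /* space reserved, location not chosen yet */

/* Hypothetical inode with a tiny logical-to-physical block map. */
struct toy_inode {
    long map[TOY_NBLOCKS];
    long reserved;  /* number of outstanding lazy reservations */
    long next_free; /* trivial sequential allocator */
};

/* Single entry point; callers passing 0 or 1 see the old behaviour. */
int toy_get_block(struct toy_inode *ino, long iblock, long *phys, int create)
{
    long cur;

    if (iblock < 0 || iblock >= TOY_NBLOCKS)
        return -EINVAL;

    cur = ino->map[iblock];
    if (cur >= 0) {                  /* already mapped: nothing to do */
        *phys = cur;
        return 0;
    }

    switch (create) {
    case TOY_LOOKUP:                 /* report the hole or the reservation */
        *phys = cur;
        return 0;
    case TOY_LAZY_CREATE:            /* reserve space, choose no location */
        if (cur == TOY_HOLE) {
            ino->map[iblock] = TOY_RESERVED;
            ino->reserved++;
        }
        *phys = TOY_RESERVED;
        return 0;
    case TOY_CREATE:                 /* allocate for real */
        if (cur == TOY_RESERVED)
            ino->reserved--;
        ino->map[iblock] = ino->next_free++;
        *phys = ino->map[iblock];
        return 0;
    default:
        return -EINVAL;
    }
}
```

a filesystem that never sees create == 2 from its callers would not need
to change under this scheme, which is the compatibility point made above.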

note that not all filesystems can know in advance how much space a new
inode block will take, but this is not a problem: the lazy allocator can
safely 'overestimate' space needs.
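
the overestimation idea can be sketched as follows, again as a userspace
toy. The worst-case bound of 3 metadata blocks per data block and the
toy_* names are assumptions chosen for the example, not properties of any
real filesystem.

```c
#include <errno.h>

/* Assumed upper bound on metadata blocks needed per new data block. */
#define TOY_WORST_CASE_META 3

struct toy_space {
    long free_blocks;     /* blocks not yet physically allocated */
    long reserved_blocks; /* worst-case reservations still outstanding */
};

/* Lazy step: the real cost is unknown, so reserve the worst case. */
int toy_reserve(struct toy_space *sp)
{
    long want = 1 + TOY_WORST_CASE_META;

    if (sp->free_blocks - sp->reserved_blocks < want)
        return -ENOSPC;
    sp->reserved_blocks += want;
    return 0;
}

/* Real step: charge what was actually used and drop the surplus. */
int toy_commit(struct toy_space *sp, long meta_used)
{
    if (meta_used < 0 || meta_used > TOY_WORST_CASE_META)
        return -EINVAL;
    sp->reserved_blocks -= 1 + TOY_WORST_CASE_META; /* release the estimate */
    sp->free_blocks -= 1 + meta_used;               /* charge the real cost */
    return 0;
}
```

in this model the price of overestimating is that toy_reserve() may refuse
a write that would in fact have fit, but a reservation that succeeded can
always be committed.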

is this the kind of interface you need for XFS? i can make a prototype
patch for ext2fs (and the pagecache & bdflush), which should be easy to
adapt for XFS.

-- mingo
