Martin Jambor wrote:

Hi all,

I am a member of a group that implements a filesystem that allocates
disk blocks to in-memory blocks lazily, that is, the decision is
made just before the data is actually sent to disk. Moreover, when
cached pages are modified, the data can be (and almost certainly will
be) written to a different place from where it was read.

I was wondering whether we could use the generic function
block_prepare_write at all. The function checks every buffer of the
page and, if it is not mapped, calls an fs-supplied function that is
supposed to map the buffer, i.e. assign it a block on the device and
set its mapped flag.

This is where we would like to give an error if there is not enough
free disk space left, but we cannot give a specific device block number
yet. Can we make one up, such as -1? What would that do to such dark
functions as unmap_underlying_metadata or any other? Would some other
part of the kernel break if there were a bunch of buffers assigned to
the same spot on the disk?

On the other hand, if I understand buffer flags correctly, I need to
be able to emulate mapping of buffers to set them dirty, or am I
wrong?

Thanks for any insight or thoughts,

Yes, it's possible to do what you want. I am currently working on
adding "delayed allocation" support to ext3. As part of that, we are
modifying the generic helper routines to delay allocation from prepare
time to actual writeout time (writepage).

Here is the basic idea:
=======================

The idea is to "reserve" a block at prepare/commit write time instead
of allocating it, and to do the actual allocation in writepage().
Sounds simple :)
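
The reserve-then-allocate split can be modeled in a few lines of
userspace Python. This is only a toy sketch of the accounting, not
kernel code: the class DelallocModel, its method names, and the trivial
bump allocator are all made up for illustration. What it tries to show
is that ENOSPC is reported at reserve time, while the block number is
chosen only at writeout.

```python
import errno


class DelallocModel:
    """Toy model of delayed-allocation accounting (hypothetical, not kernel code)."""

    def __init__(self, total_blocks):
        self.free_blocks = total_blocks   # blocks not yet allocated on "disk"
        self.reserved_pages = set()       # pages that were promised a block
        self.next_block = 0               # trivial bump allocator (an assumption)
        self.mapping = {}                 # page index -> allocated block number

    def reserve(self, page):
        """Prepare/commit time: promise a block, but pick no block number yet."""
        if page in self.mapping or page in self.reserved_pages:
            return  # already backed by a block or by a reservation
        if self.free_blocks - len(self.reserved_pages) < 1:
            # Out of space is reported here, long before writeout.
            raise OSError(errno.ENOSPC, "no space left to reserve a block")
        self.reserved_pages.add(page)

    def writepage(self, page):
        """Writeout time: turn the reservation into a real block number."""
        if page in self.mapping:
            return self.mapping[page]
        if page not in self.reserved_pages:
            raise RuntimeError("writepage() called without a reservation")
        self.reserved_pages.remove(page)
        self.free_blocks -= 1
        block = self.next_block
        self.next_block += 1
        self.mapping[page] = block
        return block


fs = DelallocModel(total_blocks=2)
fs.reserve(0)
fs.reserve(1)
try:
    fs.reserve(2)                         # third page: nothing left to promise
except OSError as e:
    print("reserve failed:", errno.errorcode[e.errno])
print("blocks:", fs.writepage(0), fs.writepage(1))
```

In this model no page ever needs a made-up block number such as -1
while it is dirty; the reservation count alone carries the "there will
be space" guarantee until writepage() runs.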

Here are the issues:
====================

1) Currently none of the generic helper routines can handle this.
We need to add support for it, but still somehow keep the
routines generic enough for everyone's use.

2) There is no easy way to tell reliably, in writepage(), whether we
"reserved" a block or not. There are 2 paths to writepage():

        sys_write() -> prepare/commit()
                and later sync() ----> writepage()

        mmap() -> touch a page()
                and later --> writepage()

In order to do the correct accounting, we need to mark a page
to indicate whether we reserved a block or not. One way to do this
is to use page->private. But then all the generic routines will
fail, since they assume that page->private represents
bufferheads. So we need a better way to do this.

3) We need to add hooks into filesystem-specific calls from these
generic routines to handle "journaling mode" requirements
(for ext3 and maybe others).
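
Issue 2 in particular can be illustrated with a small userspace Python
model. It is a sketch under assumptions, not the ext3 patch: it keeps
the "reserved" mark in a side table (reserved_pages, a made-up name)
rather than in page->private, and it tries to show why writepage() has
to check that mark, because a page dirtied through mmap() arrives with
no reservation behind it.

```python
import errno


class PageCacheModel:
    """Toy model of the two paths into writepage() (all names hypothetical)."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks
        self.dirty = set()            # dirty page indices
        self.reserved_pages = set()   # side table: pages holding a reservation

    def _unreserved_free(self):
        # Free blocks that have not been promised to any dirty page.
        return self.free_blocks - len(self.reserved_pages)

    def sys_write(self, page):
        """write() path: prepare/commit reserves, so ENOSPC surfaces here."""
        if page not in self.reserved_pages:
            if self._unreserved_free() < 1:
                raise OSError(errno.ENOSPC, "cannot reserve a block")
            self.reserved_pages.add(page)
        self.dirty.add(page)

    def mmap_touch(self, page):
        """mmap() path: the page becomes dirty without any reservation."""
        self.dirty.add(page)

    def writepage(self, page):
        """Writeout: consume the reservation if present, else allocate now."""
        if page in self.reserved_pages:
            self.reserved_pages.remove(page)
            how = "reserved"
        elif self._unreserved_free() >= 1:
            how = "late"              # no reservation, but space was available
        else:
            # Must not steal a block that was promised to another page.
            raise OSError(errno.ENOSPC, "no block for unreserved page")
        self.free_blocks -= 1
        self.dirty.discard(page)
        return how


pc = PageCacheModel(free_blocks=1)
pc.sys_write(0)       # reserves the only block
pc.mmap_touch(1)      # dirty, but nothing reserved
print(pc.writepage(0))
try:
    pc.writepage(1)
except OSError as e:
    print("late failure:", errno.errorcode[e.errno])
```

Without the per-page mark, writepage() could not tell these two pages
apart, and the reservation count would drift: it would either release a
reservation that was never taken or hand out a block that was promised
to another page.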

So, what are your requirements? I am looking for a common
way to combine all the requirements and come up with
saner "generic" routines to handle these.


Thanks, Badari
