Hi,

I see the discussion here quickly went beyond the scope of my first posting :-)

Anyway, the filesystem we're implementing is a variant of a classic log-structured filesystem which is quite similar to unix filesystems in many aspects (like inodes and stuff), and we will have 0-1 (sort of) transactions, so as far as this issue is concerned our case is probably very similar to ext3 delayed allocation.

On 4/19/05, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> The idea is to "reserve" a block at the prepare/commit write instead
> of allocating the block. Do the actual allocation in writepage().

Exactly.

> Here are the issues:
> ====================
>
> 1) Currently none of the generic helper routines can handle this.
> We need to add support to do these, but still somehow make the
> routines generic enough for every ones use.

I'm quite happy with most of them. I can't see how we could use any generic form of writepage(s), as we write stuff in a quite different way from almost anybody else, but all the others except block_prepare_write do pretty much exactly what we need (if I have not missed something).

> 2) There is no easy way to find out if we "reserved" a block or
> not in writepage() correctly. There are 2 paths to writepage().
>
> sys_write() -> prepare/commit()
> and later sync() ----> writepage()
>
> mmap() -> touch a page()
> and later --> writepage()
>
> In order to do the correct accounting, we need to mark a page
> to indicate if we reserved a block or not. One way to do this,
> to use page->private to indicate this. But then, all the generic
> routines will fail - since they assume that page->private represents
> bufferheads. So we need a better way to do this.

I didn't hope for a special bit in struct page, so I wanted to simply fake the page/buffer mapping somehow. Since we don't really care whether a page is mapped or reserved, as long as it is at least one of these when actually writing it (we write stuff to different places from where we have read it from), PG_mappedtodisk is fine for us as long as no other kernel code thinks that having it set means we also have buffers which point to meaningful positions on the device, because we don't. Is that the case?

Of course, having a PG_RESERVED flag would be a nice and clean thing to use and we would be more than happy to do so.
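
For illustration only, here is a rough user-space sketch of the accounting discussed above: reserve at prepare/commit time, convert the reservation into a real block in writepage, and handle the mmap path where no reservation was made. This is not kernel code. Every name in it (model_fs, model_page, BLK_RESERVED, model_commit_write and so on) is hypothetical, and the per-page state field merely stands in for whatever flag (a PG_RESERVED-like bit, or a faked mapping) ends up being used. For simplicity an already mapped page is rewritten in place, which a log-structured filesystem would not do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these are real kernel structures. */
enum blk_state { BLK_NONE, BLK_RESERVED, BLK_MAPPED };

struct model_page {
    enum blk_state state;  /* stands in for a per-page "reserved" marker */
    bool dirty;
    long disk_block;       /* meaningful only when state == BLK_MAPPED */
};

struct model_fs {
    long free_blocks;      /* neither reserved nor allocated */
    long reserved_blocks;  /* promised to dirty pages, not yet placed */
    long next_block;       /* next block number to hand out */
};

/* Reserve one block for the page unless it is already backed.
 * Returns 0 on success, -1 when no space is left. */
static int model_reserve(struct model_fs *fs, struct model_page *pg)
{
    if (pg->state != BLK_NONE)
        return 0;
    if (fs->free_blocks == 0)
        return -1;
    fs->free_blocks--;
    fs->reserved_blocks++;
    pg->state = BLK_RESERVED;
    return 0;
}

/* sys_write() path: prepare/commit reserves space, then dirties the page. */
static int model_commit_write(struct model_fs *fs, struct model_page *pg)
{
    if (model_reserve(fs, pg) != 0)
        return -1;
    pg->dirty = true;
    return 0;
}

/* mmap() path: the page is dirtied without any reservation being made. */
static void model_mmap_touch(struct model_page *pg)
{
    pg->dirty = true;
}

/* writepage: do the actual allocation, using the page state to decide
 * whether a reservation has to be consumed or fresh space taken. */
static int model_writepage(struct model_fs *fs, struct model_page *pg)
{
    if (!pg->dirty)
        return 0;

    switch (pg->state) {
    case BLK_RESERVED:
        assert(fs->reserved_blocks > 0);
        fs->reserved_blocks--;          /* reservation becomes allocation */
        pg->disk_block = fs->next_block++;
        break;
    case BLK_NONE:                      /* came in via mmap, nothing reserved */
        if (fs->free_blocks == 0)
            return -1;                  /* page stays dirty and unbacked */
        fs->free_blocks--;
        pg->disk_block = fs->next_block++;
        break;
    case BLK_MAPPED:                    /* already placed; rewritten in place */
        break;
    }
    pg->state = BLK_MAPPED;
    pg->dirty = false;
    return 0;
}
```

The point of the sketch is that model_writepage() can only keep free_blocks and reserved_blocks consistent because the page itself records whether a reservation exists; without that marker the two paths into writepage are indistinguishable.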

> 3) We need add hooks into filesystem specific calls from these
> generic routines to handle "journaling mode" requirements

Our fs is basically one big journal so we don't need any of these. Or at least I don't see any need for it at the moment.

> So, what are your requirements ? I am looking for a common
> way to combine all the requirements and come out with a
> saner "generic" routines to handle these.

I'm happy with most generic functions. We need to implement writepage(s) ourselves no matter what. The only problem is block_prepare_write, and I can currently see only two options for us:

1) Implement it ourselves and use a flag in the struct page to mark it reserved.

2) Use block_prepare_write but enable the get_block function to mark an individual buffer as reserved, so that it is treated as mapped (can be dirty and stuff) but no code assumes it is located somewhere on the disk (for example, block_prepare_write would not call unmap_underlying_metadata).

I think we'll go for the first method, but the second would make life easier for filesystems which can have pages consisting of both mapped and reserved blocks.

Thank you very much for your reply, the whole thread has been well worth reading.

Martin Jambor