Re: [PATCH 3/3] AFS: Implement basic file write support
David Howells wrote: Nick Piggin <[EMAIL PROTECTED]> wrote: Why do you call SetPageUptodate when the page is not up to date? That leaks uninitialised data, AFAIKS. It only seems that way. If afs_prepare_write() is called, but doesn't return an error, then afs_commit_write() will be called, and it seems that the copy in of the data will be guaranteed not to fail by the caller. Not only does it seem that way, it is that way :) PG_uptodate is being set when the page is not uptodate, isn't it? Furthermore, afs_prepare_page() will have filled in the missing bits. And whilst all that is going on, the page lock will be help by the caller, so that no-one else can access the partially complete page. When a page is uptodate in pagecache, the generic read and nopage functions do not take the page lock. So how are you excluding those? -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
David Howells wrote: Nick Piggin [EMAIL PROTECTED] wrote: Why do you call SetPageUptodate when the page is not up to date? That leaks uninitialised data, AFAIKS. It only seems that way. If afs_prepare_write() is called, but doesn't return an error, then afs_commit_write() will be called, and it seems that the copy in of the data will be guaranteed not to fail by the caller. Not only does it seem that way, it is that way :) PG_uptodate is being set when the page is not uptodate, isn't it? Furthermore, afs_prepare_page() will have filled in the missing bits. And whilst all that is going on, the page lock will be help by the caller, so that no-one else can access the partially complete page. When a page is uptodate in pagecache, the generic read and nopage functions do not take the page lock. So how are you excluding those? -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Nick Piggin <[EMAIL PROTECTED]> wrote: > Why do you call SetPageUptodate when the page is not up to date? > That leaks uninitialised data, AFAIKS. It only seems that way. If afs_prepare_write() is called, but doesn't return an error, then afs_commit_write() will be called, and it seems that the copy in of the data will be guaranteed not to fail by the caller. Furthermore, afs_prepare_page() will have filled in the missing bits. And whilst all that is going on, the page lock will be help by the caller, so that no-one else can access the partially complete page. I suppose I could call SetPageUptodate() in afs_commit_write() instead. David - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Nick Piggin [EMAIL PROTECTED] wrote: Why do you call SetPageUptodate when the page is not up to date? That leaks uninitialised data, AFAIKS. It only seems that way. If afs_prepare_write() is called, but doesn't return an error, then afs_commit_write() will be called, and it seems that the copy in of the data will be guaranteed not to fail by the caller. Furthermore, afs_prepare_page() will have filled in the missing bits. And whilst all that is going on, the page lock will be help by the caller, so that no-one else can access the partially complete page. I suppose I could call SetPageUptodate() in afs_commit_write() instead. David - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
David Howells wrote: +/* + * prepare a page for being written to + */ +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, + struct key *key, unsigned offset, unsigned to) +{ + unsigned eof, tail, start, stop, len; + loff_t i_size, pos; + void *p; + int ret; + + _enter(""); + + if (offset == 0 && to == PAGE_SIZE) + return 0; + + p = kmap(page); + + i_size = i_size_read(>vfs_inode); + pos = (loff_t) page->index << PAGE_SHIFT; + if (pos >= i_size) { + /* partial write, page beyond EOF */ + _debug("beyond"); + if (offset > 0) + memset(p, 0, offset); + if (to < PAGE_SIZE) + memset(p + to, 0, PAGE_SIZE - to); + kunmap(page); + return 0; + } + + if (i_size - pos >= PAGE_SIZE) { + /* partial write, page entirely before EOF */ + _debug("before"); + tail = eof = PAGE_SIZE; + } else { + /* partial write, page overlaps EOF */ + eof = i_size - pos; + _debug("overlap %u", eof); + tail = max(eof, to); + if (tail < PAGE_SIZE) + memset(p + tail, 0, PAGE_SIZE - tail); + if (offset > eof) + memset(p + eof, 0, PAGE_SIZE - eof); + } + + kunmap(p); + + ret = 0; + if (offset > 0 || eof > to) { + /* need to fill one or two bits that aren't going to be written +* (cover both fillers in one read if there are two) */ + start = (offset > 0) ? 0 : to; + stop = (eof > to) ? eof : offset; + len = stop - start; + _debug("wr=%u-%u av=0-%u [EMAIL PROTECTED]", + offset, to, eof, start, len); + ret = afs_fill_page(vnode, key, start, len, page); + } + + _leave(" = %d", ret); + return ret; +} + +/* + * prepare to perform part of a write to a page + * - the caller holds the page locked, preventing it from being written out or + * modified by anyone else + */ +int afs_prepare_write(struct file *file, struct page *page, + unsigned offset, unsigned to) +{ + struct afs_writeback *candidate, *wb; + struct afs_vnode *vnode = AFS_FS_I(file->f_dentry->d_inode); + struct key *key = file->private_data; + pgoff_t index; + int ret; + + _enter("{%x:%u},{%lx},%u,%u", + vnode->fid.vid, vnode->fid.vnode, page->index, offset, to); + + candidate = kzalloc(sizeof(*candidate), GFP_KERNEL); + if (!candidate) + return -ENOMEM; + candidate->vnode = vnode; + candidate->first = candidate->last = page->index; + candidate->offset_first = offset; + candidate->to_last = to; + candidate->usage = 1; + candidate->state = AFS_WBACK_PENDING; + init_waitqueue_head(>waitq); + + if (!PageUptodate(page)) { + _debug("not up to date"); + ret = afs_prepare_page(vnode, page, key, offset, to); + if (ret < 0) { + kfree(candidate); + _leave(" = %d [prep]", ret); + return ret; + } + SetPageUptodate(page); + } Why do you call SetPageUptodate when the page is not up to date? That leaks uninitialised data, AFAIKS. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
On Wed, 09 May 2007 12:07:39 +0100 David Howells <[EMAIL PROTECTED]> wrote: > Andrew Morton <[EMAIL PROTECTED]> wrote: > > > set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty > > pagecache data. > > > > To tell the VFS that the inode itself is dirty one needs to run > > mark_inode_dirty(). > > But what's the difference in this case? I don't need to write the inode back > per se, and the inode attributes can be updated by the mechanism of data > storage. > Ah. Well if you don't need to write the inode back then sure, there shouldn't be a need to mark it dirty. That's what I was asking ;) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Andrew Morton <[EMAIL PROTECTED]> wrote: > set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty > pagecache data. > > To tell the VFS that the inode itself is dirty one needs to run > mark_inode_dirty(). But what's the difference in this case? I don't need to write the inode back per se, and the inode attributes can be updated by the mechanism of data storage. David - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
On Wed, 09 May 2007 11:25:47 +0100 David Howells <[EMAIL PROTECTED]> wrote: > > > + set_page_dirty(page); > > > + > > > + if (PageDirty(page)) > > > + _debug("dirtied"); > > > + > > > + return 0; > > > +} > > > > One would normally run mark_inode_dirty() after any i_size_write()? > > Not in this case, I assume, because set_page_dirty() ultimately calls > __mark_inode_dirty(), but I could be wrong. set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty pagecache data. To tell the VFS that the inode itself is dirty one needs to run mark_inode_dirty(). Or maybe mark_inode_dirty_sync() but I can never for the life of me remember what that thing does. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Andrew Morton <[EMAIL PROTECTED]> wrote: > > + BUG_ON(i_size > 0x); // TODO: use 64-bit store > > You're sure this isn't user-triggerable? Hmmm... I'm not. I'll whip up a patch for this. > kmap_atomic() could be used here and is better. Yeah. It used to have something that slept in the middle of it, but that's no longer there. I'll add to the patch. > We have this zero_user_page() thing heading in which could perhaps be used > here also. Okay. I'll have a look at it once it's there. > > + ASSERTRANGE(wb->first, <=, index, <=, wb->last); > > wow. :-) The assertions I've put in have been very useful. > > + set_page_dirty(page); > > + > > + if (PageDirty(page)) > > + _debug("dirtied"); > > + > > + return 0; > > +} > > One would normally run mark_inode_dirty() after any i_size_write()? Not in this case, I assume, because set_page_dirty() ultimately calls __mark_inode_dirty(), but I could be wrong. > We can invalidate pages and we can truncate them and we can clean them. > But here we have a new operation, "killing". I wonder what that is. I can call it invalidation if you like, though that name is already reserved as it were:-/ I suppose it might actually make sense for me to call invalidatepage() myself. > > + if (wbc->sync_mode != WB_SYNC_NONE) > > + wait_on_page_writeback(page); > > Didn't the VFS already do that? I'm not entirely sure. Looking at generic_writepages(), I guess so. I'll patch it out. > > + if (PageWriteback(page) || !PageDirty(page)) { > > + unlock_page(page); > > + return 0; > > + } > > And some of that? Yeah. Seems so. I'll patch that out too. What I'd like to do is ditch writepage() entirely - I'm not sure it's entirely necessary with the availability of writepages() - but I'll look at that another time. > I have this vague prehistoric memory that something can go wrong at the VFS > level if the address_space writes back more pages than it was asked to. > But I forget what the issue was and it would be silly to have an issue > with that anyway. Something to keep an eye out for. Okay. Thanks for the 'cherry-pick'. I'll hopefully have a revision patch for you soon. David - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Andrew Morton [EMAIL PROTECTED] wrote: + BUG_ON(i_size 0x); // TODO: use 64-bit store You're sure this isn't user-triggerable? Hmmm... I'm not. I'll whip up a patch for this. kmap_atomic() could be used here and is better. Yeah. It used to have something that slept in the middle of it, but that's no longer there. I'll add to the patch. We have this zero_user_page() thing heading in which could perhaps be used here also. Okay. I'll have a look at it once it's there. + ASSERTRANGE(wb-first, =, index, =, wb-last); wow. :-) The assertions I've put in have been very useful. + set_page_dirty(page); + + if (PageDirty(page)) + _debug(dirtied); + + return 0; +} One would normally run mark_inode_dirty() after any i_size_write()? Not in this case, I assume, because set_page_dirty() ultimately calls __mark_inode_dirty(), but I could be wrong. We can invalidate pages and we can truncate them and we can clean them. But here we have a new operation, killing. I wonder what that is. I can call it invalidation if you like, though that name is already reserved as it were:-/ I suppose it might actually make sense for me to call invalidatepage() myself. + if (wbc-sync_mode != WB_SYNC_NONE) + wait_on_page_writeback(page); Didn't the VFS already do that? I'm not entirely sure. Looking at generic_writepages(), I guess so. I'll patch it out. + if (PageWriteback(page) || !PageDirty(page)) { + unlock_page(page); + return 0; + } And some of that? Yeah. Seems so. I'll patch that out too. What I'd like to do is ditch writepage() entirely - I'm not sure it's entirely necessary with the availability of writepages() - but I'll look at that another time. I have this vague prehistoric memory that something can go wrong at the VFS level if the address_space writes back more pages than it was asked to. But I forget what the issue was and it would be silly to have an issue with that anyway. Something to keep an eye out for. Okay. Thanks for the 'cherry-pick'. I'll hopefully have a revision patch for you soon. David - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
On Wed, 09 May 2007 11:25:47 +0100 David Howells [EMAIL PROTECTED] wrote: + set_page_dirty(page); + + if (PageDirty(page)) + _debug(dirtied); + + return 0; +} One would normally run mark_inode_dirty() after any i_size_write()? Not in this case, I assume, because set_page_dirty() ultimately calls __mark_inode_dirty(), but I could be wrong. set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty pagecache data. To tell the VFS that the inode itself is dirty one needs to run mark_inode_dirty(). Or maybe mark_inode_dirty_sync() but I can never for the life of me remember what that thing does. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
Andrew Morton [EMAIL PROTECTED] wrote: set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty pagecache data. To tell the VFS that the inode itself is dirty one needs to run mark_inode_dirty(). But what's the difference in this case? I don't need to write the inode back per se, and the inode attributes can be updated by the mechanism of data storage. David - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
On Wed, 09 May 2007 12:07:39 +0100 David Howells [EMAIL PROTECTED] wrote: Andrew Morton [EMAIL PROTECTED] wrote: set_page_dirty() will set I_DIRTY_PAGES only. ie: the inode has dirty pagecache data. To tell the VFS that the inode itself is dirty one needs to run mark_inode_dirty(). But what's the difference in this case? I don't need to write the inode back per se, and the inode attributes can be updated by the mechanism of data storage. Ah. Well if you don't need to write the inode back then sure, there shouldn't be a need to mark it dirty. That's what I was asking ;) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
David Howells wrote: +/* + * prepare a page for being written to + */ +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, + struct key *key, unsigned offset, unsigned to) +{ + unsigned eof, tail, start, stop, len; + loff_t i_size, pos; + void *p; + int ret; + + _enter(); + + if (offset == 0 to == PAGE_SIZE) + return 0; + + p = kmap(page); + + i_size = i_size_read(vnode-vfs_inode); + pos = (loff_t) page-index PAGE_SHIFT; + if (pos = i_size) { + /* partial write, page beyond EOF */ + _debug(beyond); + if (offset 0) + memset(p, 0, offset); + if (to PAGE_SIZE) + memset(p + to, 0, PAGE_SIZE - to); + kunmap(page); + return 0; + } + + if (i_size - pos = PAGE_SIZE) { + /* partial write, page entirely before EOF */ + _debug(before); + tail = eof = PAGE_SIZE; + } else { + /* partial write, page overlaps EOF */ + eof = i_size - pos; + _debug(overlap %u, eof); + tail = max(eof, to); + if (tail PAGE_SIZE) + memset(p + tail, 0, PAGE_SIZE - tail); + if (offset eof) + memset(p + eof, 0, PAGE_SIZE - eof); + } + + kunmap(p); + + ret = 0; + if (offset 0 || eof to) { + /* need to fill one or two bits that aren't going to be written +* (cover both fillers in one read if there are two) */ + start = (offset 0) ? 0 : to; + stop = (eof to) ? eof : offset; + len = stop - start; + _debug(wr=%u-%u av=0-%u [EMAIL PROTECTED], + offset, to, eof, start, len); + ret = afs_fill_page(vnode, key, start, len, page); + } + + _leave( = %d, ret); + return ret; +} + +/* + * prepare to perform part of a write to a page + * - the caller holds the page locked, preventing it from being written out or + * modified by anyone else + */ +int afs_prepare_write(struct file *file, struct page *page, + unsigned offset, unsigned to) +{ + struct afs_writeback *candidate, *wb; + struct afs_vnode *vnode = AFS_FS_I(file-f_dentry-d_inode); + struct key *key = file-private_data; + pgoff_t index; + int ret; + + _enter({%x:%u},{%lx},%u,%u, + vnode-fid.vid, vnode-fid.vnode, page-index, offset, to); + + candidate = kzalloc(sizeof(*candidate), GFP_KERNEL); + if (!candidate) + return -ENOMEM; + candidate-vnode = vnode; + candidate-first = candidate-last = page-index; + candidate-offset_first = offset; + candidate-to_last = to; + candidate-usage = 1; + candidate-state = AFS_WBACK_PENDING; + init_waitqueue_head(candidate-waitq); + + if (!PageUptodate(page)) { + _debug(not up to date); + ret = afs_prepare_page(vnode, page, key, offset, to); + if (ret 0) { + kfree(candidate); + _leave( = %d [prep], ret); + return ret; + } + SetPageUptodate(page); + } Why do you call SetPageUptodate when the page is not up to date? That leaks uninitialised data, AFAIKS. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] AFS: Implement basic file write support
On Tue, 08 May 2007 20:44:11 +0100 David Howells <[EMAIL PROTECTED]> wrote: > Implement support for writing to regular AFS files, including: > > (1) write > > (2) truncate > > (3) fsync, fdatasync > > (4) chmod, chown, chgrp, utime. > > AFS writeback attempts to batch writes into as chunks as large as it can > manage > up to the point that it writes back 65535 pages in one chunk or it meets a > locked page. > > Furthermore, if a page has been written to using a particular key, then should > another write to that page use some other key, the first write will be flushed > before the second is allowed to take place. If the first write fails due to a > security error, then the page will be scrapped and reread before the second > write takes place. > > If a page is dirty and the callback on it is broken by the server, then the > dirty data is not discarded (same behaviour as NFS). > > Shared-writable mappings are not supported by this patch. The below isn't a review - it's some random cherrypickling. > ... > > +int afs_fs_store_data(struct afs_server *server, struct afs_writeback *wb, > + pgoff_t first, pgoff_t last, > + unsigned offset, unsigned to, > + const struct afs_wait_mode *wait_mode) > +{ > + struct afs_vnode *vnode = wb->vnode; > + struct afs_call *call; > + loff_t size, pos, i_size; > + __be32 *bp; > + > + _enter(",%x,{%x:%u},,", > +key_serial(wb->key), vnode->fid.vid, vnode->fid.vnode); > + > + size = to - offset; > + if (first != last) > + size += (loff_t)(last - first) << PAGE_SHIFT; > + pos = (loff_t)first << PAGE_SHIFT; > + pos += offset; > + > + i_size = i_size_read(>vfs_inode); > + if (pos + size > i_size) > + i_size = size + pos; > + > + _debug("size %llx, at %llx, i_size %llx", > +(unsigned long long) size, (unsigned long long) pos, > +(unsigned long long) i_size); > + > + BUG_ON(i_size > 0x); // TODO: use 64-bit store You're sure this isn't user-triggerable? > +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, > + struct key *key, unsigned offset, unsigned to) > +{ > + unsigned eof, tail, start, stop, len; > + loff_t i_size, pos; > + void *p; > + int ret; > + > + _enter(""); > + > + if (offset == 0 && to == PAGE_SIZE) > + return 0; > + > + p = kmap(page); > + > + i_size = i_size_read(>vfs_inode); > + pos = (loff_t) page->index << PAGE_SHIFT; > + if (pos >= i_size) { > + /* partial write, page beyond EOF */ > + _debug("beyond"); > + if (offset > 0) > + memset(p, 0, offset); > + if (to < PAGE_SIZE) > + memset(p + to, 0, PAGE_SIZE - to); > + kunmap(page); > + return 0; > + } > + > + if (i_size - pos >= PAGE_SIZE) { > + /* partial write, page entirely before EOF */ > + _debug("before"); > + tail = eof = PAGE_SIZE; > + } else { > + /* partial write, page overlaps EOF */ > + eof = i_size - pos; > + _debug("overlap %u", eof); > + tail = max(eof, to); > + if (tail < PAGE_SIZE) > + memset(p + tail, 0, PAGE_SIZE - tail); > + if (offset > eof) > + memset(p + eof, 0, PAGE_SIZE - eof); > + } > + > + kunmap(p); kmap_atomic() could be used here and is better. We have this zero_user_page() thing heading in which could perhaps be used here also. > + ret = 0; > + if (offset > 0 || eof > to) { > + /* need to fill one or two bits that aren't going to be written > + * (cover both fillers in one read if there are two) */ > + start = (offset > 0) ? 0 : to; > + stop = (eof > to) ? eof : offset; > + len = stop - start; > + _debug("wr=%u-%u av=0-%u [EMAIL PROTECTED]", > +offset, to, eof, start, len); > + ret = afs_fill_page(vnode, key, start, len, page); > + } > + > + _leave(" = %d", ret); > + return ret; > +} > + > > ... > + ASSERTRANGE(wb->first, <=, index, <=, wb->last); wow. > +} > + > +/* > + * finalise part of a write to a page > + */ > +int afs_commit_write(struct file *file, struct page *page, > + unsigned offset, unsigned to) > +{ > + struct afs_vnode *vnode = AFS_FS_I(file->f_dentry->d_inode); > + loff_t i_size, maybe_i_size; > + > + _enter("{%x:%u},{%lx},%u,%u", > +vnode->fid.vid, vnode->fid.vnode, page->index, offset, to); > + > + maybe_i_size = (loff_t) page->index << PAGE_SHIFT; > + maybe_i_size += to; > + > + i_size = i_size_read(>vfs_inode); > + if (maybe_i_size > i_size) { > + spin_lock(>writeback_lock); > + i_size =
Re: [PATCH 3/3] AFS: Implement basic file write support
On Tue, 08 May 2007 20:44:11 +0100 David Howells [EMAIL PROTECTED] wrote: Implement support for writing to regular AFS files, including: (1) write (2) truncate (3) fsync, fdatasync (4) chmod, chown, chgrp, utime. AFS writeback attempts to batch writes into as chunks as large as it can manage up to the point that it writes back 65535 pages in one chunk or it meets a locked page. Furthermore, if a page has been written to using a particular key, then should another write to that page use some other key, the first write will be flushed before the second is allowed to take place. If the first write fails due to a security error, then the page will be scrapped and reread before the second write takes place. If a page is dirty and the callback on it is broken by the server, then the dirty data is not discarded (same behaviour as NFS). Shared-writable mappings are not supported by this patch. The below isn't a review - it's some random cherrypickling. ... +int afs_fs_store_data(struct afs_server *server, struct afs_writeback *wb, + pgoff_t first, pgoff_t last, + unsigned offset, unsigned to, + const struct afs_wait_mode *wait_mode) +{ + struct afs_vnode *vnode = wb-vnode; + struct afs_call *call; + loff_t size, pos, i_size; + __be32 *bp; + + _enter(,%x,{%x:%u},,, +key_serial(wb-key), vnode-fid.vid, vnode-fid.vnode); + + size = to - offset; + if (first != last) + size += (loff_t)(last - first) PAGE_SHIFT; + pos = (loff_t)first PAGE_SHIFT; + pos += offset; + + i_size = i_size_read(vnode-vfs_inode); + if (pos + size i_size) + i_size = size + pos; + + _debug(size %llx, at %llx, i_size %llx, +(unsigned long long) size, (unsigned long long) pos, +(unsigned long long) i_size); + + BUG_ON(i_size 0x); // TODO: use 64-bit store You're sure this isn't user-triggerable? +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, + struct key *key, unsigned offset, unsigned to) +{ + unsigned eof, tail, start, stop, len; + loff_t i_size, pos; + void *p; + int ret; + + _enter(); + + if (offset == 0 to == PAGE_SIZE) + return 0; + + p = kmap(page); + + i_size = i_size_read(vnode-vfs_inode); + pos = (loff_t) page-index PAGE_SHIFT; + if (pos = i_size) { + /* partial write, page beyond EOF */ + _debug(beyond); + if (offset 0) + memset(p, 0, offset); + if (to PAGE_SIZE) + memset(p + to, 0, PAGE_SIZE - to); + kunmap(page); + return 0; + } + + if (i_size - pos = PAGE_SIZE) { + /* partial write, page entirely before EOF */ + _debug(before); + tail = eof = PAGE_SIZE; + } else { + /* partial write, page overlaps EOF */ + eof = i_size - pos; + _debug(overlap %u, eof); + tail = max(eof, to); + if (tail PAGE_SIZE) + memset(p + tail, 0, PAGE_SIZE - tail); + if (offset eof) + memset(p + eof, 0, PAGE_SIZE - eof); + } + + kunmap(p); kmap_atomic() could be used here and is better. We have this zero_user_page() thing heading in which could perhaps be used here also. + ret = 0; + if (offset 0 || eof to) { + /* need to fill one or two bits that aren't going to be written + * (cover both fillers in one read if there are two) */ + start = (offset 0) ? 0 : to; + stop = (eof to) ? eof : offset; + len = stop - start; + _debug(wr=%u-%u av=0-%u [EMAIL PROTECTED], +offset, to, eof, start, len); + ret = afs_fill_page(vnode, key, start, len, page); + } + + _leave( = %d, ret); + return ret; +} + ... + ASSERTRANGE(wb-first, =, index, =, wb-last); wow. +} + +/* + * finalise part of a write to a page + */ +int afs_commit_write(struct file *file, struct page *page, + unsigned offset, unsigned to) +{ + struct afs_vnode *vnode = AFS_FS_I(file-f_dentry-d_inode); + loff_t i_size, maybe_i_size; + + _enter({%x:%u},{%lx},%u,%u, +vnode-fid.vid, vnode-fid.vnode, page-index, offset, to); + + maybe_i_size = (loff_t) page-index PAGE_SHIFT; + maybe_i_size += to; + + i_size = i_size_read(vnode-vfs_inode); + if (maybe_i_size i_size) { + spin_lock(vnode-writeback_lock); + i_size = i_size_read(vnode-vfs_inode); + if (maybe_i_size i_size) + i_size_write(vnode-vfs_inode, maybe_i_size); +