Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-11 Thread Nick Piggin

David Howells wrote:

Nick Piggin <[EMAIL PROTECTED]> wrote:



Why do you call SetPageUptodate when the page is not up to date?
That leaks uninitialised data, AFAIKS.



It only seems that way.  If afs_prepare_write() is called, but doesn't return
an error, then afs_commit_write() will be called, and it seems that the copy
in of the data will be guaranteed not to fail by the caller.


Not only does it seem that way, it is that way :) PG_uptodate is being set
when the page is not uptodate, isn't it?



Furthermore, afs_prepare_page() will have filled in the missing bits.

And whilst all that is going on, the page lock will be help by the caller, so
that no-one else can access the partially complete page.


When a page is uptodate in pagecache, the generic read and nopage functions
do not take the page lock. So how are you excluding those?

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-11 Thread Nick Piggin

David Howells wrote:

Nick Piggin [EMAIL PROTECTED] wrote:



Why do you call SetPageUptodate when the page is not up to date?
That leaks uninitialised data, AFAIKS.



It only seems that way.  If afs_prepare_write() is called, but doesn't return
an error, then afs_commit_write() will be called, and it seems that the copy
in of the data will be guaranteed not to fail by the caller.


Not only does it seem that way, it is that way :) PG_uptodate is being set
when the page is not uptodate, isn't it?



Furthermore, afs_prepare_page() will have filled in the missing bits.

And whilst all that is going on, the page lock will be help by the caller, so
that no-one else can access the partially complete page.


When a page is uptodate in pagecache, the generic read and nopage functions
do not take the page lock. So how are you excluding those?

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-10 Thread David Howells
Nick Piggin <[EMAIL PROTECTED]> wrote:

> Why do you call SetPageUptodate when the page is not up to date?
> That leaks uninitialised data, AFAIKS.

It only seems that way.  If afs_prepare_write() is called, but doesn't return
an error, then afs_commit_write() will be called, and it seems that the copy
in of the data will be guaranteed not to fail by the caller.

Furthermore, afs_prepare_page() will have filled in the missing bits.

And whilst all that is going on, the page lock will be help by the caller, so
that no-one else can access the partially complete page.

I suppose I could call SetPageUptodate() in afs_commit_write() instead.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-10 Thread David Howells
Nick Piggin [EMAIL PROTECTED] wrote:

 Why do you call SetPageUptodate when the page is not up to date?
 That leaks uninitialised data, AFAIKS.

It only seems that way.  If afs_prepare_write() is called, but doesn't return
an error, then afs_commit_write() will be called, and it seems that the copy
in of the data will be guaranteed not to fail by the caller.

Furthermore, afs_prepare_page() will have filled in the missing bits.

And whilst all that is going on, the page lock will be help by the caller, so
that no-one else can access the partially complete page.

I suppose I could call SetPageUptodate() in afs_commit_write() instead.

David
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Nick Piggin

David Howells wrote:


+/*
+ * prepare a page for being written to
+ */
+static int afs_prepare_page(struct afs_vnode *vnode, struct page *page,
+   struct key *key, unsigned offset, unsigned to)
+{
+   unsigned eof, tail, start, stop, len;
+   loff_t i_size, pos;
+   void *p;
+   int ret;
+
+   _enter("");
+
+   if (offset == 0 && to == PAGE_SIZE)
+   return 0;
+
+   p = kmap(page);
+
+   i_size = i_size_read(>vfs_inode);
+   pos = (loff_t) page->index << PAGE_SHIFT;
+   if (pos >= i_size) {
+   /* partial write, page beyond EOF */
+   _debug("beyond");
+   if (offset > 0)
+   memset(p, 0, offset);
+   if (to < PAGE_SIZE)
+   memset(p + to, 0, PAGE_SIZE - to);
+   kunmap(page);
+   return 0;
+   }
+
+   if (i_size - pos >= PAGE_SIZE) {
+   /* partial write, page entirely before EOF */
+   _debug("before");
+   tail = eof = PAGE_SIZE;
+   } else {
+   /* partial write, page overlaps EOF */
+   eof = i_size - pos;
+   _debug("overlap %u", eof);
+   tail = max(eof, to);
+   if (tail < PAGE_SIZE)
+   memset(p + tail, 0, PAGE_SIZE - tail);
+   if (offset > eof)
+   memset(p + eof, 0, PAGE_SIZE - eof);
+   }
+
+   kunmap(p);
+
+   ret = 0;
+   if (offset > 0 || eof > to) {
+   /* need to fill one or two bits that aren't going to be written
+* (cover both fillers in one read if there are two) */
+   start = (offset > 0) ? 0 : to;
+   stop = (eof > to) ? eof : offset;
+   len = stop - start;
+   _debug("wr=%u-%u av=0-%u [EMAIL PROTECTED]",
+  offset, to, eof, start, len);
+   ret = afs_fill_page(vnode, key, start, len, page);
+   }
+
+   _leave(" = %d", ret);
+   return ret;
+}
+
+/*
+ * prepare to perform part of a write to a page
+ * - the caller holds the page locked, preventing it from being written out or
+ *   modified by anyone else
+ */
+int afs_prepare_write(struct file *file, struct page *page,
+ unsigned offset, unsigned to)
+{
+   struct afs_writeback *candidate, *wb;
+   struct afs_vnode *vnode = AFS_FS_I(file->f_dentry->d_inode);
+   struct key *key = file->private_data;
+   pgoff_t index;
+   int ret;
+
+   _enter("{%x:%u},{%lx},%u,%u",
+  vnode->fid.vid, vnode->fid.vnode, page->index, offset, to);
+
+   candidate = kzalloc(sizeof(*candidate), GFP_KERNEL);
+   if (!candidate)
+   return -ENOMEM;
+   candidate->vnode = vnode;
+   candidate->first = candidate->last = page->index;
+   candidate->offset_first = offset;
+   candidate->to_last = to;
+   candidate->usage = 1;
+   candidate->state = AFS_WBACK_PENDING;
+   init_waitqueue_head(>waitq);
+
+   if (!PageUptodate(page)) {
+   _debug("not up to date");
+   ret = afs_prepare_page(vnode, page, key, offset, to);
+   if (ret < 0) {
+   kfree(candidate);
+   _leave(" = %d [prep]", ret);
+   return ret;
+   }
+   SetPageUptodate(page);
+   }



Why do you call SetPageUptodate when the page is not up to date?
That leaks uninitialised data, AFAIKS.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Andrew Morton
On Wed, 09 May 2007 12:07:39 +0100 David Howells <[EMAIL PROTECTED]> wrote:

> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
> > pagecache data.
> > 
> > To tell the VFS that the inode itself is dirty one needs to run
> > mark_inode_dirty().
> 
> But what's the difference in this case?  I don't need to write the inode back
> per se, and the inode attributes can be updated by the mechanism of data
> storage.
> 

Ah.  Well if you don't need to write the inode back then sure, there
shouldn't be a need to mark it dirty.  That's what I was asking ;)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread David Howells
Andrew Morton <[EMAIL PROTECTED]> wrote:

> set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
> pagecache data.
> 
> To tell the VFS that the inode itself is dirty one needs to run
> mark_inode_dirty().

But what's the difference in this case?  I don't need to write the inode back
per se, and the inode attributes can be updated by the mechanism of data
storage.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Andrew Morton
On Wed, 09 May 2007 11:25:47 +0100 David Howells <[EMAIL PROTECTED]> wrote:

> > > + set_page_dirty(page);
> > > +
> > > + if (PageDirty(page))
> > > + _debug("dirtied");
> > > +
> > > + return 0;
> > > +}
> > 
> > One would normally run mark_inode_dirty() after any i_size_write()?
> 
> Not in this case, I assume, because set_page_dirty() ultimately calls
> __mark_inode_dirty(), but I could be wrong.

set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
pagecache data.

To tell the VFS that the inode itself is dirty one needs to run
mark_inode_dirty().  Or maybe mark_inode_dirty_sync() but I can never for
the life of me remember what that thing does.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread David Howells
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > +   BUG_ON(i_size > 0x); // TODO: use 64-bit store
> 
> You're sure this isn't user-triggerable?

Hmmm...  I'm not.  I'll whip up a patch for this.

> kmap_atomic() could be used here and is better.

Yeah.  It used to have something that slept in the middle of it, but that's no
longer there.  I'll add to the patch.

> We have this zero_user_page() thing heading in which could perhaps be used
> here also.

Okay.  I'll have a look at it once it's there.

> > +   ASSERTRANGE(wb->first, <=, index, <=, wb->last);
> 
> wow.

:-)

The assertions I've put in have been very useful.

> > +   set_page_dirty(page);
> > +
> > +   if (PageDirty(page))
> > +   _debug("dirtied");
> > +
> > +   return 0;
> > +}
> 
> One would normally run mark_inode_dirty() after any i_size_write()?

Not in this case, I assume, because set_page_dirty() ultimately calls
__mark_inode_dirty(), but I could be wrong.

> We can invalidate pages and we can truncate them and we can clean them. 
> But here we have a new operation, "killing".  I wonder what that is.

I can call it invalidation if you like, though that name is already reserved
as it were:-/  I suppose it might actually make sense for me to call
invalidatepage() myself.

> > +   if (wbc->sync_mode != WB_SYNC_NONE)
> > +   wait_on_page_writeback(page);
> 
> Didn't the VFS already do that?

I'm not entirely sure.  Looking at generic_writepages(), I guess so.  I'll
patch it out.

> > +   if (PageWriteback(page) || !PageDirty(page)) {
> > +   unlock_page(page);
> > +   return 0;
> > +   }
> 
> And some of that?

Yeah.  Seems so.  I'll patch that out too.

What I'd like to do is ditch writepage() entirely - I'm not sure it's entirely
necessary with the availability of writepages() - but I'll look at that
another time.

> I have this vague prehistoric memory that something can go wrong at the VFS
> level if the address_space writes back more pages than it was asked to. 
> But I forget what the issue was and it would be silly to have an issue
> with that anyway.  Something to keep an eye out for.

Okay.

Thanks for the 'cherry-pick'.  I'll hopefully have a revision patch for you
soon.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

  +   BUG_ON(i_size  0x); // TODO: use 64-bit store
 
 You're sure this isn't user-triggerable?

Hmmm...  I'm not.  I'll whip up a patch for this.

 kmap_atomic() could be used here and is better.

Yeah.  It used to have something that slept in the middle of it, but that's no
longer there.  I'll add to the patch.

 We have this zero_user_page() thing heading in which could perhaps be used
 here also.

Okay.  I'll have a look at it once it's there.

  +   ASSERTRANGE(wb-first, =, index, =, wb-last);
 
 wow.

:-)

The assertions I've put in have been very useful.

  +   set_page_dirty(page);
  +
  +   if (PageDirty(page))
  +   _debug(dirtied);
  +
  +   return 0;
  +}
 
 One would normally run mark_inode_dirty() after any i_size_write()?

Not in this case, I assume, because set_page_dirty() ultimately calls
__mark_inode_dirty(), but I could be wrong.

 We can invalidate pages and we can truncate them and we can clean them. 
 But here we have a new operation, killing.  I wonder what that is.

I can call it invalidation if you like, though that name is already reserved
as it were:-/  I suppose it might actually make sense for me to call
invalidatepage() myself.

  +   if (wbc-sync_mode != WB_SYNC_NONE)
  +   wait_on_page_writeback(page);
 
 Didn't the VFS already do that?

I'm not entirely sure.  Looking at generic_writepages(), I guess so.  I'll
patch it out.

  +   if (PageWriteback(page) || !PageDirty(page)) {
  +   unlock_page(page);
  +   return 0;
  +   }
 
 And some of that?

Yeah.  Seems so.  I'll patch that out too.

What I'd like to do is ditch writepage() entirely - I'm not sure it's entirely
necessary with the availability of writepages() - but I'll look at that
another time.

 I have this vague prehistoric memory that something can go wrong at the VFS
 level if the address_space writes back more pages than it was asked to. 
 But I forget what the issue was and it would be silly to have an issue
 with that anyway.  Something to keep an eye out for.

Okay.

Thanks for the 'cherry-pick'.  I'll hopefully have a revision patch for you
soon.

David
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Andrew Morton
On Wed, 09 May 2007 11:25:47 +0100 David Howells [EMAIL PROTECTED] wrote:

   + set_page_dirty(page);
   +
   + if (PageDirty(page))
   + _debug(dirtied);
   +
   + return 0;
   +}
  
  One would normally run mark_inode_dirty() after any i_size_write()?
 
 Not in this case, I assume, because set_page_dirty() ultimately calls
 __mark_inode_dirty(), but I could be wrong.

set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
pagecache data.

To tell the VFS that the inode itself is dirty one needs to run
mark_inode_dirty().  Or maybe mark_inode_dirty_sync() but I can never for
the life of me remember what that thing does.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

 set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
 pagecache data.
 
 To tell the VFS that the inode itself is dirty one needs to run
 mark_inode_dirty().

But what's the difference in this case?  I don't need to write the inode back
per se, and the inode attributes can be updated by the mechanism of data
storage.

David
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Andrew Morton
On Wed, 09 May 2007 12:07:39 +0100 David Howells [EMAIL PROTECTED] wrote:

 Andrew Morton [EMAIL PROTECTED] wrote:
 
  set_page_dirty() will set I_DIRTY_PAGES only.  ie: the inode has dirty
  pagecache data.
  
  To tell the VFS that the inode itself is dirty one needs to run
  mark_inode_dirty().
 
 But what's the difference in this case?  I don't need to write the inode back
 per se, and the inode attributes can be updated by the mechanism of data
 storage.
 

Ah.  Well if you don't need to write the inode back then sure, there
shouldn't be a need to mark it dirty.  That's what I was asking ;)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-09 Thread Nick Piggin

David Howells wrote:


+/*
+ * prepare a page for being written to
+ */
+static int afs_prepare_page(struct afs_vnode *vnode, struct page *page,
+   struct key *key, unsigned offset, unsigned to)
+{
+   unsigned eof, tail, start, stop, len;
+   loff_t i_size, pos;
+   void *p;
+   int ret;
+
+   _enter();
+
+   if (offset == 0  to == PAGE_SIZE)
+   return 0;
+
+   p = kmap(page);
+
+   i_size = i_size_read(vnode-vfs_inode);
+   pos = (loff_t) page-index  PAGE_SHIFT;
+   if (pos = i_size) {
+   /* partial write, page beyond EOF */
+   _debug(beyond);
+   if (offset  0)
+   memset(p, 0, offset);
+   if (to  PAGE_SIZE)
+   memset(p + to, 0, PAGE_SIZE - to);
+   kunmap(page);
+   return 0;
+   }
+
+   if (i_size - pos = PAGE_SIZE) {
+   /* partial write, page entirely before EOF */
+   _debug(before);
+   tail = eof = PAGE_SIZE;
+   } else {
+   /* partial write, page overlaps EOF */
+   eof = i_size - pos;
+   _debug(overlap %u, eof);
+   tail = max(eof, to);
+   if (tail  PAGE_SIZE)
+   memset(p + tail, 0, PAGE_SIZE - tail);
+   if (offset  eof)
+   memset(p + eof, 0, PAGE_SIZE - eof);
+   }
+
+   kunmap(p);
+
+   ret = 0;
+   if (offset  0 || eof  to) {
+   /* need to fill one or two bits that aren't going to be written
+* (cover both fillers in one read if there are two) */
+   start = (offset  0) ? 0 : to;
+   stop = (eof  to) ? eof : offset;
+   len = stop - start;
+   _debug(wr=%u-%u av=0-%u [EMAIL PROTECTED],
+  offset, to, eof, start, len);
+   ret = afs_fill_page(vnode, key, start, len, page);
+   }
+
+   _leave( = %d, ret);
+   return ret;
+}
+
+/*
+ * prepare to perform part of a write to a page
+ * - the caller holds the page locked, preventing it from being written out or
+ *   modified by anyone else
+ */
+int afs_prepare_write(struct file *file, struct page *page,
+ unsigned offset, unsigned to)
+{
+   struct afs_writeback *candidate, *wb;
+   struct afs_vnode *vnode = AFS_FS_I(file-f_dentry-d_inode);
+   struct key *key = file-private_data;
+   pgoff_t index;
+   int ret;
+
+   _enter({%x:%u},{%lx},%u,%u,
+  vnode-fid.vid, vnode-fid.vnode, page-index, offset, to);
+
+   candidate = kzalloc(sizeof(*candidate), GFP_KERNEL);
+   if (!candidate)
+   return -ENOMEM;
+   candidate-vnode = vnode;
+   candidate-first = candidate-last = page-index;
+   candidate-offset_first = offset;
+   candidate-to_last = to;
+   candidate-usage = 1;
+   candidate-state = AFS_WBACK_PENDING;
+   init_waitqueue_head(candidate-waitq);
+
+   if (!PageUptodate(page)) {
+   _debug(not up to date);
+   ret = afs_prepare_page(vnode, page, key, offset, to);
+   if (ret  0) {
+   kfree(candidate);
+   _leave( = %d [prep], ret);
+   return ret;
+   }
+   SetPageUptodate(page);
+   }



Why do you call SetPageUptodate when the page is not up to date?
That leaks uninitialised data, AFAIKS.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-08 Thread Andrew Morton
On Tue, 08 May 2007 20:44:11 +0100
David Howells <[EMAIL PROTECTED]> wrote:

> Implement support for writing to regular AFS files, including:
> 
>  (1) write
> 
>  (2) truncate
> 
>  (3) fsync, fdatasync
> 
>  (4) chmod, chown, chgrp, utime.
> 
> AFS writeback attempts to batch writes into as chunks as large as it can 
> manage
> up to the point that it writes back 65535 pages in one chunk or it meets a
> locked page.
> 
> Furthermore, if a page has been written to using a particular key, then should
> another write to that page use some other key, the first write will be flushed
> before the second is allowed to take place.  If the first write fails due to a
> security error, then the page will be scrapped and reread before the second
> write takes place.
> 
> If a page is dirty and the callback on it is broken by the server, then the
> dirty data is not discarded (same behaviour as NFS).
> 
> Shared-writable mappings are not supported by this patch.


The below isn't a review - it's some random cherrypickling.

> ...
>
> +int afs_fs_store_data(struct afs_server *server, struct afs_writeback *wb,
> +   pgoff_t first, pgoff_t last,
> +   unsigned offset, unsigned to,
> +   const struct afs_wait_mode *wait_mode)
> +{
> + struct afs_vnode *vnode = wb->vnode;
> + struct afs_call *call;
> + loff_t size, pos, i_size;
> + __be32 *bp;
> +
> + _enter(",%x,{%x:%u},,",
> +key_serial(wb->key), vnode->fid.vid, vnode->fid.vnode);
> +
> + size = to - offset;
> + if (first != last)
> + size += (loff_t)(last - first) << PAGE_SHIFT;
> + pos = (loff_t)first << PAGE_SHIFT;
> + pos += offset;
> +
> + i_size = i_size_read(>vfs_inode);
> + if (pos + size > i_size)
> + i_size = size + pos;
> +
> + _debug("size %llx, at %llx, i_size %llx",
> +(unsigned long long) size, (unsigned long long) pos,
> +(unsigned long long) i_size);
> +
> + BUG_ON(i_size > 0x); // TODO: use 64-bit store

You're sure this isn't user-triggerable?

> +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page,
> + struct key *key, unsigned offset, unsigned to)
> +{
> + unsigned eof, tail, start, stop, len;
> + loff_t i_size, pos;
> + void *p;
> + int ret;
> +
> + _enter("");
> +
> + if (offset == 0 && to == PAGE_SIZE)
> + return 0;
> +
> + p = kmap(page);
> +
> + i_size = i_size_read(>vfs_inode);
> + pos = (loff_t) page->index << PAGE_SHIFT;
> + if (pos >= i_size) {
> + /* partial write, page beyond EOF */
> + _debug("beyond");
> + if (offset > 0)
> + memset(p, 0, offset);
> + if (to < PAGE_SIZE)
> + memset(p + to, 0, PAGE_SIZE - to);
> + kunmap(page);
> + return 0;
> + }
> +
> + if (i_size - pos >= PAGE_SIZE) {
> + /* partial write, page entirely before EOF */
> + _debug("before");
> + tail = eof = PAGE_SIZE;
> + } else {
> + /* partial write, page overlaps EOF */
> + eof = i_size - pos;
> + _debug("overlap %u", eof);
> + tail = max(eof, to);
> + if (tail < PAGE_SIZE)
> + memset(p + tail, 0, PAGE_SIZE - tail);
> + if (offset > eof)
> + memset(p + eof, 0, PAGE_SIZE - eof);
> + }
> +
> + kunmap(p);

kmap_atomic() could be used here and is better.

We have this zero_user_page() thing heading in which could perhaps be used
here also.


> + ret = 0;
> + if (offset > 0 || eof > to) {
> + /* need to fill one or two bits that aren't going to be written
> +  * (cover both fillers in one read if there are two) */
> + start = (offset > 0) ? 0 : to;
> + stop = (eof > to) ? eof : offset;
> + len = stop - start;
> + _debug("wr=%u-%u av=0-%u [EMAIL PROTECTED]",
> +offset, to, eof, start, len);
> + ret = afs_fill_page(vnode, key, start, len, page);
> + }
> +
> + _leave(" = %d", ret);
> + return ret;
> +}
> +
> 
> ...

> + ASSERTRANGE(wb->first, <=, index, <=, wb->last);

wow.

> +}
> +
> +/*
> + * finalise part of a write to a page
> + */
> +int afs_commit_write(struct file *file, struct page *page,
> +  unsigned offset, unsigned to)
> +{
> + struct afs_vnode *vnode = AFS_FS_I(file->f_dentry->d_inode);
> + loff_t i_size, maybe_i_size;
> +
> + _enter("{%x:%u},{%lx},%u,%u",
> +vnode->fid.vid, vnode->fid.vnode, page->index, offset, to);
> +
> + maybe_i_size = (loff_t) page->index << PAGE_SHIFT;
> + maybe_i_size += to;
> +
> + i_size = i_size_read(>vfs_inode);
> + if (maybe_i_size > i_size) {
> + spin_lock(>writeback_lock);
> + i_size = 

Re: [PATCH 3/3] AFS: Implement basic file write support

2007-05-08 Thread Andrew Morton
On Tue, 08 May 2007 20:44:11 +0100
David Howells [EMAIL PROTECTED] wrote:

 Implement support for writing to regular AFS files, including:
 
  (1) write
 
  (2) truncate
 
  (3) fsync, fdatasync
 
  (4) chmod, chown, chgrp, utime.
 
 AFS writeback attempts to batch writes into as chunks as large as it can 
 manage
 up to the point that it writes back 65535 pages in one chunk or it meets a
 locked page.
 
 Furthermore, if a page has been written to using a particular key, then should
 another write to that page use some other key, the first write will be flushed
 before the second is allowed to take place.  If the first write fails due to a
 security error, then the page will be scrapped and reread before the second
 write takes place.
 
 If a page is dirty and the callback on it is broken by the server, then the
 dirty data is not discarded (same behaviour as NFS).
 
 Shared-writable mappings are not supported by this patch.


The below isn't a review - it's some random cherrypickling.

 ...

 +int afs_fs_store_data(struct afs_server *server, struct afs_writeback *wb,
 +   pgoff_t first, pgoff_t last,
 +   unsigned offset, unsigned to,
 +   const struct afs_wait_mode *wait_mode)
 +{
 + struct afs_vnode *vnode = wb-vnode;
 + struct afs_call *call;
 + loff_t size, pos, i_size;
 + __be32 *bp;
 +
 + _enter(,%x,{%x:%u},,,
 +key_serial(wb-key), vnode-fid.vid, vnode-fid.vnode);
 +
 + size = to - offset;
 + if (first != last)
 + size += (loff_t)(last - first)  PAGE_SHIFT;
 + pos = (loff_t)first  PAGE_SHIFT;
 + pos += offset;
 +
 + i_size = i_size_read(vnode-vfs_inode);
 + if (pos + size  i_size)
 + i_size = size + pos;
 +
 + _debug(size %llx, at %llx, i_size %llx,
 +(unsigned long long) size, (unsigned long long) pos,
 +(unsigned long long) i_size);
 +
 + BUG_ON(i_size  0x); // TODO: use 64-bit store

You're sure this isn't user-triggerable?

 +static int afs_prepare_page(struct afs_vnode *vnode, struct page *page,
 + struct key *key, unsigned offset, unsigned to)
 +{
 + unsigned eof, tail, start, stop, len;
 + loff_t i_size, pos;
 + void *p;
 + int ret;
 +
 + _enter();
 +
 + if (offset == 0  to == PAGE_SIZE)
 + return 0;
 +
 + p = kmap(page);
 +
 + i_size = i_size_read(vnode-vfs_inode);
 + pos = (loff_t) page-index  PAGE_SHIFT;
 + if (pos = i_size) {
 + /* partial write, page beyond EOF */
 + _debug(beyond);
 + if (offset  0)
 + memset(p, 0, offset);
 + if (to  PAGE_SIZE)
 + memset(p + to, 0, PAGE_SIZE - to);
 + kunmap(page);
 + return 0;
 + }
 +
 + if (i_size - pos = PAGE_SIZE) {
 + /* partial write, page entirely before EOF */
 + _debug(before);
 + tail = eof = PAGE_SIZE;
 + } else {
 + /* partial write, page overlaps EOF */
 + eof = i_size - pos;
 + _debug(overlap %u, eof);
 + tail = max(eof, to);
 + if (tail  PAGE_SIZE)
 + memset(p + tail, 0, PAGE_SIZE - tail);
 + if (offset  eof)
 + memset(p + eof, 0, PAGE_SIZE - eof);
 + }
 +
 + kunmap(p);

kmap_atomic() could be used here and is better.

We have this zero_user_page() thing heading in which could perhaps be used
here also.


 + ret = 0;
 + if (offset  0 || eof  to) {
 + /* need to fill one or two bits that aren't going to be written
 +  * (cover both fillers in one read if there are two) */
 + start = (offset  0) ? 0 : to;
 + stop = (eof  to) ? eof : offset;
 + len = stop - start;
 + _debug(wr=%u-%u av=0-%u [EMAIL PROTECTED],
 +offset, to, eof, start, len);
 + ret = afs_fill_page(vnode, key, start, len, page);
 + }
 +
 + _leave( = %d, ret);
 + return ret;
 +}
 +
 
 ...

 + ASSERTRANGE(wb-first, =, index, =, wb-last);

wow.

 +}
 +
 +/*
 + * finalise part of a write to a page
 + */
 +int afs_commit_write(struct file *file, struct page *page,
 +  unsigned offset, unsigned to)
 +{
 + struct afs_vnode *vnode = AFS_FS_I(file-f_dentry-d_inode);
 + loff_t i_size, maybe_i_size;
 +
 + _enter({%x:%u},{%lx},%u,%u,
 +vnode-fid.vid, vnode-fid.vnode, page-index, offset, to);
 +
 + maybe_i_size = (loff_t) page-index  PAGE_SHIFT;
 + maybe_i_size += to;
 +
 + i_size = i_size_read(vnode-vfs_inode);
 + if (maybe_i_size  i_size) {
 + spin_lock(vnode-writeback_lock);
 + i_size = i_size_read(vnode-vfs_inode);
 + if (maybe_i_size  i_size)
 + i_size_write(vnode-vfs_inode, maybe_i_size);
 +