Re: [RFC] ext3 freeze feature

2008-01-25 Thread Pekka Enberg
Hi,

 diff -uprN -X linux-2.6.24-rc8/Documentation/dontdiff 
 linux-2.6.24-rc8/include/linux/ext3_fs_sb.h 
 linux-2.6.24-rc8-freeze/include/linux/ext3_fs_sb.h
 --- linux-2.6.24-rc8/include/linux/ext3_fs_sb.h 2008-01-16 13:22:48.0 
 +0900
 +++ linux-2.6.24-rc8-freeze/include/linux/ext3_fs_sb.h  2008-01-22 
 18:20:33.0 +0900
 @@ -81,6 +81,8 @@ struct ext3_sb_info {
 char *s_qf_names[MAXQUOTAS];/* Names of quota files with 
 journalled quota */
 int s_jquota_fmt;   /* Format of quota to use */
  #endif
 +   /* Delayed work for freeze */
 +   struct delayed_work s_freeze_timeout;

Why not put this in struct super_block? Then you don't need this

 +/**
 + * get_super_block - get super_block
 + * @s_fs_info  : filesystem dependent information
 + *   (super_block.s_fs_info)
 + *
 + * Get super_block which holds s_fs_info from super_blocks.
 + * get_super_block() returns a pointer of super block or
 + * %NULL if it have failed.
 + */
 +struct super_block *get_super_block(void *s_fs_info)
 +{

And these can be put to generic code:

  /*
 + * ext3_add_freeze_timeout - Add timeout for ext3 freeze.
 + *
 + * @sbi: ext3 super block
 + * @timeout_msec   : timeout period
 + *
 + * Add the delayed work for ext3 freeze timeout
 + * to the delayed work queue.
 + */
 +void ext3_add_freeze_timeout(struct ext3_sb_info *sbi,
 +   long timeout_msec)
 +{
 +   s64 timeout_jiffies = msecs_to_jiffies(timeout_msec);
 +
 +   /*
 +* setup freeze timeout function
 +*/
 +   INIT_DELAYED_WORK(sbi-s_freeze_timeout, ext3_freeze_timeout);
 +
 +   /* set delayed work queue */
 +   cancel_delayed_work(sbi-s_freeze_timeout);
 +   schedule_delayed_work(sbi-s_freeze_timeout, timeout_jiffies);
 +}
 +
 +/*
 + * ext3_del_freeze_timeout - Delete timeout for ext3 freeze.
 + *
 + * @sbi: ext3 super block
 + *
 + * Delete the delayed work for ext3 freeze timeout
 + * from the delayed work queue.
 + */
 +void ext3_del_freeze_timeout(struct ext3_sb_info *sbi)
 +{
 +   if (delayed_work_pending(sbi-s_freeze_timeout))
 +   cancel_delayed_work(sbi-s_freeze_timeout);
 +}

 +/*
 + * ext3_freeze_timeout - Thaw the filesystem.
 + *
 + * @work   : work queue (delayed_work.work)
 + *
 + * Called by the delayed work when elapsing the timeout period.
 + * Thaw the filesystem.
 + */
 +static void ext3_freeze_timeout(struct work_struct *work)
 +{
 +   struct ext3_sb_info *sbi = container_of(work,
 +   struct ext3_sb_info,
 +   s_freeze_timeout.work);
 +   struct super_block *sb = get_super_block(sbi);
 +
 +   BUG_ON(sb == NULL);
 +
 +   if (sb-s_frozen != SB_UNFROZEN)
 +   thaw_bdev(sb-s_bdev, sb);
 +}
 +

I am also wondering whether we should have system call(s) for these:

On Jan 25, 2008 12:59 PM, Takashi Sato [EMAIL PROTECTED] wrote:
 +   case EXT3_IOC_FREEZE: {

 +   case EXT3_IOC_THAW: {

And just convert XFS to use them too?

Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-26 Thread Pekka Enberg
Hi Hugh,

On 10/25/07, Hugh Dickins [EMAIL PROTECTED] wrote:
 @@ -349,10 +349,6 @@ static pageout_t pageout(struct page *pa
 res = mapping-a_ops-writepage(page, wbc);
 if (res  0)
 handle_write_error(mapping, page, res);
 -   if (res == AOP_WRITEPAGE_ACTIVATE) {
 -   ClearPageReclaim(page);
 -   return PAGE_ACTIVATE;
 -   }

I don't see ClearPageReclaim added anywhere. Is that done on purpose?
Other than that, the patch looks good to me and I think we should
stick it into -mm to punish Andrew for his secret hack ;-).

  Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-26 Thread Pekka Enberg
Hi,

On 10/26/07, Neil Brown [EMAIL PROTECTED] wrote:
 It seems that the new requirement is that if the address_space
 chooses not to write out the page, it should now call SetPageActive().
 If that is the case, I think it should be explicit in the
 documentation - please?

Agreed.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Pekka Enberg
Hi Hugh,

On 10/22/07, Hugh Dickins [EMAIL PROTECTED] wrote:
 I don't disagree with your unionfs_writepages patch, Pekka, but I think
 it should be viewed as an optimization (don't waste time trying to write
 a group of pages when we know that nothing will be done) rather than as
 essential.

Ok, so tmpfs needs your fix still.

On 10/22/07, Hugh Dickins [EMAIL PROTECTED] wrote:
  So now I wonder if we still need the patch to prevent AOP_WRITEPAGE_ACTIVATE
  from being returned to userland.  I guess we still need it, b/c even with
  your patch, generic_writepages() can return AOP_WRITEPAGE_ACTIVATE back to
  the VFS and we need to ensure that doesn't leak outside the kernel.

 Can it now?  Current git has a patch from Andrew which bears a striking
 resemblance to that from Pekka, stopping the leak from write_cache_pages.

I don't think it can, it looks ok now.

 Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Pekka Enberg
Hi Hugh,

On Mon, 15 Oct 2007, Pekka Enberg wrote:
  I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that
  -writepage() will never return AOP_WRITEPAGE_ACTIVATE for
  !wbc-for_reclaim case which would explain why we haven't hit this bug
  before. Hugh, Andrew?

On 10/22/07, Hugh Dickins [EMAIL PROTECTED] wrote:
 Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE.
 Both of those set BDI_CAP_NO_WRITEBACK.  ramdisk never returned it
 if !wbc-for_reclaim.  I contend that shmem shouldn't either: it's
 a special code to get the LRU rotation right, not useful elsewhere.
 Though Documentation/filesystems/vfs.txt does imply wider use.

 I think this is where people use the phrase go figure ;)

Heh. As far as I can tell, the implication of wider use was added by
Neil in commit 341546f5ad6fce584531f744853a5807a140f2a9 Update some
VFS documentation, so perhaps he might know? Neil?

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-15 Thread Pekka Enberg
Hi,

On 10/15/07, Erez Zadok [EMAIL PROTECTED] wrote:
 Pekka, with a small change to your patch (to handle time-based cache
 coherency), your patch worked well and passed all my tests.  Thanks.

 So now I wonder if we still need the patch to prevent AOP_WRITEPAGE_ACTIVATE
 from being returned to userland.  I guess we still need it, b/c even with
 your patch, generic_writepages() can return AOP_WRITEPAGE_ACTIVATE back to
 the VFS and we need to ensure that doesn't leak outside the kernel.

I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that
-writepage() will never return AOP_WRITEPAGE_ACTIVATE for
!wbc-for_reclaim case which would explain why we haven't hit this bug
before. Hugh, Andrew?

And btw, I think we need to fix ecryptfs too.

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Pekka Enberg
Hi Hugh,

On Sat, 13 Oct 2007, Pekka Enberg wrote:
 Doesn't msync(2) get to it via mm/page-writeback.c:write_cache_pages()
 without unionfs even?

On 10/14/07, Hugh Dickins [EMAIL PROTECTED] wrote:
 I believe not.  Please do double-check my assertions, I've always found
 the _writepages paths rather twisty; but my belief (supported by the
 fact that we've not hit shmem_writepage's BUG_ON(page_mapped(page))
 in five years) is that tmpfs/shmem opts out of all of that with its
 .capabilities   = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK,
 in shmem_backing_dev_info, which avoids all those _writepages avenues
 (e.g. through bdi_cap_writeback_dirty tests), and write_cache_pages is
 just a subfunction of the _writepages.

Thanks for the explanation, you're obviously correct.

However, I don't think the mapping_cap_writeback_dirty() check in
__filemap_fdatawrite_range() works as expected when tmpfs is a lower
mount for an unionfs mount. There's no BDI_CAP_NO_WRITEBACK capability
for unionfs mappings so do_fsync() will call write_cache_pages() that
unconditionally invokes shmem_writepage() via unionfs_writepage().
Unless, of course, there's some other unionfs magic I am missing.

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-12 Thread Pekka Enberg
Hi Hugh,

On 10/12/07, Hugh Dickins [EMAIL PROTECTED] wrote:
 But I keep suspecting that the answer might be the patch below (which
 rather follows what drivers/block/rd.c is doing).  I'm especially
 worried that, rather than just AOP_WRITEPAGE_ACTIVATE being returned
 to userspace, bad enough in itself, you might be liable to hit that
 BUG_ON(page_mapped(page)).  shmem_writepage does not expect to be
 called by anyone outside mm/vmscan.c, but unionfs can now get to it?

Doesn't msync(2) get to it via mm/page-writeback.c:write_cache_pages()
without unionfs even?

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-08 Thread Pekka Enberg
Hi Ryan,

On 10/8/07, Ryan Finnie [EMAIL PROTECTED] wrote:
 Doesn't appear to be enough.  I can't figure out why (since it appears
 write_cache_pages bubbles up directly to sys_msync), but with that
 patch applied, in my test case[1], msync returns -1 EIO.  However,
 with the exact same kernel without that patch applied, msync returns
 524288 (AOP_WRITEPAGE_ACTIVATE).  But as your patch specifically flips
 524288 to 0, I can't figure out how it eventually returns  -1 EIO.

 [1] apt-get check on a unionfs2 mount backed by tmpfs over cdrom,
 standard livecd setup

You have swap device disabled, right? If so, I can't see any reason
why msync(2) on tmpfs would return -EIO. Can you please send a strace
log for your test case?

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SLUB performance regression vs SLAB

2007-10-05 Thread Pekka Enberg
Hi,

On 10/5/07, Jens Axboe [EMAIL PROTECTED] wrote:
 I'd like to second Davids emails here, this is a serious problem. Having
 a reproducible test case lowers the barrier for getting the problem
 fixed by orders of magnitude. It's the difference between the problem
 getting fixed in a day or two and it potentially lingering for months,
 because email ping-pong takes forever and the test team has moved on to
 other tests, we'll let you know the results of test foo in 3 weeks time
 when we have a new slot on the box just removing any developer
 motivation to work on the issue.

What I don't understand is that why don't the people who _have_ access
to the test case fix the problem? Unlike slab, slub is not a pile of
crap that only Christoph can hack on...

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.

2007-07-13 Thread Pekka Enberg

On 7/13/07, Kalpak Shah [EMAIL PROTECTED] wrote:

 EXT4_DIR_LINK_MAX() is buggy: it evaluates its arg twice.

#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir)  (dir)-i_nlink = EXT4_LINK_MAX)


[snip]


Sorry, I didn't understand what is the problem with this macro?


The expression represented by 'dir' is evaluated twice (think dir++
here). It's safer to make it a static inline function.

  Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update description in Documentation/filesystems/vfs.txt

2007-06-25 Thread Pekka Enberg

Hi Borislav,

On 6/24/2007, Borislav Petkov [EMAIL PROTECTED] wrote:
 @@ -3,7 +3,7 @@
 
   Original author: Richard Gooch [EMAIL PROTECTED]
 
 -   Last updated on October 28, 2005
 +   Last updated on Juni 24, 2007.

There's a typo here so do s/Juni/June/g please before sending to Andrew.
Other than that, looks good to me.

Acked-by: Pekka Enberg [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] LogFS take three

2007-05-17 Thread Pekka Enberg

Jörn Engel wrote:

Compressing random data will actually enlarge it.  If that happens I
simply store the verbatim uncompressed data instead and mark it as such.

There is also demand for a user-controlled bit in the inode to disable
compression completely.  All those .jpg, .mpg, .mp3, etc. just waste
time by trying and failing to compress them.


So any sane way to enable compression is on per-inode basis which makes 
me still wonder why you need per-object compression.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] LogFS take three

2007-05-16 Thread Pekka Enberg

On 5/16/07, Jörn Engel [EMAIL PROTECTED] wrote:

  +/* FIXME: all this mess should get replaced by using the page cache */
  +static void fixup_from_wbuf(struct super_block *sb, struct logfs_area
 *area,
  + void *read, u64 ofs, size_t readlen)
  +{

 Indeed. And I think you're getting some more trouble because of this...

More trouble?


Forgot to add (see below). Seems logfs_segment_read would be simpler
too if you fixed this.


[ Objects are the units that get compressed.  Segments can contain both
compressed and uncompressed objects. ]

It is a trade-off.  Each object has a 24 Byte header plus X Bytes of
data.  Whether the data is compressed or not is indicated in the header.


Was my point really. Why do segments contain both compressed and
uncompressed objects?
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] LogFS take three

2007-05-16 Thread Pekka Enberg

On 5/16/07, Pekka Enberg [EMAIL PROTECTED] wrote:

Forgot to add (see below). Seems logfs_segment_read would be simpler
too if you fixed this.


Blah. Just to be clear: I forgot to add a (see below) text in the
original review comment.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] ext4: write support for preallocated blocks/extents

2007-05-07 Thread Pekka Enberg

On 4/26/07, Amit K. Arora [EMAIL PROTECTED] wrote:

 /*
+ * ext4_ext_try_to_merge:
+ * tries to merge the ex extent to the next extent in the tree.
+ * It always tries to merge towards right. If you want to merge towards
+ * left, pass ex - 1 as argument instead of ex.
+ * Returns 0 if the extents (ex and ex+1) were _not_ merged and returns
+ * 1 if they got merged.
+ */
+int ext4_ext_try_to_merge(struct inode *inode,
+   struct ext4_ext_path *path,
+   struct ext4_extent *ex)
+{


Please either use proper kerneldoc format or drop
ext4_ext_try_to_merge from the comment.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: REISER4 FOR INCLUSION IN THE LINUX KERNEL.

2007-04-09 Thread Pekka Enberg

On 4/9/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

YOU GUYS WILL LAUGH ABOUT THIS:


No, I am crying actually. Dear post-masters, can we have this thread
shit-canned, please?
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Pekka Enberg

On 4/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

I checked what bonnie++ actually writes to its test files, for you. It
is about 98-99% zeros.

Still, the results record sequential reads, of 232,729 K/sec, nearly
four times the physical disk read rate, 63,160 K/sec, of the hard drive.


Excellent! You've established the undeniable hard cold fact that
reiser4 beats the crap out of all other filesystems, when the files
are 98-99% filled with zeros. You've proven your point, so can we stop
this thread now?
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] revokeat/frevoke system calls V5

2007-02-25 Thread Pekka Enberg

Hi Alan,

On 2/26/07, Alan [EMAIL PROTECTED] wrote:

Whats the status on this, I was suprised to see something so important
just go dead ?


It's not dead. You can find the latest patches here:

http://www.cs.helsinki.fi/u/penberg/linux/revoke/patches/

and user-space tests here:

http://www.cs.helsinki.fi/u/penberg/linux/revoke/utils/

What they are lacking is review so I am not sure how to proceed with
the patches.

Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GFS, what's remaining

2005-09-05 Thread Pekka Enberg
On Thu, Sep 01, 2005 at 01:35:23PM +0200, Arjan van de Ven wrote:
  +void gfs2_glock_hold(struct gfs2_glock *gl)
  +{
  + glock_hold(gl);
  +}
 
  eh why?

On 9/5/05, David Teigland [EMAIL PROTECTED] wrote:
 You removed the comment stating exactly why, see below.  If that's not a
 accepted technique in the kernel, say so and I'll be happy to change it
 here and elsewhere.

Is there a reason why users of gfs2_glock_hold() cannot use
glock_hold() directly?

Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC][PATCH] VFS: update documentation (take #2)

2005-08-24 Thread Pekka Enberg
Hi,

This patch brings the now out-of-date Documentation/filesystems/vfs.txt back
to life. Thanks to Carsten Otte for the description on get_xip_page().

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 vfs.txt |  382 
 1 file changed, 291 insertions(+), 91 deletions(-)

Index: 2.6-mm/Documentation/filesystems/vfs.txt
===
--- 2.6-mm.orig/Documentation/filesystems/vfs.txt
+++ 2.6-mm/Documentation/filesystems/vfs.txt
@@ -1,10 +1,9 @@
-/* -*- auto-fill -*- */
 
Overview of the Virtual File System
 
-   Richard Gooch [EMAIL PROTECTED]
+   Original author: Richard Gooch [EMAIL PROTECTED]
 
- 5-JUL-1999
+ Last updated on August 24, 2005
 
 
 Conventions used in this document section
@@ -15,9 +14,6 @@ right-hand side of the section title. Ea
 subsection at the right-hand side. These strings are meant to make
 it easier to search through the document.
 
-NOTE that the master copy of this document is available online at:
-http://www.atnf.csiro.au/~rgooch/linux/docs/vfs.txt
-
 
 What is it?   section
 ===
@@ -26,7 +22,7 @@ The Virtual File System (otherwise known
 Switch) is the software layer in the kernel that provides the
 filesystem interface to userspace programs. It also provides an
 abstraction within the kernel which allows different filesystem
-implementations to co-exist.
+implementations to coexist.
 
 
 A Quick Look At How It Works  section
@@ -77,7 +73,7 @@ back to userspace.
 
 Opening a file requires another operation: allocation of a file
 structure (this is the kernel-side implementation of file
-descriptors). The freshly allocated file structure is initialised with
+descriptors). The freshly allocated file structure is initialized with
 a pointer to the dentry and a set of file operation member functions.
 These are taken from the inode data. The open() file method is then
 called so the specific filesystem implementation can do it's work. You
@@ -126,14 +122,18 @@ It's now time to look at things in more 
 struct file_system_type   section
 ===
 
-This describes the filesystem. As of kernel 2.1.99, the following
+This describes the filesystem. As of kernel 2.6.13, the following
 members are defined:
 
 struct file_system_type {
const char *name;
int fs_flags;
-   struct super_block *(*read_super) (struct super_block *, void *, int);
-   struct file_system_type * next;
+struct super_block *(*get_sb) (struct file_system_type *, int,
+   const char *, void *);
+void (*kill_sb) (struct super_block *);
+struct module *owner;
+struct file_system_type * next;
+struct list_head fs_supers;
 };
 
   name: the name of the filesystem type, such as ext2, iso9660,
@@ -141,51 +141,96 @@ struct file_system_type {
 
   fs_flags: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
 
-  read_super: the method to call when a new instance of this
+  get_sb: the method to call when a new instance of this
filesystem should be mounted
 
-  next: for internal VFS use: you should initialise this to NULL
+  kill_sb: the method to call when an instance of this filesystem
+   should be unmounted
+
+  owner: for internal VFS use: you should initialize this to NULL
 
-The read_super() method has the following arguments:
+  next: for internal VFS use: you should initialize this to NULL
+
+The get_sb() method has the following arguments:
 
   struct super_block *sb: the superblock structure. This is partially
-   initialised by the VFS and the rest must be initialised by the
-   read_super() method
+   initialized by the VFS and the rest must be initialized by the
+   get_sb() method
+
+  int flags: mount flags
+
+  const char *dev_name: the device name we are mounting.
 
   void *data: arbitrary mount options, usually comes as an ASCII
string
 
   int silent: whether or not to be silent on error
 
-The read_super() method must determine if the block device specified
+The get_sb() method must determine if the block device specified
 in the superblock contains a filesystem of the type the method
 supports. On success the method returns the superblock pointer, on
 failure it returns NULL.
 
 The most interesting member of the superblock structure that the
-read_super() method fills in is the s_op field. This is a pointer to
+get_sb() method fills in is the s_op field. This is a pointer to
 a struct super_operations which describes the next level of the
 filesystem implementation.
 
+Usually, a filesystem uses generic one of the generic

[RFC][PATCH] VFS: update documentation

2005-08-21 Thread Pekka Enberg
Hi!

This patch updates the out-of-date Documentation/filesystems/vfs.txt.
As I am a novice on the VFS, I would much appreciate any comments and
help on this.

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 vfs.txt |  314 +---
 1 files changed, 241 insertions(+), 73 deletions(-)

Index: 2.6-mm/Documentation/filesystems/vfs.txt
===
--- 2.6-mm.orig/Documentation/filesystems/vfs.txt
+++ 2.6-mm/Documentation/filesystems/vfs.txt
@@ -1,10 +1,9 @@
-/* -*- auto-fill -*- */
 
Overview of the Virtual File System
 
-   Richard Gooch [EMAIL PROTECTED]
+   Original author: Richard Gooch [EMAIL PROTECTED]
 
- 5-JUL-1999
+ Last updated on August 21, 2005
 
 
 Conventions used in this document section
@@ -15,9 +14,6 @@ right-hand side of the section title. Ea
 subsection at the right-hand side. These strings are meant to make
 it easier to search through the document.
 
-NOTE that the master copy of this document is available online at:
-http://www.atnf.csiro.au/~rgooch/linux/docs/vfs.txt
-
 
 What is it?   section
 ===
@@ -126,14 +122,18 @@ It's now time to look at things in more 
 struct file_system_type   section
 ===
 
-This describes the filesystem. As of kernel 2.1.99, the following
+This describes the filesystem. As of kernel 2.6.13, the following
 members are defined:
 
 struct file_system_type {
const char *name;
int fs_flags;
-   struct super_block *(*read_super) (struct super_block *, void *, int);
-   struct file_system_type * next;
+struct super_block *(*get_sb) (struct file_system_type *, int,
+   const char *, void *);
+void (*kill_sb) (struct super_block *);
+struct module *owner;
+struct file_system_type * next;
+struct list_head fs_supers;
 };
 
   name: the name of the filesystem type, such as ext2, iso9660,
@@ -141,51 +141,96 @@ struct file_system_type {
 
   fs_flags: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
 
-  read_super: the method to call when a new instance of this
+  get_sb: the method to call when a new instance of this
filesystem should be mounted
 
+  kill_sb: the method to call when an instance of this filesystem
+   should be unmounted
+
+  owner: for internal VFS use: you should initialise this to NULL
+
   next: for internal VFS use: you should initialise this to NULL
 
-The read_super() method has the following arguments:
+The get_sb() method has the following arguments:
 
   struct super_block *sb: the superblock structure. This is partially
initialised by the VFS and the rest must be initialised by the
-   read_super() method
+   get_sb() method
+
+  int flags: mount flags
+
+  const char *dev_name: the device name we are mounting.
 
   void *data: arbitrary mount options, usually comes as an ASCII
string
 
   int silent: whether or not to be silent on error
 
-The read_super() method must determine if the block device specified
+The get_sb() method must determine if the block device specified
 in the superblock contains a filesystem of the type the method
 supports. On success the method returns the superblock pointer, on
 failure it returns NULL.
 
 The most interesting member of the superblock structure that the
-read_super() method fills in is the s_op field. This is a pointer to
+get_sb() method fills in is the s_op field. This is a pointer to
 a struct super_operations which describes the next level of the
 filesystem implementation.
 
+Usually, a filesystem uses generic one of the generic get_sb()
+implementations and provides a fill_super() method instead. The
+generic methods are:
+
+  get_sb_bdev: mount a filesystem residing on a block device
+
+  get_sb_nodev: mount a filesystem that is not backed by a device
+
+  get_sb_single: mount a filesystem which shares the instance between
+   all mounts
+
+A fill_super() method implementation has the following arguments:
+
+  struct super_block *sb: the superblock structure. The method fill_super()
+   must initialise this properly.
+
+  void *data: arbitrary mount options, usually comes as an ASCII
+   string
+
+  int silent: whether or not to be silent on error
+
 
 struct super_operations   section
 ===
 
 This describes how the VFS can manipulate the superblock of your
-filesystem. As of kernel 2.1.99, the following members are defined:
+filesystem. As of kernel 2.6.13, the following members are defined:
 
 struct super_operations {
-   void (*read_inode) (struct inode

Re: share/private/slave a subtree - define vs enum

2005-07-10 Thread Pekka Enberg
Hi Roman,

At some point in time, I wrote:
  Roman, it is not as if I get to decide for the patch submitters. I
  comment on any issues _I_ have with the patch and the authors fix
  whatever they want (or what the maintainers ask for).

On Fri, 2005-07-08 at 21:59 +0200, Roman Zippel wrote:
 The point of a review is to comment on things that _need_ fixing. Less 
 experienced hackers take this a requirement for their drivers to be 
 included.

Hmm. So we disagree on that issue as well. I think the point of review
is to improve code and help others conform with the existing coding
style which is why I find it strange that you're suggesting me to limit
my comments to a subset you agree with.

Would you please be so kind to define your criteria for things that
need fixing so we could see if can reach some sort of an agreement on
this. My list is roughly as follows:

  - Erroneous use of kernel API
  - Bad coding style
  - Layering violations
  - Duplicate code
  - Hard to read code

Pekka

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: share/private/slave a subtree

2005-07-08 Thread Pekka Enberg
On Fri, 2005-07-08 at 15:34 +0200, Roman Zippel wrote:
 Are the advantages big enough to actively discourage defines? It's nice 
 that you do reviews, but please leave some room for personal preferences. 
 If the code is correct and perfectly readable, it doesn't matter whether 
 to use defines or enums. Unless you also intent to also debug and work 
 with that code, why don't leave the decision to the author?

I think the advantages are big enough. Also, in my experience, it is
usually not a conscious decision by the author. But if you and other
developers think my enum pushing is too much, I can tone it down :).

Pekka

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html