Re: [PATCH 05/14] fs: don't allow kernel reads and writes without iter ops

2020-10-09 Thread Alexander Viro
On Fri, Oct 09, 2020 at 06:29:13PM -0700, Linus Torvalds wrote:
> On Fri, Oct 9, 2020 at 6:19 PM Eric Biggers wrote:
> >
> > Okay, that makes more sense.  So the patchset from Matthew
> > https://lkml.kernel.org/linux-fsdevel/20201003025534.21045-1-wi...@infradead.org/T/#u
> > isn't what you had in mind.
> 
> No.
> 
> That first patch makes sense - it's just the "ppos can be NULL" patch.
> 
> But as mentioned, NULL isn't "shorthand for zero". It's just "pipes
> don't _have_ a pos, trying to pass in some explicit position is
> crazy".
> 
> So no, the other patches in that set are a bit odd, I think.
> 
> SOME of them look potentially fine - the bpfilter one seems to be
> valid, for example, because it's literally about reading/writing a
> pipe. And maybe the sysctl one is similarly sensible - I didn't check
> the context of that one.

FWIW, I hadn't pushed that branch out (or merged it into #for-next yet);
for one thing, uml part (mconsole) is simply broken, for another...
IMO ##5--8 are asking for kernel_pread() and if you look at binfmt_elf.c,
you'll see elf_read() being pretty much that.  acct.c, keys and usermode
parts are asking for kernel_pwrite() as well.

I've got stuck looking through the drivers/target stuff - it would've
been another kernel_pwrite() candidate, but it smells like its use of
filp_open() is really asking for trouble, starting with symlink attacks.
Not sure - I'm not familiar with the area, but...



Re: [PATCH] NFS: Stop sillyname renames and unmounts from racing

2007-11-06 Thread Alexander Viro
On Tue, Nov 06, 2007 at 10:24:50AM +0200, Benny Halevy wrote:

> It'd be very nice if the silly renamed inodes (with silly_count > 1) were 
> moved
> to a different list in the first pass, under the inode_lock, and then waited 
> on
> until silly_count <= 1 in a second pass only on the filtered list.  This will
> provide you with O(1).

It's absolutely pointless, starting with any kind of searching for inodes,
etc.  If you want fs shutdown _not_ to happen until async activity of
that kind is over, don't reinvent the sodding wheels, just tell VFS that
you are holding an active reference to superblock.  End of story.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] NFS: Stop sillyname renames and unmounts from racing

2007-11-05 Thread Alexander Viro
On Mon, Nov 05, 2007 at 09:06:36PM -0800, Andrew Morton wrote:
> > Any objections to exporting the inode_lock spin lock?
> > If so, how should modules _safely_ access the s_inode list?

> That's going to make hch unhappy.

That's going to make me just as unhappy, especially since it's pointless;
instead of the entire sorry mess we should just bump sb->s_active to pin
the superblock down (we know that it's active at that point, so it's just
an atomic_inc(); no games with locking, etc., are needed) and call
deactivate_super() on the way out.  And deactivate_super() is exported
already.
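
To make the suggestion concrete, here is a minimal userspace sketch of the
counting involved. It is not kernel code: toy_sb, toy_pin() and
toy_deactivate() are made-up names that only model what bumping
sb->s_active and calling deactivate_super() do to the active count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct super_block: only the active count matters here. */
struct toy_sb {
	atomic_int s_active;	/* active references; fs shuts down at 0 */
	bool shut_down;		/* set once the last active ref is dropped */
};

/* Caller already knows the sb is active, so a bare increment is enough. */
static void toy_pin(struct toy_sb *sb)
{
	atomic_fetch_add(&sb->s_active, 1);
}

/* Models deactivate_super(): whoever drops the last ref triggers shutdown. */
static void toy_deactivate(struct toy_sb *sb)
{
	int old = atomic_fetch_sub(&sb->s_active, 1);

	assert(old > 0);	/* dropping a ref nobody holds is a bug */
	if (old == 1)
		sb->shut_down = true;
}

/* Scenario: async work pins the sb, then umount drops its own ref first. */
static bool shutdown_waits_for_async_work(void)
{
	struct toy_sb sb = { .shut_down = false };
	bool early;

	atomic_init(&sb.s_active, 1);	/* reference held by the mount */

	toy_pin(&sb);			/* async sillyrename starts */
	toy_deactivate(&sb);		/* umount drops the mount's ref */
	early = sb.shut_down;		/* must still be false here */

	toy_deactivate(&sb);		/* async work finishes */
	return !early && sb.shut_down;
}
```

The point of the model is that shutdown happens on whichever drop is last,
so nothing needs to walk inode lists or take inode_lock.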


[PATCH] nfsroot uses bogus mountd version for NFSv2

2001-07-19 Thread Alexander Viro

nfsroot uses bogus protocol version when it asks portmapper on
server for mountd port. Fix is obvious:

--- linux/fs/nfs/nfsroot.c	Fri Feb 16 18:56:03 2001
+++ linux/fs/nfs/nfsroot.c.new	Thu Jul 19 23:55:09 2001
@@ -418,7 +418,7 @@
 			"as nfsd port\n", port);
 	}
 
-	if ((port = root_nfs_getport(NFS_MNT_PROGRAM, nfsd_ver, proto)) < 0) {
+	if ((port = root_nfs_getport(NFS_MNT_PROGRAM, mountd_ver, proto)) < 0) {
 		printk(KERN_ERR "Root-NFS: Unable to get mountd port "
 			"number from server, using default\n");
 		port = mountd_port;

Notice that for NFSv3 both nfsd and mountd are using version 3, so
nfsd_ver == mountd_ver. However, for NFSv2 we end up asking for mountd
version 2, which doesn't exist - mountd version for NFSv2 was 1.

Looks like this typo got into the tree in 2.3.99-4-pre3 when NFSv3 had
been merged into the tree - until then we had (correctly) asked for
version 1. Corresponding code in 2.2 is using mountd_ver, so it's also
OK.
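
The version pairing described above can be sketched as a tiny standalone
helper. mountd_version_for() is a hypothetical name used only for
illustration; it is not a function in nfsroot.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: which MOUNT protocol version goes with a given
 * NFS protocol version. NFSv3 pairs with mountd v3, NFSv2 with mountd v1;
 * there is no mountd v2.
 */
static int mountd_version_for(int nfs_version)
{
	assert(nfs_version == 2 || nfs_version == 3);
	return nfs_version == 3 ? 3 : 1;
}
```

Passing nfsd_ver where mountd_ver is wanted gives the right answer for
version 3 and the nonexistent version 2 otherwise, which is why the typo
only shows up on NFSv2 roots.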




Re: [Acpi] Re: ACPI fundamental locking problems

2001-07-08 Thread Alexander Viro



On Sat, 7 Jul 2001, Jamie Lokier wrote:

> Daniel Phillips wrote:
> > > Reading a tarball is the distillation of what you describe into
> > > efficient form :)
> > 
> > /me downloads tar file definition
> > 
> > Um, gnu tar or posix tar? or some new, improved tar?
> 
> I suggest cpio, which is more compact and in some ways more standard.
> (tar has a silly pad-to-multiple-of-512-byte per file rule, which is
> inappropriate for this).  GNU cpio creates cpio format just fine.

GNU cpio is a race-ridden unmaintained pile of junk. Look at the size
of, say, the Debian patch to the upstream source. Then try to read the
patched code.  Quite a few of us simply don't have that FPOS on their
boxen.

Using cpio archive layout is OK, but _please_, don't make it dependent
on GNU cpio.




Re: [Acpi] Re: ACPI fundamental locking problems

2001-07-07 Thread Alexander Viro



On 7 Jul 2001, Eugene Crosser wrote:

> Doesn't the approach "treat a chunk of data built into bzImage as
> populated ramfs" look cleaner?  No need to fiddle with tar format,
> no copying data from place to place.

What the hell _is_ "populated ramfs"? The thing doesn't live in array
of blocks. Its directory structure consists of a bunch of dentries.
Permissions/ownership/timestamps are in a bunch of struct inode -
sitting in icache and allocated in normal way. Regular files are
entirely in pagecache, ditto for symlinks.

Ramfs has no backing store. At all. That's precisely what remains of
filesystem if you take backing store away - everything is in VFS/VM caches.




Re: [Acpi] Re: ACPI fundamental locking problems

2001-07-05 Thread Alexander Viro



On Thu, 5 Jul 2001, Helge Hafting wrote:

> Linus Torvalds wrote:
> [...]
> > We might want to just make initrd a built-in thing in the kernel,
> > something that you simply cannot avoid. A lot of these things (ie dhcp for
> > NFS root etc) are right now done in kernel space, simply because we don't
> > want to depend on initrd, and people want to use old loaders.
> > 
> > I don't like the current initrd very much myself, I have to admit. I'm not
> > going to accept a "you have to have a ramdisk" approach - I think the
> > ramdisks are really broken.
> > 
> > But I've seen a "populate ramfs from a tar-file built into 'bzImage'"
> > patch somewhere, and that would be a whole lot more palatable to me.
> > 
> > If anybody were to send me a patch that just unconditionally does this, I
> > would probably not be adverse to putting it into 2.5.x. We have all the
> > infrastructure to make all this a lot cleaner than it used to be (ie the
> > "pivot_root()" stuff etc means that we can _truly_ do things from user
> > mode, with no magic kernel flags).

Open 2.5 and I'm starting to feed that stuff in pieces...

> I am fine with "You have to use initrd (or similar) _if_ you want this
> feature."

"Similar" == ramfs.

> But please don't make initrd mandatory for those of us who don't
> need ACPI, don't need dhcp before mounting disks and so on.

How about "don't want to keep special-case code for mounting root in your
kernel"? It's more than ramfs, BTW, and rm(1) on ramfs frees memory just
fine.




Re: ufs on linux question/problem

2001-07-03 Thread Alexander Viro



On Tue, 3 Jul 2001, Admin Mailing Lists wrote:

> 
> Trying to mount a solaris x86 drive under linux.
> kernel 2.4.5, ufs support and x86 partition support compiled in (no
> module)
> On boot, linux recognizes the drive, but shows no solaris partitions on
> it.
> Below, linux drive is hda, solaris is hdb.

You need support of Solaris disklabels. And UFS patches that are in
-ac. Then you can get more or less safe r/o mounts. r/w is hopeless
at that stage.




Re: Recent change in directory g+s behavior (bug?)

2001-07-03 Thread Alexander Viro



On Tue, 3 Jul 2001, Ken Brownfield wrote:

> Somewhere between 2.4.5-pre1 and 2.4.6-pre3, the behavior of the setgid
> bit on directories has changed:

Fsck... Linus, please apply the patch below. That's a bug in
ext2_new_inode() that used to be hidden by redundant code in ext2_mkdir().

Notice that current code in ext2_new_inode() makes no sense at all -
the only reason why gcc doesn't scream bloody murder is that we have (unrelated)
S_ISLNK(mode) several lines below.

--- fs/ext2/ialloc.c	Tue Jun  5 09:24:52 2001
+++ fs/ext2/ialloc.c.fix	Tue Jul  3 05:38:37 2001
@@ -417,7 +417,6 @@
 		cpu_to_le32(le32_to_cpu(es->s_free_inodes_count) - 1);
 	mark_buffer_dirty(sb->u.ext2_sb.s_sbh);
 	sb->s_dirt = 1;
-	inode->i_mode = mode;
 	inode->i_uid = current->fsuid;
 	if (test_opt (sb, GRPID))
 		inode->i_gid = dir->i_gid;
@@ -427,6 +426,7 @@
 			mode |= S_ISGID;
 	} else
 		inode->i_gid = current->fsgid;
+	inode->i_mode = mode;
 
 	inode->i_ino = j;
 	inode->i_blksize = PAGE_SIZE;	/* This is the optimal IO size (for stat), not the fs block size */




Re: A Possible 2.5 Idea, maybe?

2001-06-30 Thread Alexander Viro



On Sat, 30 Jun 2001, Philips wrote:

>   If I could choose what filesystem to run on / - it impact performance greatly

No, it doesn't. Most of lookups go outside of root and within root you
mostly deal with cached lookups from dcache (which doesn't give a damn for
fs type) and with page cache lookups for data (mostly in libc) (ditto).

[snip]

>   This would be one little step toward the microkernel architecture (like Hurd).
> Good again :-)

Hurd and architecture in one sentence? Uh-oh...




Re: VFS locking & HFS problems (2.4.6pre6)

2001-06-29 Thread Alexander Viro



On Fri, 29 Jun 2001, Benjamin Herrenschmidt wrote:

> The deadlock happens in the HFS filesystem in hfs_cat_put(), apparently
> (quickly looking at addresses) in spin_lock().


Uh-oh. Looks like hfs_cat_put() grabs some internal spinlock and calls
write_entry(). If it really is what its name implies, you are calling
a blocking function under the spinlock.

> So my question: Is there any document explaining the various locking
> requirements & re-entrancy possibilities in a filesystem?

There is, but this bug has nothing fs-specific in it. You should never
block while holding a spinlock.

BTW, looks like 2.2 has the same bug.




Re: directory order of files

2001-06-29 Thread Alexander Viro



On Fri, 29 Jun 2001, Alan Cox wrote:

> > With Linux ext2, and some other systems, when you create files in a
> > new directory, the file system remembers their order:
> 
> No - it merely seems to.
> 
> > $ touch one two three four
> > $ ls -U
> > one  two  three  four
> 
> Then try 'rm three; touch five'

Moreover, it isn't true even for the case when we create a list of files
in empty directory. Example: assuming that /tmp has 1Kb blocks,

mkdir /tmp/A
cd /tmp/A
touch `perl -e 'print "a"x255'`
touch `perl -e 'print "b"x255'`
touch `perl -e 'print "c"x255'`
touch `perl -e 'print "d"x255'`
touch A
ls -U

will give you (lots of a) (lots of b) (lots of c) A (lots of d).

With 4Kb blocks you'll need 16 long names instead of 4 - the effect
will be the same.

The reason is quite simple - at some point you get no space for long
name and it goes into the next directory block, but there's still enough
for a short name, so it gets created in the first block.

IOW, there's no warranties at all.
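
The arithmetic behind that example can be replayed with a rough model.
This is a sketch under stated assumptions, not the ext2 implementation:
an entry is taken to cost 8 bytes of header plus the name, rounded up to
a multiple of 4, "." and ".." are assumed to sit at the start of the first
block, and a new name goes into the first block with enough room.
add_entry() and replay_example() are names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 8

/* Size of one directory entry: 8-byte header + name, rounded up to 4. */
static size_t rec_len(size_t name_len)
{
	return (8 + name_len + 3) & ~(size_t)3;
}

/*
 * Simplified first-fit: used[] tracks bytes taken in each directory
 * block; the entry lands in the first block that still has room.
 * Returns the block index chosen, or -1 if the toy directory is full.
 */
static int add_entry(size_t used[MAX_BLOCKS], size_t block_size,
		     size_t name_len)
{
	for (int i = 0; i < MAX_BLOCKS; i++) {
		if (used[i] + rec_len(name_len) <= block_size) {
			used[i] += rec_len(name_len);
			return i;
		}
	}
	return -1;
}

/* Replays the example: four 255-byte names, then "A". */
static void replay_example(size_t block_size, int blocks_out[5])
{
	size_t used[MAX_BLOCKS] = { 0 };

	used[0] = rec_len(1) + rec_len(2);	/* "." and ".." */
	for (int i = 0; i < 4; i++)
		blocks_out[i] = add_entry(used, block_size, 255);
	blocks_out[4] = add_entry(used, block_size, 1);	/* short name "A" */
}
```

With 1024-byte blocks the first three long names and "A" end up in block 0
and the fourth long name in block 1; since the directory is read block by
block, "A" is listed before the d name even though it was created last.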




Re: [PATCH] Bug in 2.4.5 in proc_pid_make_inode ()

2001-06-28 Thread Alexander Viro



On Thu, 28 Jun 2001, Martin Wilck wrote:

> Hi,
> 
> I have recently experienced a number of kernel OOPSes
> in "top" under heavy load. Kernel is 2.4.5 (IA64, but
> this has nothing to do with the IA64 patch).
> 
> The OOPS happens in the call tree
> 
> open () system call
> [...]
> real_lookup ()
> proc_base_lookup ()
> proc_pid_make_inode ()
> iput ()
> proc_delete_inode () -> OOPS in __MOD_DEC_USE_COUNT

Known, had been already fixed in 2.4.6-pre3.




Re: Maximum mountpoints + chrooted login

2001-06-27 Thread Alexander Viro



On Wed, 27 Jun 2001, Magnus Naeslund(f) wrote:

> I'll wait for 2.5 then...
> Where's that namespace patch located?

The last one I've put on anonftp was against 2.4.6-pre2 (namespaces-a-S6-pre2,
on ftp.math.psu.edu/pub/viro). It still includes tons of fs/super.c cleanups
and fixes - they still need to be merged into the tree.

> Now in 2.4.5 it's darn slow to _unmount_, it's like 100 times faster to
> mount than unmount :)

Erm... The last umount should sync everything on given fs. You don't
read a hundred megabytes upon mount but you can easily get such amount
of dirty data after working for a while ;-)




Re: mounting a fs in two places at once?

2001-06-27 Thread Alexander Viro



On Thu, 28 Jun 2001, Chris Wedgwood wrote:

> On Mon, Jun 25, 2001 at 02:20:16AM -0700, Ben Ford wrote:
> 
> > Feature.  It actually makes it quite nice when you want to allow
> > chrooted user(s) access to a common directory, you just mount a
> > partition in all the users home dirs.
> 
> For security, this can be a bad idea.
> 
> Potentially, chrooted user can mess with another, by messing with
> libraries and such like. In most cases not terribly easy, but in some
> cases possible.

If chrooted user had gained root - he can do much more damage than that.
If your libraries are world-writable - you had asked for that, hadn't
you?

> No, if the fs was mounted RO, then I assume you would have less to
> worry about. It's a pity the VFS code doesn't allow you to fix RO & RW
> of the same fs.

2.5 stuff. Requires extra argument on getattr/setattr/permission -
prototype change on 3 methods for something that is a feature and not a
fix for any specific bug...

If you want root-proof analog of chroot - fine, but that will require
at least taking away the ability to mount/umount anything. Otherwise
attacker will simply be able to remount everything he want r/w once he
had gained root. That can be done (e.g. by adding "can modify" flag
to namespace and doing something along the lines

	pid = clone(CLONE_NAMESPACE, NULL);
	if (!pid) {
		/* do all needed mount/umount work */
		pid = clone(CLONE_FREEZE_NAMESPACE, NULL);
		if (!pid) {
			/* we are set */
		}
		exit(0);
	}

which would give grandchild a namespace we want it to see and prohibit
any changes in said namespace, root or not)




Re: Maximum mountpoints + chrooted login

2001-06-27 Thread Alexander Viro



On Wed, 27 Jun 2001, Magnus Naeslund(f) wrote:

> I was thinking of doing a chrooted login for some ssh accounts.
> The plan is this:

[snip CLONE_NAMESPACE-by-hands]
 
> Does this seem like a bad idea?
> (then please tell me why :))

Mostly because there's a better way to do that. Yes, such scheme would
work (that + massive pending fs/super.c cleanups was the main reason why
I didn't go for proper solution in 2.4.0-test*). However, instead of
crufting up kinda-sorta namespaces one could use the real thing. Relevant
cleanups of superblock handling will go in in 2.5.very_early and the
rest of patch (namespace proper) takes about 10Kb.

You can simply say clone(CLONE_NAMESPACE,NULL) and you get an independent
set of mounts to play with. mount/umount whatever you want before dropping
the root privileges. All children of that process will share its namespace.
When the last one goes away everything will be garbage-collected - no
need to umount anything on logout.

> One problem could be the _massive_ mounts, 3*online_users.
> Are there any limits/drawbacks doing it like this?

With the mntcache in - not really. It fixes the main performance problem.
Memory cost is sizeof(struct vfsmount)*total amount of mountpoints. I.e.
about 100 bytes per mountpoint. That's it.




Re: [PATCH] User chroot

2001-06-27 Thread Alexander Viro



On Wed, 27 Jun 2001, Chris Wedgwood wrote:

> On Tue, Jun 26, 2001 at 09:40:36PM -0400, Alexander Viro wrote:
> 
> > You need /dev/zero to get anywhere near the normal behaviour of the
> > system.
> 
> Not commenting on the original patch, I think requiring /dev/zero for
> a 'usable' system should be considered a [g]libc bug. /dev/zero should
> be present, but if not, [g]libc should have fall-back mechanisms to
> deal with things.

Frankly, glibc already has too many fall-back mechanisms of various kinds.
Several things Should Be There(tm). /dev/zero, /dev/null and /dev/tty are
definitely among them.



Re: Maximum mountpoints + chrooted login

2001-06-27 Thread Alexander Viro



On Wed, 27 Jun 2001, Magnus Naeslund(f) wrote:

> I'll wait for 2.5 then...
> Where's that namespace patch located?

The last one I've put on anonftp was against 2.4.6-pre2 (namespaces-a-S6-pre2,
on ftp.math.psu.edu/pub/viro). It still includes tons of fs/super.c cleanups
and fixes - they still need to be merged into the tree.

> Now in 2.4.5 it's darn slow to _unmount_, it's like 100 times faster to
> mount than unmount :)

Erm... The last umount should sync everything on given fs. You don't
read a hundred megabytes upon mount but you can easily get such amount
of dirty data after working for a while ;-)




Re: [PATCH] User chroot

2001-06-26 Thread Alexander Viro



On Tue, 26 Jun 2001, Paul Menage wrote:

> But only root can set this up, since you currently have to be root in
> order to chroot(). The (only) advantage of the user chroot() patch would
> be that users would be able to do the same thing without root
> intervention.

You need to be root to do mknod. You need to do mknod to create /dev/zero.
You need /dev/zero to get anywhere near the normal behaviour of the system.




[RFC] Checks in ext2_new_block()

2001-06-26 Thread Alexander Viro

Ted, could you comment on sanity checks in ext2_new_block()?
a)
if (tmp == le32_to_cpu(gdp->bg_block_bitmap) ||
tmp == le32_to_cpu(gdp->bg_inode_bitmap) ||
in_range (tmp, le32_to_cpu(gdp->bg_inode_table),
  sb->u.ext2_sb.s_itb_per_group))
ext2_error (sb, "ext2_new_block",
"Allocating block in system zone - "
"block = %u", tmp);

will go ahead and return the block. Looks like we can do better than that
if we mark it in use (we do that anyway), decrement relevant free blocks
counters (global and cylinder group one) and goto repeat;

b) we don't do similar checks for blocks we grab in preallocation loop.
And ext2_alloc_block() doesn't do such checks either.

c)
if (ext2_set_bit (j, bh->b_data)) {
ext2_warning (sb, "ext2_new_block",
  "bit already set for block %d", j);
DQUOT_FREE_BLOCK(sb, inode, 1);
goto repeat;
}
is of the "if memory got corrupted during the last dozens of cycles" variety -
we had seen that bit 0 several lines before and we couldn't even block during
that interval (not that it mattered much, since all modifications of these
bitmaps are under lock_super() anyway).

d)
if (j >= le32_to_cpu(es->s_blocks_count)) {
ext2_error (sb, "ext2_new_block",
"block(%d) >= blocks count(%d) - "
"block_group = %d, es == %p ",j,
le32_to_cpu(es->s_blocks_count), i, es);
goto out;
}
is a bit too late _and_ we don't do anything similar for preallocated blocks.

The question being: which of these checks deserve to stay ((c) doesn't, IMO)
and which deserve to be extended to preallocation? If we do them for
main path, we ought to be at least consistent...



RE: The Joy of Forking

2001-06-24 Thread Alexander Viro



On Sun, 24 Jun 2001, George Bonser wrote:

> > no SMP
> > x86 only (and similar, e.g. Crusoe)
> 
> Never 

YHBT. YHL.



Re: mounting a fs in two places at once?

2001-06-24 Thread Alexander Viro



On Sun, 24 Jun 2001, Marty Leisner wrote:

> I just installed redhat 7.1 on a system.
> 
> Cleaning up, I made a fs for home...(mounted on /mnt
> to write the stuff to it)
> 
> Then I accidentally mounted it on /home.
> 
> So it was mounted on /home and /mnt at the same time.
> (I didn't bother going in to see what was there).

Same tree, obviously.

> Shouldn't this NOT happen?

Sigh... Guys, who maintains l-k FAQ?

Q: I've mounted filesystem in two different places and it worked. Why?
A: Because you've asked to do that. Yes, it works. No, it's not a bug.

Q: what should I do to unmount it?
A: umount <mountpoint>

Q: but that took care only of one of them. How can I deal with another?
A: umount <another_mountpoint>




Re: [OT]Re: One more ZDNet article with BillG hammering Linux and Open Source.

2001-06-23 Thread Alexander Viro



On 22 Jun 2001, Miles Lane wrote:

> It would be great to see the "Shared Source" licenses that Microsoft has 
> made people sign.  It would be especially interesting to compare the

It would be great to see you learning WTF "offtopic" means and taking the
advocacy crap to the places where it belongs.



Re: 2.4.5-ac16 -- "proc_get_inode" still unresolved in /net/wan/comx.o

2001-06-22 Thread Alexander Viro



On Tue, 19 Jun 2001, Miles Lane wrote:

> 
> depmod: *** Unresolved symbols in /lib/modules/2.4.5-ac16/kernel/drivers/net/wan/comx.o
> depmod:   proc_get_inode

And it won't be exported. Moreover, it has a very good chance to become
static.

If you have the hardware in question and are willing to help with
testing I would be rather grateful. I'm rewriting filesystem side of
the driver (along with fixing rmmod races, etc.) and testers will be
needed somewhere in the middle of next week.




Re: What happened to lookup_dentry?

2001-06-22 Thread Alexander Viro



On Tue, 19 Jun 2001, Timur Tabi wrote:

> Well, I didn't write the driver that I'm trying to port, so it's a little
> difficult.  The code in question is:
> 
> struct dentry *   de = lookup_dentry(zfcdb[i].fullname, NULL, LOOKUP_FOLLOW);
> if (IS_ERR(de))
>   continue;
> if (de != zfcdb[i].dentry) 
> {
>   print("zfc: dentry changed for %s\n", zfcdb[i].fullname);
>   zfc_file_init(&zfcdb[i], de);
> }
> 
> So it appears it's just checking to see if the dentry for a particular file has
> changed.

Apparently, more than that. You'll need at least vfsmount in addition to
dentry. Could you send me the source? In principle, situation looks like
you need path_init() and path_walk(), but you almost definitely will need
to make changes in more places than that.

It should be easy to fix, but it's easier to mark the places that need
fixing in the source than try to describe how to find them ;-)



Re: Alan Cox quote? (was: Re: accounting for threads)

2001-06-21 Thread Alexander Viro



On Thu, 21 Jun 2001, Alexander Viro wrote:

> 
> 
> On Thu, 21 Jun 2001, Rusty Russell wrote:
> 
> > Disagree.  A significant percentage of the netfilter bugs have been
> > SMP only (the whole thing is non-reentrant on UP).
> 
> I really doubt it. <looking through the thing> <raised brows>
> Well, if you use GFP_ATOMIC for everything... grep...
> Erm... AFAICS, you call create_chain() with interrupts disabled
> (under write_lock_irq_save()). Unless I'm _very_ mistaken,
> kmalloc(..., GFP_KERNEL) is a Bad Thing(tm) in that situation.
> And create_chain() leads to it.

BTW, proc_net_create() is also not a good idea if you block the interrupts.
Ditto for netlink_kernel_create(), AFAICS (due to netlink_kernel_create() ->
sock_alloc() -> get_empty_inode() -> kmem_cache_alloc() with SLAB_KERNEL).

That, BTW, is a nice illustration - it's easy to get a preemption point
without noticing, so holding spinlocks, let alone disabling interrupts
over the large area is going to hurt like hell.
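
The usual way out is to do everything that can sleep before taking the lock,
so the locked region is just the list update. A userspace analogy in plain C
follows (C11 atomic_flag standing in for the spinlock, malloc() for the
blocking allocation); create_chain() here is a hypothetical stand-in, not the
netfilter function.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct chain {
    char name[32];
    struct chain *next;
};

static atomic_flag chain_lock = ATOMIC_FLAG_INIT;
static struct chain *chains;

static void lock_chains(void)
{
    while (atomic_flag_test_and_set_explicit(&chain_lock,
                                             memory_order_acquire))
        ;   /* spin */
}

static void unlock_chains(void)
{
    atomic_flag_clear_explicit(&chain_lock, memory_order_release);
}

/* Hypothetical stand-in for create_chain(). */
int create_chain(const char *name)
{
    struct chain *c;

    assert(name != NULL);

    /* Anything that may block (here: the allocation) happens first,
     * while no lock is held. */
    c = malloc(sizeof(*c));
    if (!c)
        return -ENOMEM;
    snprintf(c->name, sizeof(c->name), "%s", name);

    /* The critical section is only the pointer update. */
    lock_chains();
    c->next = chains;
    chains = c;
    unlock_chains();
    return 0;
}

/* Number of chains on the list; takes the same lock. */
int count_chains(void)
{
    int n = 0;

    lock_chains();
    for (struct chain *c = chains; c; c = c->next)
        n++;
    unlock_chains();
    return n;
}
```

In kernel terms this is the difference between calling kmalloc(..., GFP_KERNEL)
before taking the write lock and calling it inside the locked region.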




Re: Alan Cox quote? (was: Re: accounting for threads)

2001-06-21 Thread Alexander Viro



On Thu, 21 Jun 2001, Rusty Russell wrote:

> Disagree.  A significant percentage of the netfilter bugs have been
> SMP only (the whole thing is non-reentrant on UP).

I really doubt it. <looking through the thing> <raised brows>
Well, if you use GFP_ATOMIC for everything... grep...
Erm... AFAICS, you call create_chain() with interrupts disabled
(under write_lock_irq_save()). Unless I'm _very_ mistaken,
kmalloc(..., GFP_KERNEL) is a Bad Thing(tm) in that situation.
And create_chain() leads to it.




Re: rename problem on vfat file systems

2001-06-21 Thread Alexander Viro



On Thu, 21 Jun 2001, abc abc wrote:

> If I reboot the machine just after the rename() call
> is completed, when the machine comes up the file
> /mnt/sns-c/segments/segfile has zero bytes and there
> is no file in the tmp directory. Effectively the file
> is lost somewhere. Running fsck recovers the file,
> but it doesn't help me much because I would be copying
> hundreds of files and it's difficult to match the
> files.
> 
> Can you think of anything that might be causing this?

Crappy filesystem layout. If you want to do something a-la journalling
for VFAT - seek professional help.



Re: Controversy over dynamic linking -- how to end the panic

2001-06-21 Thread Alexander Viro



On Thu, 21 Jun 2001, Timur Tabi wrote:

> In my opinion, this whole thing would just go away (including some of
> Microsoft's anti-GPL rants), if the FSF officially declared that under the GPL,
> #including a GPL header file does NOT force your code to be also GPL.

The problem being, there is no such thing as header file from C point of view.
I can do

cat >my_file.c <<EOF
#include "/home/you/your_file.c"
EOF

and be done with that. And yes, it's abusing cpp.



Re: [PATCH] remove null register_disk

2001-06-20 Thread Alexander Viro



On Wed, 20 Jun 2001 [EMAIL PROTECTED] wrote:

> In fs/partitions/check.c we read
> 
> void register_disk(struct gendisk *gdev, kdev_t dev, unsigned minors,
> struct block_device_operations *ops, long size)
> {
> if (!gdev)
> return;
> grok_partitions(gdev, MINOR(dev)>>gdev->minor_shift, minors, size);
> }
> 
> showing that register_disk is void when its first argument is NULL.
> This allows one to remove some dead code.
> Can be applied to 2.4. No behaviour is changed.

That's simply wrong. We will need register_disk(). Reinserting it into the
right places in 2.5 is an unnecessary PITA.




Re: Alan Cox quote? (was: Re: accounting for threads)

2001-06-20 Thread Alexander Viro



On Wed, 20 Jun 2001, george anzinger wrote:

> > around we _will_ get problems. Kernel UP programming is not different
> > from SMP one. It is multithreaded. And amount of genuine SMP bugs is
> > very small compared to ones that had been there on UP since way back.
> > And yes, programming threads is the same thing. No arguments here.
> > 
> Correct, IF the UP kernel is preemptable.  As long as it is not (and SMP
> is ignored) threads are harder BECAUSE they are preemptable.

In practice it's BS. There are a lot of ways minor modifications of code
could add a preemption point, so if you rely on the lack of such - expect
major PITA.

Yes, in theory SMP adds some extra fun. Practically, almost every "SMP"
race found so far did not require SMP.

Clean code is trivial to make SMP-safe - critical areas that rely on
lack of preemption are a couple of instructions wide and are easy to
protect. Anything trickier and I bet that you have a race on (normal)
UP kernel. Been there, found probably several hundreds of them.




Re: Threads are processes that share more

2001-06-20 Thread Alexander Viro



On Wed, 20 Jun 2001, bert hubert wrote:

> Rounding up, it may be worth repeating what I think Alan said some months
> ago:
> 
> Threads are processes that share more

... and for absolute majority of programmers additional shared objects mean
additional fsckup sources.  I don't trust them to write correct async code.
OK, so I don't trust the majority of programmers to find their dicks if
you take their Visual Masturbation Aid++ away, but that's another story -
I'm talking about otherwise clued people, not burger-flippers armed with
Foo For Complete Dummies in 24 Hours.

> And if we just keep bearing that out to everybody a lot of the myths will go
> away. I would suggest that the pthreads manpages get this attitude.



Re: 2.4.5 corruption (again)

2001-06-19 Thread Alexander Viro



On Tue, 19 Jun 2001, Larry McVoy wrote:

> OK, my corruption is back and this time I'm saving the data.  Al, send some 
> email when you are around, we can talk about access to the data.  I'm tarring

Doing that.

> up both good & bad right now.  I've looked at a few files and they look
> "shifted".
> 
>   extra junk
>   original file less sizeof(extra junk) bytes
> 
> The machine has been up 6 days since the last corruption happened and the
> process which detected the corruption ran successfully every night as well
> as about 4 times by hand after my last corruption report.  

Lovely. Are these files longer than 4Kb, BTW?



Re: What happened to lookup_dentry?

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, Timur Tabi wrote:

> I'm porting a driver from 2.2 to 2.4, and this driver calls lookup_dentry,
> which doesn't exist in 2.4.  I've read through the source code and searched the
> web and newsgroups, and I can't find any explanation as to why lookup_dentry no
> longer exists or how I'm supposed to change code that uses it.  Can anyone help
> me?

It depends on what kind of use 2.2 code had for it. There are several
situations in which it used to be called and proper replacements depend
on the context. Details, please... (alternatively, send a URL of the patch
and I'll see what to do with the thing)




Re: [PATCH] devfs v181 available

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, Richard Gooch wrote:

> > Irrelevant. BKL provides an exclusion only on non-blocking areas.
> 
> Yeah, I know all that.

So what the hell are you talking about?

> > _Moved_ them there from the callers of these functions. And AFAICS
> > you do need BKL for get_devfs_entry_...(); otherwise relocation of
> > the table will be able to screw you inside of that function. Now, it
> > will merrily screw you anyway in a lot of places, but that's another
> > story.
> 
> OK, so it was another global change.

Moving BKL into the ->readlink() and ->follow_link()? Sure, it was a global
change. About a year ago.

> Question: assuming data fed to vfs_follow_link() is "safe", does it
> need the BKL? I can see that vfs_readlink() obviously doesn't need
> it. From reading Documentation/filesystems/Locking I suspect it
> doesn't need the BKL, but the way I read it says "follow_link() method
> does not *have* the BKL already". But that doesn't explicitly say
> whether vfs_follow_link() needs it.

vfs_follow_link() doesn't need it. Moreover, if data fed to it is unsafe
without BKL, you are screwed even if you take BKL. So assumption above
is bogus - you _never_ need BKL on that call.




Re: Newbie idiotic questions.

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, Roman Zippel wrote:

> > I wouldn't call it "rather popular".
> 
> You should also grep for '__typeof__'. :-)

Yeeeccchhh. OK, there is more of that. However, the main user of that
beast is, AFAICS, get_user()/put_user() and their ilk in include/asm-*
The rest looks very bogus...




Re: function of getname() function

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, SATHISH.J wrote:

> Hi,
> 
> Sorry if this question is too silly.
> 
> I could not understand what getname(filename) function in the sys_open()
> function is doing. I could not understand from the code what exactly it is
> doing. Please help me with the same.

It allocates a buffer and copies file name from user memory to that buffer.
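
A rough userspace analogy of that description, not the kernel code itself: the sketch below allocates a fixed-size buffer and copies a NUL-terminated name into it, refusing empty or overlong names. copy_name() and NAME_BUF_SIZE are names made up for this illustration, and the errno choices are assumptions of the sketch. The real getname() copies from user memory with the kernel's user-access helpers, which a plain memcpy() cannot stand in for.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUF_SIZE 4096	/* hypothetical stand-in for the kernel's PATH_MAX */

/*
 * Allocate a buffer and copy the name into it.
 * Returns the new buffer, or NULL with errno set if the name is
 * empty, does not fit in the buffer, or the allocation fails.
 */
static char *copy_name(const char *src)
{
	size_t len = 0;
	char *buf;

	/* Bounded length scan: never look past NAME_BUF_SIZE bytes. */
	while (len < NAME_BUF_SIZE && src[len] != '\0')
		len++;

	if (len == 0) {
		errno = ENOENT;
		return NULL;
	}
	if (len == NAME_BUF_SIZE) {
		errno = ENAMETOOLONG;
		return NULL;
	}

	buf = malloc(NAME_BUF_SIZE);
	if (buf == NULL) {
		errno = ENOMEM;
		return NULL;
	}
	memcpy(buf, src, len + 1);	/* include the terminating NUL */
	return buf;
}
```

The caller owns the returned buffer and releases it with free(), much as kernel code pairs getname() with putname().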




Re: [PATCH] devfs v181 available

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, Richard Gooch wrote:

> Alexander Viro writes:
> > 
> > 
> > On Mon, 18 Jun 2001, Richard Gooch wrote:
> > 
> > > - Widened locking in <devfs_readlink> and <devfs_follow_link>
> > 
> > No, you hadn't. Both vfs_readlink() and vfs_follow_link() are blocking
> > functions, so BKL is worthless there.
> 
> Huh? The BKL will protect against other operations which might cause
> the devfs entry to be unregistered, where those other operations also
> grab the BKL. So, it's an improvement.

BKL is released as soon as you block. You _do_ regain it when you get
the next timeslice, but in the meanwhile anything could happen.

> Sure, some operations may cause unregistration without grabbing the

Irrelevant. BKL provides an exclusion only on non-blocking areas.

> BKL, but that's orthogonal (and requires more extensive changes). If
> this "widening" is of no use, then what use are the existing grabs of
> the BKL in those functions? You're the one who added them in the first
> place.

_Moved_ them there from the callers of these functions. And AFAICS you
do need BKL for get_devfs_entry_...(); otherwise relocation of the
table will be able to screw you inside of that function. Now, it will
merrily screw you anyway in a lot of places, but that's another story.

BTW, free advice: when you are checking some condition, treat the result
as something that can expire. And don't rely on it past the moment when
it might have expired. E.g. in the case of de->registered, the result expires
as soon as you do unlock_kernel() _or_ do anything that might block.
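
A userspace analogy of that advice, offered as a sketch only: a pthread mutex stands in for the BKL here (unlike a mutex, the BKL is also dropped whenever its holder blocks, so the window in kernel code is wider than this shows). struct entry and read_if_registered() are hypothetical names. The point is that the check and the use of its result happen inside one uninterrupted hold of the lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical entry whose 'registered' flag may be cleared by others. */
struct entry {
	pthread_mutex_t lock;
	bool registered;
	int value;
};

/*
 * Read e->value only while the check of e->registered is still valid.
 * Returns 0 and stores the value in *out, or -1 if the entry is gone.
 */
static int read_if_registered(struct entry *e, int *out)
{
	int ret = -1;

	pthread_mutex_lock(&e->lock);
	if (e->registered) {
		/* Still under the same lock hold: the check has not expired. */
		*out = e->value;
		ret = 0;
	}
	pthread_mutex_unlock(&e->lock);

	/* Past this point the result of the check must not be relied upon. */
	return ret;
}
```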




Re: [PATCH] devfs v181 available

2001-06-18 Thread Alexander Viro



On Mon, 18 Jun 2001, Richard Gooch wrote:

> - Widened locking in <devfs_readlink> and <devfs_follow_link>

No, you hadn't. Both vfs_readlink() and vfs_follow_link() are blocking
functions, so BKL is worthless there.




Re: Newbie idiotic questions.

2001-06-17 Thread Alexander Viro



On Sun, 17 Jun 2001, Daniel Phillips wrote:

> typeof?  It's rather popular in the kernel already.  Besides, who is going to 

Really? 5 instances in PPC arch-specific code, 1 (absolutely gratuitous)
in drivers/mtd, 2 - in m68k (also useless), 4 - in drivers/video, 2 -
in AFFS and 1 - in netfilter.

I wouldn't call it "rather popular".

> compile this with anything other than gcc?

> 
> I don't see your point about greppability.

You are making the types it is applied to harder to deal with wrt. global
search.

But the real issue here is that the preprocessor is not a way to get
polymorphism. And that would be the only context where typeof might
have any use. Trying to turn C into something it isn't is always a bad
idea - that has been proven many times, starting at least with the Bourne
shell (check the v7 sh source if you don't know what I'm referring to).




Re: Newbie idiotic questions.

2001-06-17 Thread Alexander Viro



On Sun, 17 Jun 2001, Daniel Phillips wrote:

> > macro that behaves like `new' in C++:
> > | #define knew(type, flags) (type *)kmalloc(sizeof(type), (flags))
> >
> > If the types in the assignment don't match, gcc will tell you.
> 
> Well, since we are still beating this one to death, I'd written a "knew" 
> macro as well, and put it aside.  It does the assignment for you too:
> 
>#define knew(p) ((p) = (typeof(p)) kmalloc(sizeof(*(p)), GFP_KERNEL))
 
> Terse and clear at the same time, and type safe.  I still don't like it much. 

And ungreppable, not to mention gratuitous use of a GNU extension.




Re: Reg:magic number of the filesystem

2001-06-17 Thread Alexander Viro



On Sun, 17 Jun 2001, SATHISH.J wrote:

> Hi,
> 
> Every file system has a magic number. Can you please tell me what for this
> magic number is used. When do we really use this unique magic number of
> the file system and why?

find . -name *.[chS] >/tmp/list
xargs </tmp/list grep -nw s_magic
xargs </tmp/list grep -nw statfs
man 2 statfs
man 2 fstatfs
man 2 ustat
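
For the userspace side that those man pages cover, a minimal Linux-only sketch: statfs(2) reports the filesystem type in f_type, which is where a filesystem's magic number becomes visible to programs. fs_magic() is a hypothetical helper name, not an existing API.

```c
#include <sys/vfs.h>	/* Linux-specific declaration of statfs(2) */

/*
 * Store the filesystem type (magic number) of the filesystem that
 * contains 'path' in *magic. Returns 0 on success, -1 on failure
 * (errno is left as set by statfs()).
 */
static int fs_magic(const char *path, unsigned long *magic)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;

	*magic = (unsigned long)st.f_type;
	return 0;
}
```

A caller can compare the result against the constants listed in the statfs(2) man page to tell, for example, an ext2 mount from a procfs one.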



Re: Reg:use of file_system_type structure

2001-06-17 Thread Alexander Viro



On Sun, 17 Jun 2001, SATHISH.J wrote:

> Hi,
> Every file system has file_system_type structure defined. Where else this
> structure is referred. Does register_filesystem() refer this structure.
> Does sys_mount refer to this structure by any means?

Umm... No offense, but
* all of these questions take a couple of minutes to answer.
* if you know how to use grep you should be able to find the
answer faster than anybody could reply
* if you know C the last two questions are a non-issue (everyone who
doubts that is welcome to read the register_filesystem() source and see
what arguments its callers pass to it)
* it looks suspiciously similar to, pardon me, an attempt to cheat on
a quiz.




Re: [ANNOUNCE] HotPlug CPU patch against 2.4.5

2001-06-16 Thread Alexander Viro



On Sun, 17 Jun 2001, Rusty Russell wrote:

> In message <[EMAIL PROTECTED]> you write:
> > In article <m15BG8K-001UIwC@mozart> you wrote:
> > >   # Up...
> > >   echo 1 > /proc/sys/cpu/1
> > 
> > Wouldn't /proc/sys/cpu/<num>/enable be better?  This way other per-cpu
> > sysctls could be added more easily...
> 
> Yep.  But rewrite the sysctl crap first to make dynamically adding and
> deleting entries sane.

I had, actually. 2.5 stuff, but as soon as fs/super.c merge gets into the
sane area I'll see what can be safely merged into 2.4. Sorry - it touches
quite a few places and running two splitups in parallel... <shudder> As
soon as this fscking roll of barbwire^W^W^Wset of locking changes gets
untangled...




Re: Kernel 2.0.35 limits

2001-06-15 Thread Alexander Viro



On Fri, 15 Jun 2001, Paul Faure wrote:

> Just this morning, our firewall got a kernel panic after 500 days of
> uptime.
> 
> As you can see from the log files, the date starts at June 15th, where we
> get two div by zeros, then jumps May 11th, then a kernel panic. A reboot
> brings it back to June 15th. Since cron could not open /dev/rtc. My first
> thought was an internal kernel limit on the time, but 500 days seems a bit
> short.
> 
> Any ideas ?

(1<<32) / (24 * 60 * 60 * 100) == 497

IOW, 2^32 timer interrupts since the boot.
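
The same arithmetic as a small C sketch, assuming a 32-bit tick counter and HZ = 100 as in the calculation above. days_until_wrap32() is a made-up helper name. The computation is done in 64 bits because the expression 1 << 32 does not fit in a 32-bit int.

```c
#include <stdint.h>

/*
 * Whole days until an unsigned 32-bit tick counter wraps around,
 * given hz timer interrupts per second. hz must be non-zero.
 */
static uint64_t days_until_wrap32(uint64_t hz)
{
	const uint64_t ticks_to_wrap = UINT64_C(1) << 32;
	const uint64_t ticks_per_day = UINT64_C(24) * 60 * 60 * hz;

	return ticks_to_wrap / ticks_per_day;
}
```

With hz = 100 this gives 497 days, matching the uptime in the report; a counter ticking at 1000 per second would wrap after 49 days.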




Re: unregistered changes to the user<->kernel API

2001-06-14 Thread Alexander Viro



On Thu, 14 Jun 2001, Richard Henderson wrote:

> Yes, I saw those.  What is the effect of O_NOFOLLOW?  To not
> follow symbolic links when opening the file.  If you open a
> regular file, in effect nothing happens.  Moreover, if these
> opens were not finding files now, the system wouldn't work.
> 
> So: the effect, I suppose, is (1) disabling some security
> within glibc, and (2) making these accesses slower since they
> will be considered O_DIRECT after the change.
> 
> Which doesn't seem that life-threatening to me.

O_NOFOLLOW is used to deal with symlink attacks. Breaking it means
that for quite a few binaries you are opening security holes. And
since it's a flagday change, you'll get the situation when no version
will work for all kernels. Bad idea, IMO.




Re: Download process for a "split kernel" (was: obsolete code must die)

2001-06-14 Thread Alexander Viro



On Thu, 14 Jun 2001, Daniel Phillips wrote:

> This sounds a lot like apt-get, doesn't it?

Folks, RTFFAQ, please. URL is attached to the end of each posting.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/






Re: is there a way to export a fat32 file system using nfs?

2001-06-12 Thread Alexander Viro



On Wed, 13 Jun 2001, Neil Brown wrote:

>Call fat_iget(i_location).
> If this finds something, check i_logstart. 
> If it matches, assume SUCCESS.
> 
>Then comes the tricky bit:  read the directory entry
> indicated by i_location, check the i_logstart is right,
> if it is, try to get it into the inode cache properly.

Uh-huh. Suppose that directory had been removed and space had been
reused by a regular file. Which had been filled with the right
contents. It's really not hard to do. Now, remove that file and
you've got a nice data corruption waiting to happen.




Re: threading question

2001-06-12 Thread Alexander Viro



On Tue, 12 Jun 2001, Kip Macy wrote:

> implementation of threads is not an accidental oversight, threads are not
> looked upon favorably by most of the core linux kernel hackers. A quote

s/threads/POSIX threads/.




Re: [CFT][PATCH] superblock handling changes

2001-06-11 Thread Alexander Viro



On Tue, 12 Jun 2001, Marcelo Tosatti wrote:

> 
> 
> On Tue, 12 Jun 2001, Alexander Viro wrote:
> 
> > Folks, the patch below is the fixed and combined variant of
> > the last series of patches sent to Linus.
> 
> Al, 
> 
> Since you are working on that code, would you mind to add some comments
> about IO completion guarantees (also why we don't guarantee fsync() to
> work as it should :)) there ?

I'm _not_ working on that side of things. Let's not add that into the
mix, OK? If you look at inode.c changes you'll see that the only thing
they expect from __sync_one() is to retake inode_lock before moving the
inode from the locked list. Other than that, the patch doesn't know and
doesn't care about fsync() semantics and implementation.

We have enough fun on the superblock side of the business. Let's keep
the fsync() stuff separate - they are pretty much orthogonal to each
other.

Right now I don't want to open that can of worms. Sorry.




[PATCH] fs/super.c stuff (3/10)

2001-06-10 Thread Alexander Viro

diff -urN S6-pre2-fsync_no_super/include/linux/fs.h S6-pre2-put_super/include/linux/fs.h
--- S6-pre2-fsync_no_super/include/linux/fs.h   Sun Jun 10 18:36:27 2001
+++ S6-pre2-put_super/include/linux/fs.h  Sun Jun 10 18:39:04 2001
@@ -1320,7 +1320,6 @@
 
 extern struct file_system_type *get_fs_type(const char *name);
 extern struct super_block *get_super(kdev_t);
-extern void put_super(kdev_t);
 static inline int is_mounted(kdev_t dev)
 {
struct super_block *sb = get_super(dev);





[PATCH] fs/super.c stuff (8/10)

2001-06-10 Thread Alexander Viro

diff -urN S6-pre2-s_count/fs/inode.c S6-pre2-freeing/fs/inode.c
--- S6-pre2-s_count/fs/inode.c  Sun Jun 10 12:45:04 2001
+++ S6-pre2-freeing/fs/inode.c  Sun Jun 10 12:45:47 2001
@@ -258,23 +258,6 @@
__sync_one(list_entry(tmp, struct inode, i_list), 0);
 }
 
-static inline int wait_on_dirty(struct list_head *head)
-{
-   struct list_head * tmp;
-   list_for_each(tmp, head) {
-   struct inode *inode = list_entry(tmp, struct inode, i_list);
-   if (!inode->i_state & I_DIRTY)
-   continue;
-   __iget(inode);
-   spin_unlock(&inode_lock);
-   __wait_on_inode(inode);
-   iput(inode);
-   spin_lock(&inode_lock);
-   return 1;
-   }
-   return 0;
-}
-
 static inline void wait_on_locked(struct list_head *head)
 {
struct list_head * tmp;
@@ -319,23 +302,13 @@
return 1;
 }
 
-/**
- * sync_inodes
- * @dev: device to sync the inodes from.
- *
- * sync_inodes goes through the super block's dirty list, 
- * writes them out, and puts them back on the normal list.
- */
-
-/*
- * caller holds exclusive lock on sb->s_umount
- */
- 
 void sync_inodes_sb(struct super_block *sb)
 {
spin_lock(&inode_lock);
-   sync_list(&sb->s_dirty);
-   wait_on_locked(&sb->s_locked_inodes);
+   while (!list_empty(&sb->s_dirty)||!list_empty(&sb->s_locked_inodes)) {
+   sync_list(&sb->s_dirty);
+   wait_on_locked(&sb->s_locked_inodes);
+   }
spin_unlock(&inode_lock);
 }
 
@@ -365,37 +338,75 @@
spin_unlock(&inode_lock);
 }
 
+/*
+ * Find a superblock with inodes that need to be synced
+ */
+
+static struct super_block *get_super_to_sync(void)
+{
+   struct list_head *p;
+restart:
+   spin_lock(&inode_lock);
+   spin_lock(&sb_lock);
+   list_for_each(p, &super_blocks) {
+   struct super_block *s = list_entry(p,struct super_block,s_list);
+   if (list_empty(&s->s_dirty) && list_empty(&s->s_locked_inodes))
+   continue;
+   s->s_count++;
+   spin_unlock(&sb_lock);
+   spin_unlock(&inode_lock);
+   down_read(&s->s_umount);
+   if (!s->s_root) {
+   up_read(&s->s_umount);
+   spin_lock(&sb_lock);
+   if (!--s->s_count)
+   kfree(s);
+   spin_unlock(&sb_lock);
+   goto restart;
+   }
+   return s;
+   }
+   spin_unlock(&sb_lock);
+   spin_unlock(&inode_lock);
+   return NULL;
+}
+
+/**
+ * sync_inodes
+ * @dev: device to sync the inodes from.
+ *
+ * sync_inodes goes through the super block's dirty list, 
+ * writes them out, and puts them back on the normal list.
+ */
+
 void sync_inodes(kdev_t dev)
 {
-   struct super_block * sb;
+   struct super_block * s;
 
/*
 * Search the super_blocks array for the device(s) to sync.
 */
-   spin_lock(&sb_lock);
-   sb = sb_entry(super_blocks.next);
-   for (; sb != sb_entry(&super_blocks); sb = sb_entry(sb->s_list.next)) {
-   if (!sb->s_dev)
-   continue;
-   if (dev && sb->s_dev != dev)
-   continue;
-   sb->s_count++;
-   spin_unlock(&sb_lock);
-   down_read(&sb->s_umount);
-   if (sb->s_dev && (sb->s_dev == dev || !dev)) {
-   spin_lock(&inode_lock);
-   do {
-   sync_list(&sb->s_dirty);
-   } while (wait_on_dirty(&sb->s_locked_inodes));
-   spin_unlock(&inode_lock);
+   if (dev) {
+   if ((s = get_super(dev)) != NULL) {
+   down_read(&s->s_umount);
+   if (s->s_root)
+   sync_inodes_sb(s);
+   up_read(&s->s_umount);
+   spin_lock(&sb_lock);
+   if (!--s->s_count)
+   kfree(s);
+   spin_unlock(&sb_lock);
+   }
+   } else {
+   while ((s = get_super_to_sync()) != NULL) {
+   sync_inodes_sb(s);
+   up_read(&s->s_umount);
+   spin_lock(&sb_lock);
+   if (!--s->s_count)
+   kfree(s);
+   spin_unlock(&sb_lock);
}
-   up_read(&sb->s_umount);
-   spin_lock(&sb_lock);
-   sb->s_count--;
-   if (dev)
-   break;
}
-   spin_unlock(&sb_lock);
 }
 
 /*
diff -urN S6-pre2-s_count/fs/super.c S6-pre2-freeing/fs/super.c
--- S6-pre2-s_count/fs/super.c  Sun Jun 10 12:45:04 2001
+++ S6-pre2-freeing/fs/super.c  Sun Jun 10 12:45:47 2001
@@ -644,7 +644,8 @@
 static inline void __put_super(struct super_block *sb)
 {
spin_lock(&sb_lock);
-   sb->s_count--;
+   
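
The get_super_to_sync() hunk above follows a pattern worth spelling out:
pin the superblock by bumping s_count while the list lock is held, drop
the spinlocks, take s_umount for read (which may sleep), then check
s_root and restart the scan if the filesystem went away in the meantime.
A minimal single-threaded userspace sketch of that control flow follows.
All toy_* names are hypothetical, locks are modelled as counters, and a
"gone" flag stands in for both list removal and kfree(); it is an
illustration under those assumptions, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: field names loosely mirror the patch, nothing more. */
struct toy_sb {
	int count;	/* like s_count: pins the structure in memory */
	int readers;	/* like s_umount held for read */
	bool has_root;	/* like s_root != NULL: still mounted */
	bool dirty;	/* like non-empty s_dirty or s_locked_inodes */
	bool gone;	/* unlinked from the list and "freed" */
};

/* Drop the read lock and the pin taken by toy_get_to_sync(). */
static void toy_drop(struct toy_sb *s)
{
	assert(s->readers > 0 && s->count > 0);
	s->readers--;
	if (--s->count == 0)
		s->gone = true;	/* last reference: kfree() in the patch */
}

/*
 * Return a mounted superblock with work pending, pinned and
 * read-locked, or NULL if there is none.
 * Assumption: a dying entry has no other holders, so dropping our
 * pin makes it "gone" and the rescan terminates.
 */
static struct toy_sb *toy_get_to_sync(struct toy_sb *tab, size_t n)
{
	size_t i;
restart:
	for (i = 0; i < n; i++) {
		struct toy_sb *s = &tab[i];

		if (s->gone || !s->dirty)
			continue;
		s->count++;	/* pin while the list lock is still held */
		/* ...the spinlocks would be dropped here... */
		s->readers++;	/* down_read(): may sleep in the real code */
		if (!s->has_root) {
			/* Unmounted while we slept: let go and rescan. */
			toy_drop(s);
			goto restart;
		}
		return s;
	}
	return NULL;
}
```

The rescan from the top is the point of the design: once the spinlocks
have been dropped, the position in the list can no longer be trusted, so
the walk starts over instead of continuing from a possibly stale entry.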

[PATCH] fs/super.c stuff (5/10)

2001-06-10 Thread Alexander Viro

diff -urN S6-pre2-put_super/fs/dquot.c S6-pre2-dquot/fs/dquot.c
--- S6-pre2-put_super/fs/dquot.c  Thu May 24 18:26:44 2001
+++ S6-pre2-dquot/fs/dquot.c  Sun Jun 10 18:46:54 2001
@@ -325,7 +325,7 @@
 memset(&dquot->dq_dqb, 0, sizeof(struct dqblk));
 }
 
-void invalidate_dquots(kdev_t dev, short type)
+static void invalidate_dquots(kdev_t dev, short type)
 {
struct dquot *dquot, *next;
int need_restart;
@@ -1388,7 +1388,7 @@
 }
 
 /* Function in inode.c - remove pointers to dquots in icache */
-extern void remove_dquot_ref(kdev_t, short);
+extern void remove_dquot_ref(struct super_block *, short);
 
 /*
  * Turn quota off on a device. type == -1 ==> quotaoff for all types (umount)
@@ -1413,7 +1413,7 @@
reset_enable_flags(dqopt, cnt);
 
/* Note: these are blocking operations */
-   remove_dquot_ref(sb->s_dev, cnt);
+   remove_dquot_ref(sb, cnt);
invalidate_dquots(sb->s_dev, cnt);
 
/* Wait for any pending IO - remove me as soon as invalidate is more polite */
diff -urN S6-pre2-put_super/fs/inode.c S6-pre2-dquot/fs/inode.c
--- S6-pre2-put_super/fs/inode.c  Fri Jun  8 18:29:03 2001
+++ S6-pre2-dquot/fs/inode.c  Sun Jun 10 18:43:02 2001
@@ -1164,14 +1164,13 @@
 void put_dquot_list(struct list_head *);
 int remove_inode_dquot_ref(struct inode *, short, struct list_head *);
 
-void remove_dquot_ref(kdev_t dev, short type)
+void remove_dquot_ref(struct super_block *sb, short type)
 {
-   struct super_block *sb = get_super(dev);
struct inode *inode;
struct list_head *act_head;
LIST_HEAD(tofree_head);
 
-   if (!sb || !sb->dq_op)
+   if (!sb->dq_op)
return; /* nothing to do */
 
/* We have to be protected against other CPUs */
diff -urN S6-pre2-put_super/include/linux/quotaops.h S6-pre2-dquot/include/linux/quotaops.h
--- S6-pre2-put_super/include/linux/quotaops.h  Sun Jun 10 13:15:27 2001
+++ S6-pre2-dquot/include/linux/quotaops.h  Sun Jun 10 18:46:33 2001
@@ -21,7 +21,6 @@
  */
 extern void dquot_initialize(struct inode *inode, short type);
 extern void dquot_drop(struct inode *inode);
-extern void invalidate_dquots(kdev_t dev, short type);
 extern int  quota_off(struct super_block *sb, short type);
 extern int  sync_dquots(kdev_t dev, short type);
 





Re: [PATCH] fs/super.c stuff (6/10)

2001-06-10 Thread Alexander Viro

diff -urN S6-pre2-dquot/arch/parisc/hpux/sys_hpux.c S6-pre2-drop_super/arch/parisc/hpux/sys_hpux.c
--- S6-pre2-dquot/arch/parisc/hpux/sys_hpux.c   Fri Feb 16 20:46:44 2001
+++ S6-pre2-drop_super/arch/parisc/hpux/sys_hpux.c  Sun Jun 10 18:38:23 2001
@@ -109,9 +109,11 @@
 
lock_kernel();
s = get_super(to_kdev_t(dev));
+   unlock_kernel();
if (s == NULL)
goto out;
err = vfs_statfs(s, );
+   drop_super(s);
if (err)
goto out;
 
@@ -124,7 +126,6 @@
/* Changed to hpux_ustat:  */
err = copy_to_user(ubuf,,sizeof(struct hpux_ustat)) ? -EFAULT : 0;
 out:
-   unlock_kernel();
return err;
 }
 
diff -urN S6-pre2-dquot/fs/dquot.c S6-pre2-drop_super/fs/dquot.c
--- S6-pre2-dquot/fs/dquot.c  Sun Jun 10 18:46:54 2001
+++ S6-pre2-drop_super/fs/dquot.c   Sun Jun 10 18:38:23 2001
@@ -1602,6 +1602,8 @@
if (sb && sb_has_quota_enabled(sb, type))
ret = set_dqblk(sb, id, type, flags, (struct dqblk *) addr);
 out:
+   if (sb)
+   drop_super(sb);
unlock_kernel();
return ret;
 }
diff -urN S6-pre2-dquot/fs/inode.c S6-pre2-drop_super/fs/inode.c
--- S6-pre2-dquot/fs/inode.c  Sun Jun 10 18:43:02 2001
+++ S6-pre2-drop_super/fs/inode.c   Sun Jun 10 18:38:23 2001
@@ -605,8 +605,10 @@
fsync_dev(dev);
 
res = 0;
-   if (sb)
+   if (sb) {
res = invalidate_inodes(sb);
+   drop_super(sb);
+   }
invalidate_buffers(dev);
return res;
 }
diff -urN S6-pre2-dquot/fs/super.c S6-pre2-drop_super/fs/super.c
--- S6-pre2-dquot/fs/super.c  Sun Jun 10 18:36:27 2001
+++ S6-pre2-drop_super/fs/super.c   Sun Jun 10 18:38:23 2001
@@ -491,7 +491,6 @@
kill_super(sb);
 }
 
-
 /* Use octal escapes, like mount does, for embedded spaces etc. */
 static unsigned char need_escaping[] = { ' ', '\t', '\n', '\\' };
 
@@ -640,6 +639,10 @@
 #undef MANGLE
 #undef FREEROOM
 }
+
+void drop_super(struct super_block *sb)
+{
+}
  
 /*
  * Note: check the dirty flag before waiting, so we don't
@@ -709,6 +712,7 @@
 if (s == NULL)
 goto out;
err = vfs_statfs(s, );
+   drop_super(s);
if (err)
goto out;
 
diff -urN S6-pre2-dquot/include/linux/fs.h S6-pre2-drop_super/include/linux/fs.h
--- S6-pre2-dquot/include/linux/fs.h  Sun Jun 10 18:39:04 2001
+++ S6-pre2-drop_super/include/linux/fs.h   Sun Jun 10 18:38:31 2001
@@ -1320,11 +1320,12 @@
 
 extern struct file_system_type *get_fs_type(const char *name);
 extern struct super_block *get_super(kdev_t);
+extern void drop_super(struct super_block *sb);
 static inline int is_mounted(kdev_t dev)
 {
struct super_block *sb = get_super(dev);
if (sb) {
-   /* drop_super(sb); will go here */
+   drop_super(sb);
return 1;
}
return 0;
diff -urN S6-pre2-dquot/kernel/ksyms.c S6-pre2-drop_super/kernel/ksyms.c
--- S6-pre2-dquot/kernel/ksyms.c  Fri Jun  8 18:29:03 2001
+++ S6-pre2-drop_super/kernel/ksyms.c   Sun Jun 10 18:38:23 2001
@@ -129,6 +129,7 @@
 EXPORT_SYMBOL(update_atime);
 EXPORT_SYMBOL(get_fs_type);
 EXPORT_SYMBOL(get_super);
+EXPORT_SYMBOL(drop_super);
 EXPORT_SYMBOL(getname);
 EXPORT_SYMBOL(names_cachep);
 EXPORT_SYMBOL(fput);






[PATCH] fs/super.c stuff (10/10)

2001-06-10 Thread Alexander Viro

diff -urN S6-pre2-alloc_super/fs/inode.c S6-pre2-current/fs/inode.c
--- S6-pre2-alloc_super/fs/inode.c  Sun Jun 10 19:09:35 2001
+++ S6-pre2-current/fs/inode.c  Sun Jun 10 19:26:27 2001
@@ -357,11 +357,7 @@
spin_unlock(&inode_lock);
down_read(&s->s_umount);
if (!s->s_root) {
-   up_read(&s->s_umount);
-   spin_lock(&sb_lock);
-   if (!--s->s_count)
-   kfree(s);
-   spin_unlock(&sb_lock);
+   drop_super(s);
goto restart;
}
return s;
@@ -388,23 +384,13 @@
 */
if (dev) {
if ((s = get_super(dev)) != NULL) {
-   down_read(&s->s_umount);
-   if (s->s_root)
-   sync_inodes_sb(s);
-   up_read(&s->s_umount);
-   spin_lock(&sb_lock);
-   if (!--s->s_count)
-   kfree(s);
-   spin_unlock(&sb_lock);
+   sync_inodes_sb(s);
+   drop_super(s);
}
} else {
while ((s = get_super_to_sync()) != NULL) {
sync_inodes_sb(s);
-   up_read(&s->s_umount);
-   spin_lock(&sb_lock);
-   if (!--s->s_count)
-   kfree(s);
-   spin_unlock(&sb_lock);
+   drop_super(s);
}
}
 }
@@ -636,13 +622,14 @@
  
 int invalidate_device(kdev_t dev, int do_sync)
 {
-   struct super_block *sb = get_super(dev);
+   struct super_block *sb;
int res;
 
if (do_sync)
fsync_dev(dev);
 
res = 0;
+   sb = get_super(dev);
if (sb) {
res = invalidate_inodes(sb);
drop_super(sb);
diff -urN S6-pre2-alloc_super/fs/super.c S6-pre2-current/fs/super.c
--- S6-pre2-alloc_super/fs/super.c  Sun Jun 10 19:09:39 2001
+++ S6-pre2-current/fs/super.c  Sun Jun 10 19:36:51 2001
@@ -647,8 +647,23 @@
spin_unlock(&sb_lock);
 }
 
+static inline struct super_block * find_super(kdev_t dev)
+{
+   struct list_head *p;
+
+   list_for_each(p, &super_blocks) {
+   struct super_block * s = sb_entry(p);
+   if (s->s_dev == dev) {
+   s->s_count++;
+   return s;
+   }
+   }
+   return NULL;
+}
+
 void drop_super(struct super_block *sb)
 {
+   up_read(&sb->s_umount);
__put_super(sb);
 }
 
@@ -681,8 +696,7 @@
if (sb) {
if (sb->s_dirt)
write_super(sb);
-   up_read(&sb->s_umount);
-   __put_super(sb);
+   drop_super(sb);
}
return;
}
@@ -695,8 +709,7 @@
spin_unlock(&sb_lock);
down_read(&sb->s_umount);
write_super(sb);
-   up_read(&sb->s_umount);
-   __put_super(sb);
+   drop_super(sb);
goto restart;
} else
sb = sb_entry(sb->s_list.next);
@@ -719,21 +732,19 @@
return NULL;
 restart:
spin_lock(&sb_lock);
-   s = sb_entry(super_blocks.next);
-   while (s != sb_entry(&super_blocks))
-   if (s->s_dev == dev) {
-   /* Yes, it sucks. As soon as we get refcounting... */
-   /* Almost there */
-   s->s_count++;
-   spin_unlock(&sb_lock);
-   lock_super(s);
-   unlock_super(s);
-   if (s->s_dev == dev)
-   return s;
-   drop_super(s);
-   goto restart;
-   } else
-   s = sb_entry(s->s_list.next);
+   s = find_super(dev);
+   if (s) {
+   spin_unlock(&sb_lock);
+   /* Yes, it sucks. As soon as we get refcounting... */
+   /* Almost there - next two lines will go away RSN */
+   lock_super(s);
+   unlock_super(s);
+   down_read(&s->s_umount);
+   if (s->s_root)
+   return s;
+   drop_super(s);
+   goto restart;
+   }
spin_unlock(&sb_lock);
return NULL;
 }
@@ -905,10 +916,11 @@
spin_unlock(_lock);
}
atomic_inc(&sb->s_active);
+   up_read(&sb->s_umount);
path_release();
return sb;
}
-   __put_super(sb);
+   drop_super(sb);
} else {
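
With this last patch get_super() hands back a superblock that is both
reference-counted and read-locked, and drop_super() undoes both halves
in one call, so every open-coded up_read()/s_count sequence above
collapses to a single line. The sketch below models that pairing in
userspace. It is an assumption-laden illustration: the toy_* names are
hypothetical, dev is a plain int rather than kdev_t, and the lock and
refcount are simple counters.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; dev is a plain int rather than kdev_t. */
struct toy_sb {
	int dev;
	int count;	/* like s_count */
	int readers;	/* like s_umount read holders */
};

/* Lookup: on success the caller owns one reference and one read lock. */
static struct toy_sb *toy_get_super(struct toy_sb *tab, size_t n, int dev)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].dev == dev) {
			tab[i].count++;
			tab[i].readers++;
			return &tab[i];
		}
	}
	return NULL;
}

/* The one release helper: undo both halves of toy_get_super(). */
static void toy_drop_super(struct toy_sb *sb)
{
	assert(sb->readers > 0 && sb->count > 0);
	sb->readers--;
	sb->count--;
}

/* Caller in the style of is_mounted(): look up, use, release. */
static int toy_is_mounted(struct toy_sb *tab, size_t n, int dev)
{
	struct toy_sb *sb = toy_get_super(tab, n, dev);

	if (sb) {
		toy_drop_super(sb);
		return 1;
	}
	return 0;
}
```

Keeping acquisition and release as a matched pair means a caller cannot
forget one of the two steps, which is what the removed hunks had to get
right by hand at every call site.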
