Re: Test12 ll_rw_block error.

2000-12-15 Thread Stephen C. Tweedie
Hi, On Fri, Dec 15, 2000 at 02:00:19AM -0500, Alexander Viro wrote: > On Thu, 14 Dec 2000, Linus Torvalds wrote: > > Just one: any fs that really cares about completion callback is very likely > to be picky about the requests ordering. So sync_buffers() is very unlikely > to be useful anyway. >

Re: New patches for 2.2.18pre24 raw IO (fix for bounce buffer copy)

2000-12-15 Thread Stephen C. Tweedie
Hi, On Fri, Dec 08, 2000 at 01:06:33PM +0100, Andrea Arcangeli wrote: On Mon, Dec 04, 2000 at 08:50:04PM +, Stephen C. Tweedie wrote: I have pushed another set of raw IO patches out, this time to fix a This fix is missing: --- rawio-sct/mm/memory.c.~1~ Fri Dec 8 03:05:01 2000

New patches for 2.2.18 raw IO (fix for fault retry)

2000-12-15 Thread Stephen C. Tweedie
Hi all, OK, this now assembles the full outstanding set of raw IO fixes for the final 2.2.18 kernel, both with and without the 4G bigmem patches. The only changes since the last 2.2.18pre24 release are the addition of a minor bugfix (possible failures when retrying after getting colliding

Re: 64bit offsets for block devices ?

2000-12-07 Thread Stephen C. Tweedie
Hi, On Wed, Dec 06, 2000 at 06:50:15AM -0800, Reto Baettig wrote: > Imagine we have a virtual disk which provides a 64bit (sparse) address > room. Unfortunately we can not use it as a block device because in a lot > of places (including buffer_head structure), we're using a long or even > an

Re: Fixing random corruption in raw IO on 2.2.x kernel with bigmem enabled

2000-12-06 Thread Stephen C. Tweedie
Hi, On Wed, Dec 06, 2000 at 12:28:54PM -0500, Peng Dai wrote: > > This patch fixes a subtle corruption when doing raw IO on the 2.2.x > kernel > with bigmem enabled. The problem was first reported by Markus Döhr while That patch is already part of the full bugfixed raw IO patchset I posted out

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Tue, Dec 05, 2000 at 03:17:07PM -0500, Alexander Viro wrote: > > > On Tue, 5 Dec 2000, Linus Torvalds wrote: > > > And this is not just a "it happens to be like this" kind of thing. It > > _has_ to be like this, because every time we call clear_inode() we are > > going to physically

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Tue, Dec 05, 2000 at 09:48:51AM -0800, Linus Torvalds wrote: > > On Tue, 5 Dec 2000, Stephen C. Tweedie wrote: > > > > That is still buggy. We MUST NOT invalidate the inode buffers unless > > i_nlink == 0, because otherwise a subsequent open() and fsync() will

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Mon, Dec 04, 2000 at 08:00:03PM -0800, Linus Torvalds wrote: > > On Mon, 4 Dec 2000, Alexander Viro wrote: > > > This _is_ what clear_inode() does in pre5 (and in pre4, for that matter): > > void clear_inode(struct inode *inode) > { > if

Re: Using map_user_kiobuf()

2000-12-04 Thread Stephen C. Tweedie
Hi, On Thu, Nov 30, 2000 at 01:07:37PM -, John Meikle wrote: > I have been experimenting with a module that returns data to either a user > space programme or another module. A memory area is passed in, and the data > is written to it. Because the memory may be allocated either by a module

New patches for 2.2.18pre24 raw IO (fix for bounce buffer copy)

2000-12-04 Thread Stephen C. Tweedie
Hi, I have pushed another set of raw IO patches out, this time to fix a bug with bounce buffer copying when running on highmem boxes. It is likely to affect any bounce buffer copies using non-page-aligned accesses if both highmem and normal pages are involved in the kiobuf. The specific new

Re: [PATCH] inode dirty blocks Re: test12-pre4

2000-12-04 Thread Stephen C. Tweedie
On Mon, Dec 04, 2000 at 01:01:36AM -0500, Alexander Viro wrote: > > It doesn't solve the problem. If you unlink a file with dirty metadata > you have a nice chance to hit the BUG() in inode.c:83. I hope that patch > below closes all remaining holes. See analysis in previous posting > (basically,

Re: corruption

2000-12-04 Thread Stephen C. Tweedie
Hi, On Sat, Dec 02, 2000 at 10:33:36AM -0500, Alexander Viro wrote: > > On Sun, 3 Dec 2000, Andrew Morton wrote: > > > It appears that this problem is not fixed. > Sure, it isn't. Place where the shit hits the fan: fs/buffer.c::unmap_buffer(). > Add the call of remove_inode_queue(bh) there and

Re: Updated: raw I/O patches (v2.2)

2000-12-01 Thread Stephen C. Tweedie
Hi, On Tue, Nov 21, 2000 at 11:18:15AM -0500, Eric Lowe wrote: > > I have updated raw I/O patches with Andrea's and my fixes against 2.2. > They check for CONFIG_BIGMEM so they can be applied and compiled > without the bigmem patch. I've just posted an assembly of all of the outstanding raw IO

Re: corruption

2000-12-01 Thread Stephen C. Tweedie
Hi, On Fri, Dec 01, 2000 at 08:35:41AM +1100, Andrew Morton wrote: > > I bet this'll catch it: > > static __inline__ void list_del(struct list_head *entry) > { > __list_del(entry->prev, entry->next); > + entry->next = entry->prev = 0; > } No, because the buffer hash list is never

Re: [PATCH] blindingly stupid 2.2 VM bug

2000-11-30 Thread Stephen C. Tweedie
Hi, On Tue, Nov 28, 2000 at 04:35:32PM -0800, John Kennedy wrote: > On Wed, Nov 29, 2000 at 01:04:16AM +0100, Andrea Arcangeli wrote: > > On Tue, Nov 28, 2000 at 03:36:15PM -0800, John Kennedy wrote: > > > No, it is all ext3fs stuff that is touching the same areas your > > > > Ok this now

Re: e2fs performance as function of block size

2000-11-24 Thread Stephen C. Tweedie
Hi, On Wed, Nov 22, 2000 at 11:28:12PM +0100, Michael Marxmeier wrote: > > If the files get somewhat bigger (eg. > 1G) having a bigger block > size also greatly reduces the ext2 overhead. Especially fsync() > used to be really bad on big files but choosing a bigger block > size changed a lot.

Re: [patch] O_SYNC patch 3/3, add inode dirty buffer list support to ext2

2000-11-23 Thread Stephen C. Tweedie
Hi, On Wed, Nov 22, 2000 at 11:54:24AM -0700, Jeff V. Merkey wrote: > > I have not implemented O_SYNC in NWFS, but it looks like I need to add it > before posting the final patches. This patch appears to force write-through > of only dirty inodes, and allow reads to continue from cache. Is

[testcase] fsync/O_SYNC simple test cases

2000-11-22 Thread Stephen C. Tweedie
Hi, The code below may be useful for doing simple testing of the O_SYNC and f[data]sync code in the kernel. It times various combinations of updates-in-place and appends under various synchronisation mechanisms, making it possible to see clearly whether fdatasync is skipping inode updates for

[patch] O_SYNC patch 3/3, add inode dirty buffer list support to ext2

2000-11-22 Thread Stephen C. Tweedie
Hi, This final part of the O_SYNC patches adds calls to ext2, and to generic_commit_write, to record dirty buffers against the owning inode. It also removes most of fs/ext2/fsync.c, which now simply calls the generic sync code. --Stephen 2.4.0test11.02.ext2-osync.diff : ---

[patch] O_SYNC patch 2/3, add per-inode dirty buffer lists

2000-11-22 Thread Stephen C. Tweedie
Hi, This is the second part of my old O_SYNC diffs patched up for 2.4.0-test11. It adds support for per-inode dirty buffer lists. In 2.4, we are now generating dirty buffers on a per-page basis for every write. For large O_SYNC writes (often databases use around 128K per write), we obviously

[patch] O_SYNC patch 1/3: Fix fdatasync

2000-11-22 Thread Stephen C. Tweedie
Hi, This is the first patch out of 3 to fix O_SYNC and fdatasync for 2.4.0-test11. The patch below fixes fdatasync (at least for ext2) so that it does not flush the inode to disk for purely timestamp updates. It splits I_DIRTY into two bits, one bit (I_DIRTY_DATASYNC) which is set only for

Re: ext3 vs. JFS file locations...

2000-11-06 Thread Stephen C. Tweedie
Hi, On Sat, Nov 04, 2000 at 09:53:41PM -0500, Albert D. Cahalan wrote: > > The journalling layer for ext3 is not a filesystem by itself. > It is generic journalling code. So, even if IBM did not have > any jfs code, the name would be wrong. Indeed, and the jfs layer will be renamed "jbd" at

Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6

2000-11-01 Thread Stephen C. Tweedie
Hi, On Mon, Oct 30, 2000 at 01:56:07PM -0500, Jeff Garzik wrote: > > Seen it, re-read my question... > > I keep seeing "audio drivers' mmap" used as a specific example of a place > that would benefit from kiobufs. The current via audio mmap looks quite > a bit like mmap_kiobuf and its support

Re: Quota fixes and a few questions

2000-10-27 Thread Stephen C. Tweedie
Hi, On Fri, Oct 27, 2000 at 11:31:59AM +0200, Juri Haberland wrote: > > > Hi Stephen, > > unfortunately 0.0.3b has the same problem. I tried it with a stock > 2.2.17 kernel + NFS patches + ext3-0.0.3b and the quota rpm you > included. Extracting two larger tar.gz files hits the deadlock

Re: Quota mods needed for journaled quota

2000-10-26 Thread Stephen C. Tweedie
Hi, On Thu, Oct 26, 2000 at 12:53:00PM -0400, Nathan Scott wrote: > > The addition of an "init_quota" method to the super_operations struct, > > with quota_on calling this and defaulting to installing the default > > quota_ops if the method is NULL, ought to be sufficient to let ext3 > > get

Quota mods needed for journaled quota

2000-10-25 Thread Stephen C. Tweedie
Hi, There are a few problems in the Linux quota code which make it impossible to perform quota updates transactionally when using a journaled filesystem. Basically we have the following problems: * The underlying filesystem does not know which files are the quota files, so cannot tell when

Re: Quota fixes and a few questions

2000-10-24 Thread Stephen C. Tweedie
Hi, On Fri, Oct 20, 2000 at 05:02:28PM +0200, Juri Haberland wrote: > As I wrote in my original mail I used 0.0.2f. > Is there a version called 0.0.3 yet and if so where can I find it? In > ftp.uk.linux.org (which is currently not reachable as well as > vger.kernel.org) I found only 0.0.2f. I

Re: Quota fixes and a few questions

2000-10-20 Thread Stephen C. Tweedie
Hi, On Thu, Oct 19, 2000 at 07:03:54PM +0200, Jan Kara wrote: > > > I stumbled into another problem: > > When using ext3 with quotas the kjournald process stops responding and > > stays in DW state when the filesystem gets under heavy load. It is easy > > to reproduce: > > Just extract two or

Re: Quota fixes and a few questions

2000-10-06 Thread Stephen C. Tweedie
Hi Jan, On Wed, Sep 27, 2000 at 02:56:20PM +0200, Jan Kara wrote: > > So I've been thinking about fixes in quota (and also writing some parts). While we're at it, I've attached a patch which I was sent which simply teaches quota about ext3 as a valid fs type in fstab. It appears to work

Re: Soft-Updates for Linux ?

2000-10-03 Thread Stephen C. Tweedie
Hi, On Mon, Oct 02, 2000 at 03:13:07AM +0200, Daniel Phillips wrote: > What I've seen proposed is a mechanism where the VM can say 'flush this > page' to a filesystem and the filesystem can then go ahead and do what > it wants, including flushing the page, flushing some other page, or not >

Re: Can ext3 or ReiserFS w/ journalling be made on /dev/loop?

2000-10-03 Thread Stephen C. Tweedie
Hi, On Thu, Sep 28, 2000 at 07:59:21PM +, Marc Mutz wrote: > > I was asked a question lately that I was unable to answer: Assume you > want to make a (encrypted, but that's not the issue here) filesystem on > a loopback block device (/dev/loop*). Can this be a journalling one? In > other

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Tue, Sep 26, 2000 at 11:02:48AM -0600, Erik Andersen wrote: > Another approach would be to let user space turn off overcommit. No. Overcommit only applies to pageable memory. Beancounter is really needed for non-pageable resources such as page tables and mlock()ed pages. Cheers,

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Tue, Sep 26, 2000 at 09:17:44AM -0600, [EMAIL PROTECTED] wrote: > Operating systems cannot make more memory appear by magic. > The question is really about the best strategy for dealing with low memory. In my > opinion, the OS should not try to out-think physical limitations. Instead,

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 03:12:50PM -0600, [EMAIL PROTECTED] wrote: > > > > > > I'm not too sure of what you have in mind, but if it is > > > "process creates vast virtual space to generate many page table > > > entries -- using mmap" > > > the answer is, virtual address space

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 03:07:44PM -0600, [EMAIL PROTECTED] wrote: > On Mon, Sep 25, 2000 at 09:46:35PM +0100, Alan Cox wrote: > > > I'm not too sure of what you have in mind, but if it is > > > "process creates vast virtual space to generate many page table > > > entries -- using

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 02:04:19PM -0600, [EMAIL PROTECTED] wrote: > > Right, but if the alternative is spurious ENOMEM when we can satisfy > > An ENOMEM is not spurious if there is not enough memory. UNIX does not ask the > OS to do impossible tricks. Yes, but the ENOMEM _is_ spurious if

Re: [patch] vmfixes-2.4.0-test9-B2 - fixing deadlocks

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 09:32:42PM +0200, Andrea Arcangeli wrote: > Having shrink_mmap that browse the mapped page cache is useless > as having shrink_mmap browsing kernel memory and anonymous pages > as it does in 2.2.x as far I can tell. It's an algorithm > complexity problem and it will

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 12:34:56PM -0600, [EMAIL PROTECTED] wrote: > > > Process 1,2 and 3 all start allocating 20 pages > > > now 57 pages are locked up in non-swapable kernel space and the system >deadlocks OOM. > > > > Or go the beancounter route: process 1 asks "can I pin 20

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 08:09:31PM +0100, Alan Cox wrote: > > > Indeed. But we wont fail the kmalloc with a NULL return > > > > Isn't that the preferred behaviour, though? If we are completely out > > of VM on a no-swap machine, we should be killing one of the existing > > processes rather

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 12:13:15PM -0600, [EMAIL PROTECTED] wrote: > > Definitely not. GFP_ATOMIC is reserved for things that really can't > > swap or schedule right now. Use GFP_ATOMIC indiscriminately and you'll > > have to increase the number of atomic-allocatable pages. > > Process

Re: [patch] vmfixes-2.4.0-test9-B2 - fixing deadlocks

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 07:03:47PM +0200, Andrea Arcangeli wrote: > > > This really seems to be the biggest difference between the two > > approaches right now. The FreeBSD folks believe fervently that one of > > [ aging cache and mapped pages in the same cycle ] > > Right. > > And since

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 05:51:49PM +0100, Alan Cox wrote: > > > 2 active processes, no swap > > > > > > #1#2 > > > kmalloc 32K kmalloc 16K > > > OKOK > > > kmalloc 16K

Re: refill_inactive()

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 09:17:54AM -0700, Linus Torvalds wrote: > > On Mon, 25 Sep 2000, Rik van Riel wrote: > > > > Hmmm, doesn't GFP_BUFFER simply imply that we cannot > > allocate new buffer heads to do IO with?? > > No. > > New buffer heads would be ok - recursion is fine in theory,

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 06:05:00PM +0200, Andrea Arcangeli wrote: > On Mon, Sep 25, 2000 at 04:42:49PM +0100, Stephen C. Tweedie wrote: > > Progress is made, clean pages are discarded and dirty ones queued for > How can you make progress if there isn't swap available and all

Re: [patch] vmfixes-2.4.0-test9-B2 - fixing deadlocks

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 01:41:37AM +0200, Andrea Arcangeli wrote: > > Since you're talking about this I'll soon (as soon as I'll finish some other > thing that is just work in progress) release a classzone against latest's > 2.4.x. My approach is _quite_ different from the current VM. Current

Re: [patch] vmfixes-2.4.0-test9-B2 - fixing deadlocks

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 12:36:50AM +0200, bert hubert wrote: > On Mon, Sep 25, 2000 at 12:13:42AM +0200, Andrea Arcangeli wrote: > > On Sun, Sep 24, 2000 at 10:43:03PM +0100, Stephen C. Tweedie wrote: > > > any form of serialisation on the quota file). This feels like

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 04:02:30AM +0200, Andrea Arcangeli wrote: > On Sun, Sep 24, 2000 at 09:27:39PM -0400, Alexander Viro wrote: > > So help testing the patches to them. Arrgh... > > I think I'd better fix the bugs that I know about before testing patches that > tries to remove the

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-24 Thread Stephen C. Tweedie
Hi, On Sun, Sep 24, 2000 at 11:12:39PM +0200, Ingo Molnar wrote: > > > ext2_new_block (or whatever that runs getblk with the superlock lock > > acquired)->getblk->GFP->shrink_dcache_memory->prune_dcache-> > > prune_one_dentry->dput->dentry_iput->iput->inode->i_sb->s_op-> > >

Re: __GFP_IO && shrink_[d|i]cache_memory()?

2000-09-24 Thread Stephen C. Tweedie
Hi, On Sun, Sep 24, 2000 at 08:40:05PM +0200, Ingo Molnar wrote: > On Sun, 24 Sep 2000, Linus Torvalds wrote: > > > [...] I don't think shrinking the inode cache is actually illegal when > > GPF_IO isn't set. In fact, it's probably only the buffer cache itself > > that has to avoid recursion -

Re: [PATCH] Useless inode semaphore locking in 2.4.0-test8

2000-09-19 Thread Stephen C. Tweedie
Hi, On Fri, Sep 15, 2000 at 08:31:43AM -0400, Alexander Viro wrote: > > Also truncate inode locking is needed to get a halfway reliable loopback > > device (unlike the current one) > > ? > I'm afraid that I've lost you here - what do you mean? loop does a bmap() and then submits block IO.

Re: Adding members to task_struct without recompling the kernel

2000-09-15 Thread Stephen C. Tweedie
Hi, On Tue, Sep 12, 2000 at 06:17:48PM -0400, Michael Vines wrote: > > I'm writing a kernel module that needs to keep track of a pointer to some > custom module information for every task in the system. Basically I want > to add another member to task_struct but I don't want to have to >

Re: More on 2.2.18pre2aa2

2000-09-13 Thread Stephen C. Tweedie
Hi, On Tue, Sep 12, 2000 at 04:08:54PM +0200, Andrea Arcangeli wrote: > > >Andrea - latency is time measured and perceived. Doing it time based seems to > >make reasonable sense. I grant you might want to play with the weighting per > > When you have a device that writes a request every two

Re: [ANNOUNCE] Withdrawl of Open Source NDS Project/NTFS/M2FS for Linux

2000-09-07 Thread Stephen C. Tweedie
Hi, On Wed, Sep 06, 2000 at 09:44:54AM -0600, Jeff V. Merkey wrote: > > KDB is a user mode debugger designed to debug user space apps that's > been hacked to run with a driver. Absolutely not true. You're probably thinking about kgdb, the gdb stub for remote kernel source level debugging.

Re: [patch] All the fs patches resulting from updating mark_buffer_dirty

2000-09-04 Thread Stephen C. Tweedie
Hi, On Mon, Sep 04, 2000 at 11:29:56PM +0200, Rasmus Andersen wrote: > > I have changed the interface to mark_buffer_dirty (as per Tigran > Aivazian's suggestion). This impacts a lot of places in the kernel > (trivially), noticeably the file systems. The URL below points a > big patch for all

Two VM problems for the 2.4 TODO list

2000-09-04 Thread Stephen C. Tweedie
Hi Ted, To be fixed for 2.4: 1) Non-atomic pte updates The page aging code and mprotect both modify existing ptes non-atomically. That can stomp on the VM hardware on other CPUs setting the dirty bit on mmaped pages when using threads. 2.2 is vulnerable too. 2) RSS locking

Re: zero-copy TCP

2000-09-04 Thread Stephen C. Tweedie
Hi, On Sun, Sep 03, 2000 at 07:29:56PM +0200, Ingo Molnar wrote: > > On Sun, 3 Sep 2000, Andi Kleen wrote: > > > I did the same for fragment RX some months ago (simple fragment lists > > that were copy-checksummed to user space). Overall it is probably > > better to use a kiovec, because that

Re: thread rant

2000-09-04 Thread Stephen C. Tweedie
Hi, On Sat, Sep 02, 2000 at 09:41:03PM +0200, Ingo Molnar wrote: > > On Sat, 2 Sep 2000, Alexander Viro wrote: > > > unlink() and the last munmap()/exit() will get rid of it... > > yep - and this isnt possible with traditional SysV shared memory, and isnt > possible with traditional SysV

Re: 2T for i386 OT

2000-09-04 Thread Stephen C. Tweedie
Hi, On Sun, Sep 03, 2000 at 11:36:25PM +0200, Andrea Ferraris wrote: > > I used to think that. Im planning on deploying a 1Tb IDE raid using 3ware > > kit for an ftp site very soon. Its very cheap and its very fast. UDMA > with > > one disk per channel and the controller doing some of the work.

Re: Large File support and blocks.

2000-09-04 Thread Stephen C. Tweedie
Hi, On Fri, Sep 01, 2000 at 09:16:23AM -0700, Linda Walsh wrote: > With all the talk about bugs and slowness on a 386/486/586 -- does anyone > think those platforms will have multi-T disks hooked up to them? Yes. They are already doing it, and the number of people trying is growing rapidly.
