Re: scsi vs ide performance on fsync's

2001-03-07 Thread Stephen C. Tweedie
Hi, On Wed, Mar 07, 2001 at 09:15:36PM +0100, Jens Axboe wrote: On Wed, Mar 07 2001, Stephen C. Tweedie wrote: For most fs'es, that's not an issue. The fs won't start writeback on the primary disk at all until the journal commit has been acknowledged as firm on disk. But do you
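
The ordering rule Stephen describes can be sketched as pseudo-flow in C. This is an illustration with hypothetical helper names, not the actual ext3/jbd code, and it will not link as-is:

struct txn;                                 /* one journal transaction */
void submit_journal_blocks(struct txn *t);  /* hypothetical helpers */
void wait_for_journal_io(struct txn *t);
void submit_commit_block(struct txn *t);
void wait_for_commit_io(struct txn *t);
void writeback_to_home_locations(struct txn *t);

void commit_then_checkpoint(struct txn *t)
{
    submit_journal_blocks(t);           /* journal copies of the data first */
    wait_for_journal_io(t);
    submit_commit_block(t);
    wait_for_commit_io(t);              /* commit acknowledged as firm... */
    writeback_to_home_locations(t);     /* ...only now touch the primary disk */
}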

Raw IO fixes for 2.4.2-ac8

2001-03-02 Thread Stephen C. Tweedie
Hi, I've just uploaded the current raw IO fixes as kiobuf-2.4.2-ac8-A0.tar.gz on ftp.uk.linux.org:/pub/linux/sct/fs/raw-io/ and ftp.*.kernel.org:/pub/linux/kernel/people/sct/raw-io/ This includes: 00-movecode.diff: move kiobuf code from mm/memory.c to fs/iobuf.c

Re: [patch] set kiobuf io_count once, instead of increment

2001-03-02 Thread Stephen C. Tweedie
Hi, On Wed, Feb 28, 2001 at 09:18:59AM -0800, Robert Read wrote: > On Tue, Feb 27, 2001 at 10:50:54PM -0300, Marcelo Tosatti wrote: > This is true, but it looks like the brw_kiovec allocation failure > handling is broken already; it's calling __put_unused_buffer_head on > bhs without waiting

Re: [patch] set kiobuf io_count once, instead of increment

2001-03-02 Thread Stephen C. Tweedie
On Tue, Feb 27, 2001 at 04:22:22PM -0800, Robert Read wrote: > Currently in brw_kiovec, iobuf->io_count is being incremented as each > bh is submitted, and decremented in the bh->b_end_io(). This means > io_count can go to zero before all the bhs have been submitted, > especially during a large
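
The race and the fix in miniature, as a userspace analogue with C11 atomics (names are illustrative, not the 2.4 brw_kiovec sources). Incrementing per-bh lets io_count touch zero mid-submission; taking the whole count up front closes the window:

#include <stdatomic.h>

struct kio_like {
    atomic_int io_count;                  /* outstanding sub-IOs */
    void (*end_io)(struct kio_like *);    /* final completion callback */
};

/* Completion side: called once per finished sub-IO. */
static void sub_io_done(struct kio_like *io)
{
    if (atomic_fetch_sub(&io->io_count, 1) == 1)
        io->end_io(io);                   /* last sub-IO fires completion */
}

/* Submission side: account for all nr sub-IOs before any of them can
 * complete, so io_count cannot transiently reach zero during a large
 * submission. */
static void submit_all(struct kio_like *io, int nr, void (*submit_one)(int))
{
    atomic_store(&io->io_count, nr);      /* set once, up front */
    for (int i = 0; i < nr; i++)
        submit_one(i);
}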

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Stephen C. Tweedie
Hi, On Thu, Mar 01, 2001 at 11:08:13AM -0500, Ben LaHaise wrote: > On Thu, 1 Mar 2001, Stephen C. Tweedie wrote: > > Actually, how about making it a sysctl? That's probably the most > reasonable approach for now since the optimal size depends on hardware. Fine with me

Re: Writing on raw device with software RAID 0 is slow

2001-03-01 Thread Stephen C. Tweedie
Hi, On Thu, Mar 01, 2001 at 10:44:38AM -0500, Ben LaHaise wrote: > > On Thu, 1 Mar 2001, Stephen C. Tweedie wrote: > > > Raw IO is always synchronous: it gets flushed to disk before the write > > returns. You don't get any write-behind with raw IO, so the smaller > the blocksize you write

Re: ext3 fsck question

2001-03-01 Thread Stephen C. Tweedie
Hi, On Wed, Feb 28, 2001 at 08:03:21PM -0600, Neal Gieselman wrote: > > I applied the libs and other utilities from e2fsprogs by hand. > I ran fsck.ext3 on my secondary partition and it ran fine. The boot fsck > on / was complaining about something but I could not catch it. > I then went single user

Re: [PATCH] guard mm->rss with page_table_lock (241p11)

2001-02-13 Thread Stephen C. Tweedie
Hi, On Mon, Feb 12, 2001 at 07:15:57PM -0800, george anzinger wrote: > Excuse me if I am off base here, but wouldn't an atomic operation be > better here. There are atomic inc/dec and add/sub macros for this. It > just seems that that is all that is needed here (from inspection of the > patch).
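
What the atomic alternative looks like in miniature (userspace analogue with C11 atomics, not the 2.4 mm code): the counter itself becomes atomic, so its updates no longer need page_table_lock:

#include <stdatomic.h>

struct mm_like {
    atomic_long rss;                      /* resident page count */
};

static inline void rss_add(struct mm_like *mm, long pages)
{
    atomic_fetch_add(&mm->rss, pages);    /* no lock around the update */
}

static inline void rss_sub(struct mm_like *mm, long pages)
{
    atomic_fetch_sub(&mm->rss, pages);
}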

Re: ext2: block > big ?

2001-02-12 Thread Stephen C. Tweedie
Hi, On Sun, Feb 11, 2001 at 05:44:02PM -0700, Brian Grossman wrote: > > What does a message like 'ext2: block > big' indicate? An attempt was made to access a block beyond the legal max size for an ext2 file. That probably implies a corrupt inode, because the ext2 file write code checks for

Re: Raw devices bound to RAID arrays ?

2001-02-11 Thread Stephen C. Tweedie
Hi, On Sun, Feb 11, 2001 at 06:29:12PM +0200, Petru Paler wrote: > > Is it possible to bind a raw device to a software RAID 1 array ? Yes. --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-08 Thread Stephen C. Tweedie
Hi, On Thu, Feb 08, 2001 at 03:52:35PM +0100, Mikulas Patocka wrote: > > > How do you write high-performance ftp server without threads if select > > on regular file always returns "ready"? > > No, it's not really possible on Linux. Use SYS$QIO call on VMS :-) Ahh, but even VMS SYS$QIO is

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-08 Thread Stephen C. Tweedie
Hi, On Thu, Feb 08, 2001 at 12:15:13AM +0100, Pavel Machek wrote: > > > EAGAIN is _not_ a valid return value for block devices or for regular > > files. And in fact it _cannot_ be, because select() is defined to always > > return 1 on them - so if a write() were to return EAGAIN, user space
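
The semantics being invoked are easy to see from userspace: select() reports a regular file as ready immediately, which is why read()/write() on one has no occasion to return EAGAIN (error handling trimmed; any regular file will do):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/select.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY | O_NONBLOCK);
    fd_set rd;
    struct timeval tv = { 5, 0 };

    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    /* Returns 1 at once: regular files always select as ready. */
    int n = select(fd + 1, &rd, NULL, NULL, &tv);
    printf("select() = %d\n", n);
    close(fd);
    return 0;
}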

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Wed, Feb 07, 2001 at 12:12:44PM -0700, Richard Gooch wrote: > Stephen C. Tweedie writes: > > > > Sorry? I'm not sure where communication is breaking down here, but > > we really don't seem to be talking about the same things. SGI's > > kiobuf request patches already let us pass a large IO

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:37:41PM -0800, Linus Torvalds wrote: > > > However, I really _do_ want to have the page cache have a bigger > granularity than the smallest memory mapping size, and there are always > special cases that might be able to generate IO in bigger chunks (ie > in-kernel

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-07 Thread Stephen C. Tweedie
Hi, On Wed, Feb 07, 2001 at 09:10:32AM +0000, David Howells wrote: > > I presume that correct_size will always be a power of 2... Yes. --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at
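
The power-of-2 assumption is what makes the usual constant-time alignment test valid, since a power of two has exactly one bit set:

static inline int is_power_of_two(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;  /* x & (x-1) clears the lowest set bit */
}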

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 04:50:19PM -0800, Linus Torvalds wrote: > > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > > > That gets us from 512-byte blocks to 4k, but no more (ll_rw_block > > enforces a single blocksize on all requests but that relaxing that requirement is no big deal
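
For reference, the 2.4-era interface under discussion had this shape (simplified sketch from memory of that era's sources; not compilable on its own):

struct buffer_head;   /* full definition lives in <linux/fs.h> */

/* All nr buffers in one call were prepared with the same b_size,
 * and each had to sit on disk at a multiple of that size. */
void ll_rw_block(int rw, int nr, struct buffer_head *bhs[]);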

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 04:41:21PM -0800, Linus Torvalds wrote: > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > No, it is a problem of the ll_rw_block interface: buffer_heads need to > > be aligned on disk at a multiple of their buffer size. > > E

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 07:25:19PM -0500, Ingo Molnar wrote: > > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote: > > > No, it is a problem of the ll_rw_block interface: buffer_heads need to > > be aligned on disk at a multiple of their buffer size. Under the Unix raw IO interface it is perfectly

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 08:57:13PM +0100, Ingo Molnar wrote: > > [overhead of 512-byte bhs in the raw IO code is an artificial problem of > the raw IO code.] No, it is a problem of the ll_rw_block interface: buffer_heads need to be aligned on disk at a multiple of their buffer size. Under

Re: sync & asyck i/o

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 11:25:00AM -0800, Andre Hedrick wrote: > On Tue, 6 Feb 2001, Stephen C. Tweedie wrote: > > No, we simply omit to instruct them to enable write-back caching. > > Linux assumes that the WCE (write cache enable) bit in a disk's > > caching mode page is zero. You can

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:22:58PM +0100, Christoph Hellwig wrote: > On Tue, Feb 06, 2001 at 05:05:06PM +0000, Stephen C. Tweedie wrote: > > The whole point of the post was that it is merging, not splitting, > > which is troublesome. How are you going to merge requests without having chains

Re: sync & asyck i/o

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 02:52:40PM +0000, Alan Cox wrote: > > According to the man page for fsync it copies in-core data to disk > > prior to its return. Does that take async i/o to the media in account? > > I.e. does it wait for completion of the async i/o to the disk? > > Undefined. >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-06 Thread Stephen C. Tweedie
Hi, On Tue, Feb 06, 2001 at 06:00:58PM +0100, Christoph Hellwig wrote: > On Tue, Feb 06, 2001 at 12:07:04AM +0000, Stephen C. Tweedie wrote: > > > > Is that a realistic basis for a cleaned-up ll_rw_blk.c? > I don't think so. If we minimize the state in the IO containe

Re: rawio usage

2001-02-06 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 10:36:32PM -0800, Mayank Vasa wrote: > > When I run this program as root, I get the error "write: Invalid argument". Raw IO requires that the buffers are aligned on a 512-byte boundary in memory. --Stephen - To unsubscribe from this list: send the line "unsubscribe
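
A minimal sketch of a write that satisfies the constraint (the device path is illustrative; an unaligned buffer here is exactly what produces EINVAL, i.e. "write: Invalid argument"):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    void *buf;

    /* Buffer start must be 512-byte aligned; transfer length and file
     * offset must be multiples of 512 as well. */
    if (posix_memalign(&buf, 512, 4096)) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0, 4096);

    int fd = open("/dev/raw/raw1", O_WRONLY);   /* illustrative device */
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, buf, 4096) < 0)               /* 4096 = 8 x 512 */
        perror("write");
    close(fd);
    free(buf);
    return 0;
}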

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-05 Thread Stephen C. Tweedie
Hi, OK, if we take a step back what does this look like: On Mon, Feb 05, 2001 at 08:54:29PM +0000, Stephen C. Tweedie wrote: > > If we are doing readahead, we want completion callbacks raised as soon > as possible on IO completions, no matter how many other IOs have been merged with the current

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 11:06:48PM +0000, Alan Cox wrote: > > do you then tell the application _above_ raid0 if one of the > > underlying IOs succeeds and the other fails halfway through? > > struct > { > u32 flags; /* because everything needs flags */ > struct io_completion
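
The archive cuts Alan's struct off mid-definition. Purely as an illustration of the idea, not his actual proposal, a per-fragment completion record that would let a raid0 split report "first half succeeded, second half failed" might look like:

#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

struct io_completion {
    u32 flags;       /* because everything needs flags */
    int status;      /* 0 on success, -errno for this fragment */
    u64 resid;       /* bytes of this fragment left untransferred */
};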

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 10:28:37PM +0100, Ingo Molnar wrote: > > On Mon, 5 Feb 2001, Stephen C. Tweedie wrote: > > it's exactly these 'compound' structures i'm vehemently against. I do > think it's a design nightmare. I can picture these monster kiobufs > complicating the whole code for no good

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 11:28:17AM -0800, Linus Torvalds wrote: > The _vectors_ are needed at the very lowest levels: the levels that do not > necessarily have to worry at all about completion notification etc. You > want the arbitrary scatter-gather vectors passed down to the stuff that >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 08:36:31AM -0800, Linus Torvalds wrote: > Have you ever thought about other things, like networking, special > devices, stuff like that? They can (and do) have packet boundaries that > have nothing to do with pages what-so-ever. They can have such notions as >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 05:29:47PM +0000, Alan Cox wrote: > > > > _All_ drivers would have to do that in the degenerate case, because > > none of our drivers can deal with a dma boundary in the middle of a > > sector, and even in those places where the hardware supports it in > > theory, you are

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 03:19:09PM +0000, Alan Cox wrote: > > Yes, it's the sort of thing that you would hope should work, but in > > practice it's not reliable. > > So the less smart devices need to call something like > > kiovec_align(kiovec, 512); > > and have it do the bounce buffers ?

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 01:00:51PM +0100, Manfred Spraul wrote: > "Stephen C. Tweedie" wrote: > > > > You simply cannot do physical disk IO on > > non-sector-aligned memory or in chunks which aren't a multiple of > > sector size. > > Why not? Obviously the disk access itself must be sect

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Mon, Feb 05, 2001 at 08:01:45PM +0530, [EMAIL PROTECTED] wrote: > > >It's the very essence of readahead that we wake up the earlier buffers > >as soon as they become available, without waiting for the later ones > >to complete, so we _need_ this multiple completion concept. > > I can

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Fri, Feb 02, 2001 at 01:02:28PM +0100, Christoph Hellwig wrote: > > > I may still be persuaded that we need the full scatter-gather list > > fields throughout, but for now I tend to think that, at least in the > > disk layers, we may get cleaner results by allow linked lists of > >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Sun, Feb 04, 2001 at 06:54:58PM +0530, [EMAIL PROTECTED] wrote: > > Can't we define a kiobuf structure as just this ? A combination of a > frag_list and a page_list ? Then all code which needs to accept an arbitrary kiobuf needs to be able to parse both --- ugh. > BTW, We could have a

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-05 Thread Stephen C. Tweedie
Hi, On Sat, Feb 03, 2001 at 12:28:47PM -0800, Linus Torvalds wrote: > > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > > Neither the read nor the write are page-aligned. I don't know where you > got that idea. It's obviously not true even in the common case: it depends _entirely_ on what

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-02 Thread Stephen C. Tweedie
Hi, On Fri, Feb 02, 2001 at 12:51:35PM +0100, Christoph Hellwig wrote: > > > > If I have a page vector with a single offset/length pair, I can build > > a new header with the same vector and modified offset/length to split > > the vector in two without copying it. > > You just say in the
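
The zero-copy split in schematic form (illustrative structure, not the actual kiobuf definition): both halves alias one shared page vector and differ only in their offset/length pair:

#include <stddef.h>

struct page;                      /* opaque for this sketch */

struct kio_view {
    struct page **pages;          /* shared vector, refcounted elsewhere */
    int nr_pages;
    size_t offset;                /* byte offset into the mapped run */
    size_t length;                /* bytes covered by this header */
};

/* Split src at 'split' bytes; no page pointers are copied or moved. */
static void kio_split(const struct kio_view *src, size_t split,
                      struct kio_view *a, struct kio_view *b)
{
    *a = *src;
    a->length = split;
    *b = *src;
    b->offset = src->offset + split;
    b->length = src->length - split;
}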

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote: > I think you want the whole kio concept only for disk-like IO. No. I want something good for zero-copy IO in general, but a lot of that concerns the problem of interacting with the user, and the basic center of that

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:33:27PM +0100, Christoph Hellwig wrote: > > > On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote: > > In the disk IO case, you basically don't get that (the only thing > > which comes close is raid5 parity blocks). The data which the user > started with is the

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 09:46:27PM +0100, Christoph Hellwig wrote: > > Right now we can take a kiobuf and turn it into a bunch of > > buffer_heads for IO. The io_count lets us track all of those sub-IOs > > so that we know when all submitted IO has completed, so that we can > > pass the

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 07:14:03PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 05:41:20PM +0000, Stephen C. Tweedie wrote: > > > > > > We can't allocate a huge kiobuf structure just for requesting one page of > > > IO. It might get better with VM-level IO clustering though

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 06:49:50PM +0100, Christoph Hellwig wrote: > > > Adding tons of base/limit pairs to kiobufs makes it worse not better > > For disk I/O it makes the handling a little easier for the cost of the > additional offset/length fields. Umm, actually, no, it makes it much

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote: > > > > I don't see any real advantage for disk IO. The real advantage is that > > we can have a generic structure that is also useful in e.g. networking > > and can lead to a unified IO buffering scheme (a little like IO-Lite). >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 06:05:15PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 04:16:15PM +0000, Stephen C. Tweedie wrote: > > > > > > No, and with the current kiobufs it would not make sense, because they > > > are to heavy-weight. > > Really? In what way? We can't allocate

Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 02:45:04PM -0200, Rik van Riel wrote: > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > But only when the extra pages we're reading in don't > displace useful data from memory, making us fault in > those other pages ... causing us to go to the disk again and do more

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 04:09:53PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote: > > > > That would require the vfs interfaces themselves (address space > > readpage/writepage ops) to take kiobufs as arguments, instead of struct > >

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 10:08:45AM -0600, Steve Lord wrote: > Christoph Hellwig wrote: > > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote: > > > > > > That would require the vfs interfaces themselves (address space > > > readpage/writepage ops) to take kiobufs as

Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 08:53:33AM -0200, Marcelo Tosatti wrote: > > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > If we're under free memory shortage, "unlucky" readaheads will be harmful. I know, it's a balancing act. But given that even one successful readahead per read

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 01:28:33PM +0530, [EMAIL PROTECTED] wrote: > > Here's a second pass attempt, based on Ben's wait queue extensions: > Does this sound any better ? It's a mechanism, all right, but you haven't described what problems it is trying to solve, and where it is likely to be

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote: > > >We _do_ need the ability to stack completion events, but as far as the > >kiobuf work goes, my current thoughts are to do that by stacking > >lightweight "clone" kiobufs. > > Would that work with stackable filesystems

Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 04:24:24PM -0800, David Gould wrote: > > I am skeptical of the argument that we can win by replacing "the least > desirable" pages with pages were even less desirable and that we have > no recent indication of any need for. It seems possible under heavy swap > to discard

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-01-31 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 07:28:01PM +0530, [EMAIL PROTECTED] wrote: > > Do the following modifications to your wait queue extension sound > reasonable ? > > 1. Change add_wait_queue to add elements to the end of queue (fifo, by > default) and instead have an add_wait_queue_lifo() routine
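
The fifo/lifo distinction in concrete terms (minimal circular-list sketch with a sentinel head; not the 2.4 wait-queue implementation): adding at the tail wakes waiters oldest-first, adding at the head wakes the most recent waiter first:

struct wq_entry { struct wq_entry *next, *prev; };

/* FIFO: insert before the sentinel, i.e. at the tail. */
static void add_wait_fifo(struct wq_entry *head, struct wq_entry *e)
{
    e->next = head;
    e->prev = head->prev;
    head->prev->next = e;
    head->prev = e;
}

/* LIFO: insert right after the sentinel, i.e. at the head. */
static void add_wait_lifo(struct wq_entry *head, struct wq_entry *e)
{
    e->next = head->next;
    e->prev = head;
    head->next->prev = e;
    head->next = e;
}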
