Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Benjamin C.R. LaHaise wrote: > Do the math again: for transmitting a single page in a kiobuf only 64 > bytes needs to be initialized. If map_array is moved to the end of > the structure, that's all contiguous data and is a single cacheline. but you are comparing apples to

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Alan Cox
> > We have already shown that the IO-plugging API sucks, I'm afraid. > > it might not be important to others, but we do hold one particular > SPECweb99 world record: on 2-way, 2 GB RAM, testing a load with a full And its real world value is exactly the same as the mindcraft NT values. Don't

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > we do have SLAB [which essentially caches structures, on a per-CPU basis] > > which i did take into account, but still, initializing a 600+ byte kiovec > > is probably more work than the rest of sending a packet! I mean i'd love > > to eliminate

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen Frost
* Ingo Molnar ([EMAIL PROTECTED]) wrote: > > On Tue, 9 Jan 2001, Stephen Frost wrote: > > > Now, the interesting bit here is that the processes can grow to be > > pretty large (200M+, up as high as 500M, higher if we let it ;) ) and what > > happens with MOSIX is that entire processes get

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen Frost wrote: > Now, the interesting bit here is that the processes can grow to be > pretty large (200M+, up as high as 500M, higher if we let it ;) ) and what > happens with MOSIX is that entire processes get sent over the wire to > other machines for work.

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Benjamin C.R. LaHaise
On Tue, 9 Jan 2001, Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > > > please study the networking portions of the zerocopy patch and you'll see > > > why this is not desirable. An alloc_kiovec()/free_kiovec() is exactly the > > > thing we cannot afford in a

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > Jes has also got hard numbers for the performance advantages of > jumbograms on some of the networks he's been using, and you ain't > going to get udp jumbograms through a page-by-page API, ever. i know the performance advantages of jumbograms

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 04:00:34PM +0100, Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > we do have SLAB [which essentially caches structures, on a per-CPU basis] > which i did take into account, but still, initializing a 600+ byte kiovec > is probably more work

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Trond Myklebust
> David S Miller <[EMAIL PROTECTED]> writes: >I would have thought one of the main interests of doing >something like this would be to allow us to speed up large >writes to the socket for ncpfs/knfsd/nfs/smbfs/... > This is what TCP_CORK/MSG_MORE et al. are

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen Frost
* Ingo Molnar ([EMAIL PROTECTED]) wrote: > > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > > but it just doesn't apply when you look at some other applications, > > such as streaming out video data or performing fileserving in a > > high-performance compute cluster where you are serving

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 03:40:56PM +0100, Ingo Molnar wrote: > > i'd love to first see these kinds of applications (under Linux) before > designing for them. Things like Beowulf have been around for a while now, and SGI have been doing that sort of multimedia stuff for ages. I don't think

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > please study the networking portions of the zerocopy patch and you'll see > > why this is not desirable. An alloc_kiovec()/free_kiovec() is exactly the > > thing we cannot afford in a sendfile() operation. sendfile() is > > lightweight, the

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Alan Cox
> designing for them. Eg. if an IO operation (eg. streaming video webcast) > does a DMA from a camera card to an outgoing networking card, would it be Most mpeg2 hardware isnt set up for that kind of use. And webcast protocols like h.263 tend to be software implemented. Capturing raw video

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > i used to think that this is useful, but these days it isnt. It's a waste > > of PCI bandwidth resources, and it's much cheaper to keep a cache in RAM > > instead of doing direct disk=>network DMA *all the time* some resource is > > requested. >

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Alan Cox
> Bad bad bad. We already have SCSI devices optimised for bandwidth > which don't approach decent performance until you are passing them 1MB > IOs, and even in networking the 1.5K packet limit kills us in some Even low end cheap raid cards like the AMI megaraid dearly want 128K writes. Its

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 01:04:49PM +0100, Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Christoph Hellwig wrote: > > please study the networking portions of the zerocopy patch and you'll see > why this is not desirable. An alloc_kiovec()/free_kiovec() is exactly the > thing we cannot afford

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 11:23:41AM +0100, Ingo Molnar wrote: > > > Having proper kiobuf support would make it possible to, for example, > > do zerocopy network->disk data transfers and lots of other things. > > i used to think that this is useful, but these days it isnt. It's a waste > of

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
From: Trond Myklebust <[EMAIL PROTECTED]> Date: 09 Jan 2001 14:52:40 +0100 I don't really want to be chiming in with another 'make it a kiobuf', but given that you already have written 'do_tcp_sendpages()' why did you make sock->ops->sendpage() take the single page as an argument

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Trond Myklebust
> " " == David S Miller <[EMAIL PROTECTED]> writes: > I've put a patch up for testing on the kernel.org mirrors: > /pub/linux/kernel/people/davem/zerocopy-2.4.0-1.diff.gz . > Finally, regardless of networking card, there should be a > measurable performance boost

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Andrew Morton
Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Stephen Landamore wrote: > > > >> Sure. But sendfile is not one of the fundamental UNIX operations... > > > > Neither were eg. kernel-based semaphores. So what? Unix wasnt > > > Ehh, that's not correct. HP-UX was the first to implement sendfile(). >

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Stephen Landamore wrote: > >> Sure. But sendfile is not one of the fundamental UNIX operations... > > Neither were eg. kernel-based semaphores. So what? Unix wasnt > Ehh, that's not correct. HP-UX was the first to implement sendfile(). i dont think we disagree. What i

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen Landamore
Ingo Molnar wrote: > On Tue, 9 Jan 2001, Christoph Hellwig wrote: > >> Sure. But sendfile is not one of the fundamental UNIX operations... > > Neither were eg. kernel-based semaphores. So what? Unix wasnt > perfect and isnt perfect - but it was a (very) good starting > point. If you are arguing

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Mon, 8 Jan 2001, David S. Miller wrote: >All I am asking is that someone lets me know if they make major >changes to my code so I can keep track of whats happening. > > We have not made any major changes to your code, in lieu of this > not being code which is actually being submitted

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Christoph Hellwig wrote: > Sure. But sendfile is not one of the fundamental UNIX operations... Neither were eg. kernel-based semaphores. So what? Unix wasnt perfect and isnt perfect - but it was a (very) good starting point. If you are arguing against the existence or

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 12:28:10 +0100 From: Christoph Hellwig <[EMAIL PROTECTED]> Sure. But sendfile is not one of the fundamental UNIX operations... It's a fundamental Linux interface and VFS-->networking interface. An alloc_kiovec before and an free_kiovec after the actual call

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2001 at 02:31:13AM -0800, David S. Miller wrote: >Date: Tue, 9 Jan 2001 11:31:45 +0100 >From: Christoph Hellwig <[EMAIL PROTECTED]> > >Yuck. A new file_opo just to get a few benchmarks right ... I >hope the writepages stuff will not be merged in Linus tree (but

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Christoph Hellwig wrote: > > 2.4. In any case, the zerocopy code is 'kiovec in spirit' (uses > > vectors of struct page *, offset, size entities), > Yep. That is why I was so worried about the writepages file op. i believe you misunderstand. kiovecs (in their current form)

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 11:31:45 +0100 From: Christoph Hellwig <[EMAIL PROTECTED]> Yuck. A new file_opo just to get a few benchmarks right ... I hope the writepages stuff will not be merged in Linus tree (but I wish the code behind it!) It's a "I know how to send a page somewhere

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2001 at 11:23:41AM +0100, Ingo Molnar wrote: > > On Mon, 8 Jan 2001, Rik van Riel wrote: > > > I really think the zerocopy network stuff should be ported to kiobuf > > proper. > > yep, we talked to Stephen Tweedie about this already, but it involves some > changes in kiovec

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Mon, 8 Jan 2001, Rik van Riel wrote: > I really think the zerocopy network stuff should be ported to kiobuf > proper. yep, we talked to Stephen Tweedie about this already, but it involves some changes in kiovec support and we didnt want to touch too much code for 2.4. In any case, the

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Andrea Arcangeli
On Tue, Jan 09, 2001 at 07:38:28PM +0100, Ingo Molnar wrote: On Tue, 9 Jan 2001, Jens Axboe wrote: ever seen, this is why i quoted it - the talk was about block-IO performance, and Stephen said that our block IO sucks. It used to suck, but in 2.4, with the right patch from Jens,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Jens Axboe
On Tue, Jan 09 2001, Andrea Arcangeli wrote: Thats fine. Get me 128K-512K chunks nicely streaming into my raid controller and I'll be a happy man No problem, apply blk-13B and you'll get 512K chunks for SCSI and RAID. i cannot agree more - Jens' patch did wonders to IO

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Andrea Arcangeli
On Wed, Jan 10, 2001 at 12:34:35AM +0100, Jens Axboe wrote: Ah I see. It would be nice to base the QUEUE_NR_REQUEST on something else than a static number. For example, 3000 per queue translates into 281Kb of request slots per queue. On a typical system with a floppy, hard drive, and CD-ROM

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Manfred Spraul
sct wrote: We've already got measurements showing how insane this is. Raw IO requests, plus internal pagebuf contiguous requests from XFS, have to get broken down into page-sized chunks by the current ll_rw_block() API, only to get reassembled by the make_request code. It's *enormous*

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Chris Evans
On Tue, 9 Jan 2001, Ingo Molnar wrote: This is one of the busiest and most complex block-IO Linux systems i've ever seen, this is why i quoted it - the talk was about block-IO performance, and Stephen said that our block IO sucks. It used to suck, but in 2.4, with the right patch from Jens,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Benjamin C.R. LaHaise wrote: I've already got fully async read and write working via a helper thread ^^^ for doing the bmaps when the page is not uptodate in the page cache. ^^^ thats

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Chris Evans wrote: but in 2.4, with the right patch from Jens, it doesnt suck anymore. ) Is this "right patch from Jens" on the radar for 2.4 inclusion? i do hope so! Ingo

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Dan Hollis
On Tue, 9 Jan 2001, Ingo Molnar wrote: :-) I think sendfile() should also have its logical extensions: receivefile(). I dont know how the HPUX implementation works, but in Linux, right now it's only possible to sendfile() from a file to a socket. The logical extension of this is to allow

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
In article [EMAIL PROTECTED], Christoph Hellwig [EMAIL PROTECTED] wrote: You get that multiple page call with kiobufs for free... No, you don't. kiobufs are crap. Face it. They do NOT allow proper multi-page scatter gather, regardless of what the kiobuf PR department has said. I've

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Dan Hollis
On Wed, 10 Jan 2001, Andrew Morton wrote: y'know our pals have patented it? http://www.delphion.com/details?pn=US05845280__ Bad faith patent? Actionable, treble damages? -Dan

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On 9 Jan 2001, Linus Torvalds wrote: I told David that he can fix the network zero-copy code two ways: either he makes it _truly_ scatter-gather (an array of not just pages, but of proper page-offset-length tuples), or he makes it just a single area and lets the low-level TCP/whatever code

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
On Tue, 9 Jan 2001, Ingo Molnar wrote: So i do believe that the networking code is properly designed in this respect, and this concept goes to the highest level of the networking code. Absolutely. This is why I have no conceptual problems with the networking

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
In article [EMAIL PROTECTED] you wrote: On Tue, 9 Jan 2001, Ingo Molnar wrote: So i do believe that the networking code is properly designed in this respect, and this concept goes to the highest level of the networking code. Absolutely. This is why I have

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
On Tue, 9 Jan 2001, Christoph Hellwig wrote: Also the tuple argument you gave earlier isn't right in this specific case: when doing sendfile from pagecache to an fs, you have a bunch of pages, an offset in the first and a length that makes the data end before last page's end. No. Look

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2001 at 12:55:51PM -0800, Linus Torvalds wrote: On Tue, 9 Jan 2001, Christoph Hellwig wrote: Also the tuple argument you gave earlier isn't right in this specific case: when doing sendfile from pagecache to an fs, you have a bunch of pages, an offset in the first

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
On Tue, 9 Jan 2001, Christoph Hellwig wrote: Look at sendfile(). You do NOT have a "bunch" of pages. Sendfile() is very much a page-at-a-time thing, and expects the actual IO layers to do it's own scatter-gather. So sendfile() doesn't want any array at all: it only wants a

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
In article [EMAIL PROTECTED], Stephen C. Tweedie [EMAIL PROTECTED] wrote: Jes has also got hard numbers for the performance advantages of jumbograms on some of the networks he's been using, and you ain't going to get udp jumbograms through a page-by-page API, ever. Wrong. The only thing you

[patch]: ac4 blk (was Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1)

2001-01-09 Thread Jens Axboe
On Tue, Jan 09 2001, Ingo Molnar wrote: but in 2.4, with the right patch from Jens, it doesnt suck anymore. ) Is this "right patch from Jens" on the radar for 2.4 inclusion? i do hope so! Here's a version against 2.4.0-ac4, blk-13B did not apply cleanly due to moving of i2o files and

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
On Tue, 9 Jan 2001, Benjamin C.R. LaHaise wrote: On Tue, 9 Jan 2001, Linus Torvalds wrote: The _lower-level_ stuff (ie TCP and the drivers) want the "array of tuples", and again, they do NOT want an array of pages, because if somebody does two sendfile() calls that fit in one packet,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Andrea Arcangeli
On Tue, Jan 09, 2001 at 09:10:24PM +0100, Ingo Molnar wrote: On Tue, 9 Jan 2001, Andrea Arcangeli wrote: BTW, I noticed what is left in blk-13B seems to be my work (Jens's fixes for merging when the I/O queue is full are just been integrated in test1x). [...] it was Jens' [i think

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Dave Zarzycki wrote: In user space, how do you know when its safe to reuse the buffer that was handed to sendmsg() with the MSG_NOCOPY flag? Or does sendmsg() with that flag block until the buffer isn't needed by the kernel any more? If it does block, doesn't that defeat

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 14:25:42 + From: "Stephen C. Tweedie" [EMAIL PROTECTED] Perhaps tcp can merge internal 4K requests, but if you're doing udp jumbograms (or STP or VIA), you do need an interface which can give the networking stack more than one page at once. All network

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Jens Axboe
On Tue, Jan 09 2001, Alan Cox wrote: ever seen, this is why i quoted it - the talk was about block-IO performance, and Stephen said that our block IO sucks. It used to suck, but in 2.4, with the right patch from Jens, it doesnt suck anymore. ) Thats fine. Get me 128K-512K chunks nicely

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Benjamin C.R. LaHaise
On Tue, 9 Jan 2001, Ingo Molnar wrote: this is why i meant that *right now* kiobufs are not suited for networking, at least the way we do it. Maybe if kiobufs had the same kind of internal structure as sk_frag (ie. array of (page,offset,size) triples, not array of pages), that would work out

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2001 at 10:38:30AM -0500, Benjamin C.R. LaHaise wrote: What you're completely ignoring is that sendpages is lacking a huge amount of functionality that is *needed*. I can't implement clean async io on top of sendpages -- it'll require keeping 1 task around per outstanding io,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Dan Hollis
On Tue, 9 Jan 2001, Ingo Molnar wrote: On Tue, 9 Jan 2001, Dan Hollis wrote: This is not what sendfile() does, it sends (to a network socket) a file (from the page cache), nothing more. Ok in any case, it would be nice to have a generic sendfile() which works on any fd's - socket or

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 12:30:39PM -0500, Benjamin C.R. LaHaise wrote: On Tue, 9 Jan 2001, Ingo Molnar wrote: this is why i meant that *right now* kiobufs are not suited for networking, at least the way we do it. Maybe if kiobufs had the same kind of internal structure as sk_frag

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Andrea Arcangeli
On Tue, Jan 09, 2001 at 09:12:04PM +0100, Jens Axboe wrote: I haven't heard anything beyond the raised QUEUE_NR_REQUEST, so I'd like to see what you have pending so we can merge :-). The tiotest seek increase was mainly due to the elevator having 3000 requests to juggle and thus being able to

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Jens Axboe
On Wed, Jan 10 2001, Andrea Arcangeli wrote: On Tue, Jan 09, 2001 at 09:12:04PM +0100, Jens Axboe wrote: I haven't heard anything beyond the raised QUEUE_NR_REQUEST, so I'd like to see what you have pending so we can merge :-). The tiotest seek increase was mainly due to the elevator

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 05:16:40PM +0100, Ingo Molnar wrote: On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: i'm talking about kiovecs not kiobufs (because those are equivalent to a fragmented packet - every packet fragment can be anywhere). Initializing a kiovec involves touching a

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 15:17:25 + From: "Stephen C. Tweedie" [EMAIL PROTECTED] Jes has also got hard numbers for the performance advantages of jumbograms on some of the networks he's been using, and you ain't going to get udp jumbograms through a page-by-page API, ever. Again,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Jens Axboe wrote: ever seen, this is why i quoted it - the talk was about block-IO performance, and Stephen said that our block IO sucks. It used to suck, but in 2.4, with the right patch from Jens, it doesnt suck anymore. ) Thats fine. Get me 128K-512K chunks

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread J Sloan
Alan Cox wrote: it might not be important to others, but we do hold one particular SPECweb99 world record: on 2-way, 2 GB RAM, testing a load with a full And its real world value is exactly the same as the mindcraft NT values. Don't forget that. In other words, devastating. jjs

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 17:14:33 -0800 (PST) From: Dave Zarzycki [EMAIL PROTECTED] On Tue, 9 Jan 2001, Ingo Molnar wrote: then you'll love the zerocopy patch :-) Just use sendfile() or specify MSG_NOCOPY to sendmsg(), and you'll see effective memory-to-card

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 16:27:49 +0100 (CET) From: Trond Myklebust [EMAIL PROTECTED] OK, but can you eventually generalize it to non-stream protocols (i.e. UDP)? Sure, this is what MSG_MORE is meant to accommodate. UDP could support it just fine. Later, David S. Miller [EMAIL

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 04:00:34PM +0100, Ingo Molnar wrote: On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: we do have SLAB [which essentially caches structures, on a per-CPU basis] which i did take into account, but still, initializing a 600+ byte kiovec is probably more work than

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2001 at 01:26:44PM -0800, Linus Torvalds wrote: On Tue, 9 Jan 2001, Christoph Hellwig wrote: Look at sendfile(). You do NOT have a "bunch" of pages. Sendfile() is very much a page-at-a-time thing, and expects the actual IO layers to do its own

storage over IP (was Re: [PLEASE-TESTME] Zerocopy networking patch,2.4.0-1)

2001-01-09 Thread dean gaudet
On Tue, 9 Jan 2001, Ingo Molnar wrote: On Mon, 8 Jan 2001, Rik van Riel wrote: Having proper kiobuf support would make it possible to, for example, do zerocopy network-disk data transfers and lots of other things. i used to think that this is useful, but these days it isn't. this seems

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Alan Cox wrote: We have already shown that the IO-plugging API sucks, I'm afraid. it might not be important to others, but we do hold one particular SPECweb99 world record: on 2-way, 2 GB RAM, testing a load with a full And its real world value is exactly the same

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Alan Cox
We have already shown that the IO-plugging API sucks, I'm afraid. it might not be important to others, but we do hold one particular SPECweb99 world record: on 2-way, 2 GB RAM, testing a load with a full And its real world value is exactly the same as the mindcraft NT values. Don't forget

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Linus Torvalds
On Wed, 10 Jan 2001, Christoph Hellwig wrote: Simple. Because I stated before that I DON'T even want the networking to use kiobufs in lower layers. My whole argument is to pass a kiovec into the fileop instead of a page, because it makes sense for other drivers to use multiple pages,

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Ingo Molnar
On Tue, 9 Jan 2001, Andrea Arcangeli wrote: BTW, I noticed what is left in blk-13B seems to be my work (Jens's fixes for merging when the I/O queue is full have just been integrated in test1x). [...] it was Jens' [i think those were implemented by Jens entirely] batch-freeing changes that

Re: storage over IP (was Re: [PLEASE-TESTME] Zerocopy networking patch,2.4.0-1)

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 18:56:33 -0800 (PST) From: dean gaudet [EMAIL PROTECTED] is NFS receive single copy today? With the zerocopy patches, NFS client receive is "single cpu copy" if that's what you mean. Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Alan Cox
ever seen, this is why i quoted it - the talk was about block-IO performance, and Stephen said that our block IO sucks. It used to suck, but in 2.4, with the right patch from Jens, it doesn't suck anymore. ) That's fine. Get me 128K-512K chunks nicely streaming into my raid controller and I'll

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Benjamin C.R. LaHaise
On Tue, 9 Jan 2001, Ingo Molnar wrote: On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: please study the networking portions of the zerocopy patch and you'll see why this is not desirable. An alloc_kiovec()/free_kiovec() is exactly the thing we cannot afford in a sendfile() operation.

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread David S. Miller
Date: Tue, 9 Jan 2001 11:14:05 -0800 (PST) From: Dan Hollis [EMAIL PROTECTED] Just extend sendfile to allow any fd to any fd. sendfile already does file-socket and file-file. It only needs to be extended to do socket-file. This is not what sendfile() does, it sends (to a

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Dan Hollis
On Tue, 9 Jan 2001, David S. Miller wrote: Just extend sendfile to allow any fd to any fd. sendfile already does file-socket and file-file. It only needs to be extended to do socket-file. This is not what sendfile() does, it sends (to a network socket) a file (from the page cache),
