Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread James Sutherland

On Fri, 2 Feb 2001, David Lang wrote:

> Thanks, that info on sendfile makes sense for the fileserver situation.
> For webservers we will have to see (many/most CGIs look at stuff from the
> client, so I still have doubts as to how much use caching will be)

CGI performance isn't directly affected by this - the whole point is to
reduce the "cost" of handling static requests to zero (or at least as
close as possible), leaving as much CPU as possible for the CGI to use.

So sendfile won't help your CGI directly - it will just give your CGI more
resources to work with.


James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David S. Miller


David Lang writes:
 > Right, assuming that there is enough sendfile() benefit to overcome the
 > write() penalty from the stuff that can't be cached or sent from a file.
 > 
 > My question was basically: are there enough places where sendfile would
 > actually be used to make it a net gain?

There are non-performance issues as well (really, all of these points
have been mentioned in this thread btw). One is that since paged
SKBs use only single-order page allocations, the memory allocation
subsystem is stressed less than under the current scheme, in which SLAB
allocates multi-order pages to satisfy allocations of linear SKB data
buffers.

This has consequences and benefits system wide.

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Jeff Barrow


Let's see... all the work being done for clustering would definitely
benefit... all the static images on your webserver, and static content
(images, ActiveX controls, Java apps, sound clips...) makes up most of
the bandwidth from web servers... NFS servers, Samba servers (both of
which are used more than you may think)... email servers...

Once Real Networks patches their RealServer to use sendfile (which
shouldn't be all that hard), that would help too.

I think that sendfile can be used in a LOT of applications, and the only
ones that wouldn't benefit are mostly low-bandwidth anyway (CGI apps
almost always return either a small HTML file or a small image file;
then there's telnet and other interactive utilities...).

Most applications that use a lot of bandwidth (and thus a lot of CPU time
sending the packets) are capable of being patched to use sendfile.


On Fri, 2 Feb 2001, David Lang wrote:

> Right, assuming that there is enough sendfile() benefit to overcome the
> write() penalty from the stuff that can't be cached or sent from a file.
> 
> My question was basically: are there enough places where sendfile would
> actually be used to make it a net gain?
> 
> David Lang
> 
> On Fri, 2 Feb 2001, David S. Miller wrote:
> 
> > Date: Fri, 2 Feb 2001 15:09:13 -0800 (PST)
> > From: David S. Miller <[EMAIL PROTECTED]>
> > To: David Lang <[EMAIL PROTECTED]>
> > Cc: Andrew Morton <[EMAIL PROTECTED]>, lkml <[EMAIL PROTECTED]>,
> >  "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)
> >
> >
> > David Lang writes:
> >  > Thanks, that info on sendfile makes sense for the fileserver situation.
> >  > For webservers we will have to see (many/most CGIs look at stuff from the
> >  > client, so I still have doubts as to how much use caching will be)
> >
> > Also note that the decreased CPU utilization resulting from
> > zerocopy sendfile leaves more CPU available for CGI execution.
> >
> > This was a point I forgot to make.
> >
> > Later,
> > David S. Miller
> > [EMAIL PROTECTED]
> >
> 




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

Right, assuming that there is enough sendfile() benefit to overcome the
write() penalty from the stuff that can't be cached or sent from a file.

My question was basically: are there enough places where sendfile would
actually be used to make it a net gain?

David Lang

On Fri, 2 Feb 2001, David S. Miller wrote:

> Date: Fri, 2 Feb 2001 15:09:13 -0800 (PST)
> From: David S. Miller <[EMAIL PROTECTED]>
> To: David Lang <[EMAIL PROTECTED]>
> Cc: Andrew Morton <[EMAIL PROTECTED]>, lkml <[EMAIL PROTECTED]>,
>  "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)
>
>
> David Lang writes:
>  > Thanks, that info on sendfile makes sense for the fileserver situation.
>  > For webservers we will have to see (many/most CGIs look at stuff from the
>  > client, so I still have doubts as to how much use caching will be)
>
> Also note that the decreased CPU utilization resulting from
> zerocopy sendfile leaves more CPU available for CGI execution.
>
> This was a point I forgot to make.
>
> Later,
> David S. Miller
> [EMAIL PROTECTED]
>



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David S. Miller


David Lang writes:
 > Thanks, that info on sendfile makes sense for the fileserver situation.
 > For webservers we will have to see (many/most CGIs look at stuff from the
 > client, so I still have doubts as to how much use caching will be)

Also note that the decreased CPU utilization resulting from
zerocopy sendfile leaves more CPU available for CGI execution.

This was a point I forgot to make.

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

Thanks, that info on sendfile makes sense for the fileserver situation.
For webservers we will have to see (many/most CGIs look at stuff from the
client, so I still have doubts as to how much use caching will be).

David Lang

On Fri, 2 Feb 2001, David S. Miller wrote:

> Date: Fri, 2 Feb 2001 14:46:07 -0800 (PST)
> From: David S. Miller <[EMAIL PROTECTED]>
> To: David Lang <[EMAIL PROTECTED]>
> Cc: Andrew Morton <[EMAIL PROTECTED]>, lkml <[EMAIL PROTECTED]>,
>  "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)
>
>
> David Lang writes:
>  > 1a. For webservers that serve static content (and can therefore use
>  > sendfile) I don't see this as significant, because as your tests have been
>  > showing, even a modest machine can saturate your network (unless you are
>  > using gigE, at which point it takes a slightly larger machine).
>
> Start using more than one interface, then it begins to become
> interesting.
>
>  > 1b. Webservers that are not primarily serving static content have to
>  > use write() for the output from CGIs, etc., and therefore pay the
>  > performance penalty without being able to use sendfile() much to get the
>  > advantages. These machines are the ones that really need the performance,
>  > as the CGIs take a significant amount of your CPU.
>
> CGIs can be cached, btw, if the implementation is clever (e.g. the CGI
> tells the web server that if the file used as input to the CGI does
> not change then the output from the CGI will not change, meaning CGI
> output is based solely on input, and therefore CGI output can be cached
> by the web server).
>
>  > 2. For other fileservers, sendfile() sounds like it would be useful if the
>  > client is reading the entire file, but what about the cases where the
>  > client is reading part of the file, or is writing to the file? In both of
>  > these cases it seems that the fileserver is back to the write() penalty.
>  > Does anyone have stats on the types of requests that fileservers are
>  > being asked for?
>
> It helps no matter what part of the file the client reads.
>
> sendfile() can be used on an arbitrary offset+len portion of
> a file; it is not limited to just sending an entire file.
>
> Later,
> David S. Miller
> [EMAIL PROTECTED]
>



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David S. Miller


David Lang writes:
 > 1a. For webservers that serve static content (and can therefore use
 > sendfile) I don't see this as significant, because as your tests have been
 > showing, even a modest machine can saturate your network (unless you are
 > using gigE, at which point it takes a slightly larger machine).

Start using more than one interface, then it begins to become
interesting.

 > 1b. Webservers that are not primarily serving static content have to
 > use write() for the output from CGIs, etc., and therefore pay the
 > performance penalty without being able to use sendfile() much to get the
 > advantages. These machines are the ones that really need the performance,
 > as the CGIs take a significant amount of your CPU.

CGIs can be cached, btw, if the implementation is clever (e.g. the CGI
tells the web server that if the file used as input to the CGI does
not change then the output from the CGI will not change, meaning CGI
output is based solely on input, and therefore CGI output can be cached
by the web server).
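That scheme can be sketched in a few lines (a hypothetical illustration,
not code from any actual web server): cache the CGI output keyed on the
input file's modification time, and re-run the CGI only when the input
changes.

```python
import os

# Hypothetical sketch of the caching scheme described above: if CGI
# output depends only on one input file, the server can cache the
# output until that file's mtime changes.
_cache = {}  # path -> (mtime, output)

def cached_cgi(path, run_cgi):
    """Return CGI output for `path`, re-running `run_cgi` only when
    the input file has changed since it was last cached."""
    mtime = os.stat(path).st_mtime
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]               # input unchanged: serve from cache
    output = run_cgi(path)          # first request, or input changed
    _cache[path] = (mtime, output)
    return output
```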

 > 2. For other fileservers, sendfile() sounds like it would be useful if the
 > client is reading the entire file, but what about the cases where the
 > client is reading part of the file, or is writing to the file? In both of
 > these cases it seems that the fileserver is back to the write() penalty.
 > Does anyone have stats on the types of requests that fileservers are
 > being asked for?

It helps no matter what part of the file the client reads.

sendfile() can be used on an arbitrary offset+len portion of
a file; it is not limited to just sending an entire file.
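As a rough illustration of the byte-range form (shown via Python's
os.sendfile wrapper for brevity; the underlying syscall takes the same
offset and count arguments):

```python
import os, socket

# Sketch of the point above: sendfile() takes an explicit offset and
# count, so a server can push any byte range of a file, not just the
# whole thing.
def send_range(sock, path, offset, count):
    """Send `count` bytes of `path`, starting at `offset`, over `sock`."""
    with open(path, "rb") as f:
        sent = 0
        while sent < count:
            n = os.sendfile(sock.fileno(), f.fileno(),
                            offset + sent, count - sent)
            if n == 0:          # hit end of file before `count` bytes
                break
            sent += n
        return sent
```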

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

I have been watching this thread with interest for a while now, but am
wondering about the real-world use of this, given the performance penalty
for write().

As I see it there are two basic cases you are saying this will help in.

1. webservers

2. other fileservers

I also freely admit that I don't know a lot about sendfile() so it may
have some capability that makes my concerns meaningless, if so please let
me know.

1a. For webservers that serve static content (and can therefore use
sendfile) I don't see this as significant, because as your tests have been
showing, even a modest machine can saturate your network (unless you are
using gigE, at which point it takes a slightly larger machine).

1b. Webservers that are not primarily serving static content have to
use write() for the output from CGIs, etc., and therefore pay the
performance penalty without being able to use sendfile() much to get the
advantages. These machines are the ones that really need the performance,
as the CGIs take a significant amount of your CPU.

2. For other fileservers, sendfile() sounds like it would be useful if the
client is reading the entire file, but what about the cases where the
client is reading part of the file, or is writing to the file? In both of
these cases it seems that the fileserver is back to the write() penalty.
Does anyone have stats on the types of requests that fileservers are
being asked for?

David Lang



 On Fri, 2 Feb 2001, Andrew Morton wrote:

> Date: Fri, 02 Feb 2001 21:12:50 +1100
> From: Andrew Morton <[EMAIL PROTECTED]>
> To: David S. Miller <[EMAIL PROTECTED]>
> Cc: lkml <[EMAIL PROTECTED]>,
>  "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)
>
> "David S. Miller" wrote:
> >
> > ...
> > Finally, please do some tests on loopback.  It is usually a great
> > way to get "pure software overhead" measurements of our TCP stack.
>
> Here we are.  TCP and NFS/UDP over lo.
>
> Machine is a dual-PII.  I didn't bother running CPU utilisation
> testing while benchmarking loopback, although this may be of
> some interest for SMP.  I just looked at the throughput.
>
> Machine is a dual 500MHz PII (again).  Memory read bandwidth
> is 320 meg/sec.  Write b/w is 130 meg/sec.  The working set
> is 60 ~300k files, everything cached. We run the following
> tests:
>
> 1: sendfile() to localhost, sender and receiver pinned to
>separate CPUs
>
> 2: sendfile() to localhost, sender and receiver pinned to
>the same CPU
>
> 3: sendfile() to localhost, no explicit pinning.
>
> 4, 5, 6: same as above, except we use send() in 8kbyte
>chunks.
>
> Repeat with and without zerocopy patch 2.4.1-2.
>
> The receiver reads 64k hunks and throws them away. sendfile()
> sends the entire file.
>
> Also, do an NFS mount of localhost, rsize=wsize=8192, see how
> long it takes to `cp' a 100 meg file from the "server" to
> /dev/null.  The file is cached on the "server".  Do this for
> the three pinning cases as well - all the NFS kernel processes
> were pinned as a group and `cp' was the other group.
>
>
>                  sendfile()   send(8k)   NFS
>                   kbyte/s     kbyte/s    kbyte/s
>
> No explicit bonding
>   2.4.1:          66600       7          25600
>   2.4.1-zc:       208000      69000      25000
>
> Bond client and server to separate CPUs
>   2.4.1:          66700       68000      27800
>   2.4.1-zc:       213047      66000      25700
>
> Bond client and server to same CPU:
>   2.4.1:          56000       57000      23300
>   2.4.1-zc:       176000      55000      22100
>
>
>
> Much the same story.  Big increase in sendfile() efficiency,
> small drop in send() and NFS unchanged.
>
> The relative increase in sendfile() efficiency is much higher
> than with a real NIC, presumably because we've factored out
> the constant (and large) cost of the device driver.
>
> All the bits and pieces to reproduce this are at
>
>   http://www.uow.edu.au/~andrewm/linux/#zc
>



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Trond Myklebust

> " " == Andrew Morton <[EMAIL PROTECTED]> writes:


 > Much the same story.  Big increase in sendfile() efficiency,
 > small drop in send() and NFS unchanged.

This is normal. The server doesn't do zero copy reads, but instead
copies from the page cache into an NFS-specific buffer using
file.f_op->read(). Alexey and Dave's changes are therefore unlikely to
register on NFS performance (other than on CPU use as has been
mentioned before) until we implement a sendfile-like scheme for knfsd
over TCP.
I've been wanting to start doing that (and also to finish the client
conversion to use the TCP zero-copy), but I'm pretty pressed for time
at the moment.

Cheers,
  Trond



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Andrew Morton

"David S. Miller" wrote:
> 
> ...
> Finally, please do some tests on loopback.  It is usually a great
> way to get "pure software overhead" measurements of our TCP stack.

Here we are.  TCP and NFS/UDP over lo.

Machine is a dual-PII.  I didn't bother running CPU utilisation
testing while benchmarking loopback, although this may be of
some interest for SMP.  I just looked at the throughput.

Machine is a dual 500MHz PII (again).  Memory read bandwidth
is 320 meg/sec.  Write b/w is 130 meg/sec.  The working set
is 60 ~300k files, everything cached. We run the following
tests:

1: sendfile() to localhost, sender and receiver pinned to
   separate CPUs

2: sendfile() to localhost, sender and receiver pinned to
   the same CPU

3: sendfile() to localhost, no explicit pinning.

4, 5, 6: same as above, except we use send() in 8kbyte
   chunks.

Repeat with and without zerocopy patch 2.4.1-2.

The receiver reads 64k hunks and throws them away. sendfile()
sends the entire file.
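The sender/receiver pair described above can be sketched roughly as
follows (an illustrative reconstruction over a socketpair, not the actual
benchmark code):

```python
import os, socket, threading

CHUNK = 64 * 1024   # the receiver reads 64k hunks and discards them

def discard_receiver(sock, result):
    """Read and throw away everything until the peer shuts down."""
    total = 0
    while True:
        data = sock.recv(CHUNK)
        if not data:
            break
        total += len(data)
    result.append(total)

def sendfile_sender(sock, path):
    """Push the entire file with sendfile(), then shut down writes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        sent = 0
        while sent < size:
            sent += os.sendfile(sock.fileno(), f.fileno(),
                                sent, size - sent)
    sock.shutdown(socket.SHUT_WR)
    return sent
```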

Also, do an NFS mount of localhost, rsize=wsize=8192, see how
long it takes to `cp' a 100 meg file from the "server" to
/dev/null.  The file is cached on the "server".  Do this for
the three pinning cases as well - all the NFS kernel processes
were pinned as a group and `cp' was the other group.


                 sendfile()   send(8k)   NFS
                  kbyte/s     kbyte/s    kbyte/s

No explicit bonding
  2.4.1:          66600       7          25600
  2.4.1-zc:       208000      69000      25000

Bond client and server to separate CPUs
  2.4.1:          66700       68000      27800
  2.4.1-zc:       213047      66000      25700

Bond client and server to same CPU:
  2.4.1:          56000       57000      23300
  2.4.1-zc:       176000      55000      22100



Much the same story.  Big increase in sendfile() efficiency,
small drop in send() and NFS unchanged.

The relative increase in sendfile() efficiency is much higher
than with a real NIC, presumably because we've factored out
the constant (and large) cost of the device driver.

All the bits and pieces to reproduce this are at

http://www.uow.edu.au/~andrewm/linux/#zc




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Andrew Morton

"David S. Miller" wrote:
 
 ...
 Finally, please do some tests on loopback.  It is usually a great
 way to get "pure software overhead" measurements of our TCP stack.

Here we are.  TCP and NFS/UDP over lo.

Machine is a dual-PII.  I didn't bother running CPU utilisation
testing while benchmarking loopback, although this may be of
some interest for SMP.  I just looked at the throughput.

Machine is a dual 500MHz PII (again).  Memory read bandwidth
is 320 meg/sec.  Write b/w is 130 meg/sec.  The working set
is 60 ~300k files, everything cached. We run the following
tests:

1: sendfile() to localhost, sender and receiver pinned to
   separate CPUs

2: sendfile() to localhost, sender and receiver pinned to
   the same CPU

3: sendfile() to localhost, no explicit pinning.

4, 5, 6: same as above, except we use send() in 8kbyte
   chunks.

Repeat with and without zerocopy patch 2.4.1-2.

The receiver reads 64k hunks and throws them away. sendfile()
sends the entire file.

Also, do an NFS mount of localhost, rsize=wsize=8192, see how
long it takes to `cp' a 100 meg file from the "server" to
/dev/null.  The file is cached on the "server".  Do this for
the three pinning cases as well - all the NFS kernel processes
were pinned as a group and `cp' was the other group.


sendfile() send(8k)   NFS
 Mbyte/s   Mbyte/s   Mbyte/s

No explicit bonding
  2.4.1:  666007 25600
  2.4.1-zc:  20800069000 25000

Bond client and server to separate CPUs
  2.4.1:  6670068000 27800
  2.4.1-zc:  21304766000 25700

Bond client and server to same CPU:
  2.4.1:  5600057000 23300
  2.4.1-zc:  17600055000 22100



Much the same story.  Big increase in sendfile() efficiency,
small drop in send() and NFS unchanged.

The relative increase in sendfile() efficiency is much higher
than with a real NIC, presumably because we've factored out
the constant (and large) cost of the device driver.

All the bits and pieces to reproduce this are at

http://www.uow.edu.au/~andrewm/linux/#zc

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Trond Myklebust

 " " == Andrew Morton [EMAIL PROTECTED] writes:


  Much the same story.  Big increase in sendfile() efficiency,
  small drop in send() and NFS unchanged.

This is normal. The server doesn't do zero copy reads, but instead
copies from the page cache into an NFS-specific buffer using
file.f_op-read(). Alexey and Dave's changes are therefore unlikely to
register on NFS performance (other than on CPU use as has been
mentioned before) until we implement a sendfile-like scheme for knfsd
over TCP.
I've been wanting to start doing that (and also to finish the client
conversion to use the TCP zero-copy), but I'm pretty pressed for time
at the moment.

Cheers,
  Trond
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

I have been watching this thread with interest for a while now, but am
wondering about the real-world use of this, given the performance penalty
for write()

As I see it there are two basic cases you are saying this will help in.

1. webservers

2. other fileservers

I also freely admit that I don't know a lot about sendfile() so it may
have some capability that makes my concerns meaningless, if so please let
me know.

1a. for webservers that server static content (and can therefor use
sendfile) I don't see this as significant becouse as your tests have been
showing, even a modest machine can saturate your network (unless you are
useing gigE at which time it takes a skightly larger machine)

1b. for webservers that are not primarily serving static content, they
have to use write() for the output from cgi's, etc and therefor pay the
performance penalty without being able to use sendfile() much to get the
advantages. These machines are the ones that really need the performance
as the cgi's take a significant amount of your cpu.

2. for other fileservers sendfile() sounds like it would be useful if the
client is reading the entire file, but what about the cases where the
client is reading part of the file, or is writing to the file. In both of
these cases it seems that the fileserver is back to the write() penalty.
does anyone have stats on the types of requests that fileservers are being
asked for?

David Lang



 On Fri, 2 Feb 2001, Andrew Morton wrote:

 Date: Fri, 02 Feb 2001 21:12:50 +1100
 From: Andrew Morton [EMAIL PROTECTED]
 To: David S. Miller [EMAIL PROTECTED]
 Cc: lkml [EMAIL PROTECTED],
  "[EMAIL PROTECTED]" [EMAIL PROTECTED]
 Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

 "David S. Miller" wrote:
 
  ...
  Finally, please do some tests on loopback.  It is usually a great
  way to get "pure software overhead" measurements of our TCP stack.

 Here we are.  TCP and NFS/UDP over lo.

 Machine is a dual-PII.  I didn't bother running CPU utilisation
 testing while benchmarking loopback, although this may be of
 some interest for SMP.  I just looked at the throughput.

 Machine is a dual 500MHz PII (again).  Memory read bandwidth
 is 320 meg/sec.  Write b/w is 130 meg/sec.  The working set
 is 60 ~300k files, everything cached. We run the following
 tests:

 1: sendfile() to localhost, sender and receiver pinned to
separate CPUs

 2: sendfile() to localhost, sender and receiver pinned to
the same CPU

 3: sendfile() to localhost, no explicit pinning.

 4, 5, 6: same as above, except we use send() in 8kbyte
chunks.

 Repeat with and without zerocopy patch 2.4.1-2.

 The receiver reads 64k hunks and throws them away. sendfile()
 sends the entire file.

 Also, do an NFS mount of localhost, rsize=wsize=8192, see how
 long it takes to `cp' a 100 meg file from the "server" to
 /dev/null.  The file is cached on the "server".  Do this for
 the three pinning cases as well - all the NFS kernel processes
 were pinned as a group and `cp' was the other group.


 sendfile() send(8k)   NFS
  Mbyte/s   Mbyte/s   Mbyte/s

 No explicit bonding
   2.4.1:  666007 25600
   2.4.1-zc:  20800069000 25000

 Bond client and server to separate CPUs
   2.4.1:  6670068000 27800
   2.4.1-zc:  21304766000 25700

 Bond client and server to same CPU:
   2.4.1:  5600057000 23300
   2.4.1-zc:  17600055000 22100



 Much the same story.  Big increase in sendfile() efficiency,
 small drop in send() and NFS unchanged.

 The relative increase in sendfile() efficiency is much higher
 than with a real NIC, presumably because we've factored out
 the constant (and large) cost of the device driver.

 All the bits and pieces to reproduce this are at

   http://www.uow.edu.au/~andrewm/linux/#zc

 -
 -
 To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
 the body of a message to [EMAIL PROTECTED]
 Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David S. Miller


David Lang writes:
  1a. for webservers that server static content (and can therefor use
  sendfile) I don't see this as significant becouse as your tests have been
  showing, even a modest machine can saturate your network (unless you are
  useing gigE at which time it takes a skightly larger machine)

Start using more than one interface, then it begins to become
interesting.

  1b. for webservers that are not primarily serving static content, they
  have to use write() for the output from cgi's, etc and therefor pay the
  performance penalty without being able to use sendfile() much to get the
  advantages. These machines are the ones that really need the performance
  as the cgi's take a significant amount of your cpu.

CGI's can be cached btw if the implementation is clever (f.e. CGI
tells the web server that if the file used as input to the CGI does
not change then the output from the CGI will not change, meaning CGI
output is based solely on input, therefore CGI output can be cached
by the web server).

  2. for other fileservers sendfile() sounds like it would be useful if the
  client is reading the entire file, but what about the cases where the
  client is reading part of the file, or is writing to the file. In both of
  these cases it seems that the fileserver is back to the write() penalty.
  does anyone have stats on the types of requests that fileservers are being
  asked for?

It helps no matter what part of the file the client reads.

sendfile() can be used on an arbitrary offset+len portion of
a file, it is not limited to just sending an entire fire.

Later,
David S. Miller
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

Thanks, that info on sendfile makes sense for the fileserver situation.
for webservers we will have to see (many/most CGI's look at stuff from the
client so I still have doubts as to how much use cacheing will be)

David Lang

On Fri, 2 Feb 2001, David S. Miller wrote:

 Date: Fri, 2 Feb 2001 14:46:07 -0800 (PST)
 From: David S. Miller [EMAIL PROTECTED]
 To: David Lang [EMAIL PROTECTED]
 Cc: Andrew Morton [EMAIL PROTECTED], lkml [EMAIL PROTECTED],
  "[EMAIL PROTECTED]" [EMAIL PROTECTED]
 Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


 David Lang writes:
   1a. for webservers that server static content (and can therefor use
   sendfile) I don't see this as significant becouse as your tests have been
   showing, even a modest machine can saturate your network (unless you are
   useing gigE at which time it takes a skightly larger machine)

 Start using more than one interface, then it begins to become
 interesting.

   1b. for webservers that are not primarily serving static content, they
   have to use write() for the output from cgi's, etc and therefor pay the
   performance penalty without being able to use sendfile() much to get the
   advantages. These machines are the ones that really need the performance
   as the cgi's take a significant amount of your cpu.

 CGI's can be cached btw if the implementation is clever (f.e. CGI
 tells the web server that if the file used as input to the CGI does
 not change then the output from the CGI will not change, meaning CGI
 output is based solely on input, therefore CGI output can be cached
 by the web server).

   2. for other fileservers sendfile() sounds like it would be useful if the
   client is reading the entire file, but what about the cases where the
   client is reading part of the file, or is writing to the file. In both of
   these cases it seems that the fileserver is back to the write() penalty.
   does anyone have stats on the types of requests that fileservers are being
   asked for?

 It helps no matter what part of the file the client reads.

 sendfile() can be used on an arbitrary offset+len portion of
 a file, it is not limited to just sending an entire fire.

 Later,
 David S. Miller
 [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David S. Miller


David Lang writes:
  Thanks, that info on sendfile makes sense for the fileserver situation.
  for webservers we will have to see (many/most CGI's look at stuff from the
  client so I still have doubts as to how much use cacheing will be)

Also note that the decreased CPU utilization resulting from
zerocopy sendfile leaves more CPU available for CGI execution.

This was a point I forgot to make.

Later,
David S. Miller
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread David Lang

right, assuming that there is enough sendfile() benifit to overcome the
write() penalty from the stuff that can't be cached or sent from a file.

my question was basicly are there enough places where sendfile would
actually be used to make it a net gain.

David Lang

On Fri, 2 Feb 2001, David S. Miller wrote:

 Date: Fri, 2 Feb 2001 15:09:13 -0800 (PST)
 From: David S. Miller [EMAIL PROTECTED]
 To: David Lang [EMAIL PROTECTED]
 Cc: Andrew Morton [EMAIL PROTECTED], lkml [EMAIL PROTECTED],
  "[EMAIL PROTECTED]" [EMAIL PROTECTED]
 Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


 David Lang writes:
   Thanks, that info on sendfile makes sense for the fileserver situation.
   for webservers we will have to see (many/most CGI's look at stuff from the
   client so I still have doubts as to how much use cacheing will be)

 Also note that the decreased CPU utilization resulting from
 zerocopy sendfile leaves more CPU available for CGI execution.

 This was a point I forgot to make.

 Later,
 David S. Miller
 [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread Jeff Barrow


Let's see... all the work being done for clustering would definitely
benefit... all the static images on your webserver, and static images
make up most of the bandwidth from web servers (images, ActiveX controls,
Java apps, sound clips...)... NFS servers, Samba servers (both of which
are used more than you may think)... email servers...

Once Real Networks patches their RealServer to use sendfile (which
shouldn't be all that hard), that would help too.

I think that sendfile can be used in a LOT of applications, and the only
ones that wouldn't benefit are mostly low-bandwidth anyway (CGI apps
almost always return either a small html file or a small image file, then
there's telnet and other interactive utilities...).

Most applications that use a lot of bandwidth (and thus a lot of CPU time
sending the packets) are capable of being patched to use sendfile.


On Fri, 2 Feb 2001, David Lang wrote:

 right, assuming that there is enough sendfile() benefit to overcome the
 write() penalty from the stuff that can't be cached or sent from a file.
 
 my question was basically: are there enough places where sendfile would
 actually be used to make it a net gain.
 
 David Lang
 
 On Fri, 2 Feb 2001, David S. Miller wrote:
 
  Date: Fri, 2 Feb 2001 15:09:13 -0800 (PST)
  From: David S. Miller [EMAIL PROTECTED]
  To: David Lang [EMAIL PROTECTED]
  Cc: Andrew Morton [EMAIL PROTECTED], lkml [EMAIL PROTECTED],
   "[EMAIL PROTECTED]" [EMAIL PROTECTED]
  Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)
 
 
  David Lang writes:
Thanks, that info on sendfile makes sense for the fileserver situation.
for webservers we will have to see (many/most CGI's look at stuff from the
client so I still have doubts as to how much use caching will be)
 
  Also note that the decreased CPU utilization resulting from
  zerocopy sendfile leaves more CPU available for CGI execution.
 
  This was a point I forgot to make.
 
  Later,
  David S. Miller
  [EMAIL PROTECTED]
 
 




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-02-02 Thread James Sutherland

On Fri, 2 Feb 2001, David Lang wrote:

> Thanks, that info on sendfile makes sense for the fileserver situation.
> for webservers we will have to see (many/most CGI's look at stuff from the
> client so I still have doubts as to how much use caching will be)

CGI performance isn't directly affected by this - the whole point is to
reduce the "cost" of handling static requests to zero (at least, as close
as possible) leaving as much CPU as possible for the CGI to use.

So sendfile won't help your CGI directly - it will just give your CGI more
resources to work with.


James.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-31 Thread Pavel Machek

Hi!

> Advantage of Tulip and AMD is that they perform much better in my experience
> on half duplex Ethernet than other cards because they use a modified
> patented backoff scheme. Without it Linux 2.1+ tends to suffer badly from
> ethernet congestion by colliding with its own acks, probably because it
> sends too fast.

Is that a real problem? If so, some strategic delay loop should do the
trick...
Pavel
-- 
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]



Re: Still not sexy! (Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-31 Thread Malcolm Beattie

Ingo Molnar writes:
> 
> On Tue, 30 Jan 2001, jamal wrote:
> 
> > > - is this UDP or TCP based? (UDP i guess)
> > >
> > TCP
> 
> well then i'd suggest to do:
> 
>   echo 10 10 10 > /proc/sys/net/ipv4/tcp_wmem
> 
> does this make any difference?

For the last week I've been benchmarking Linux networking and I/O on a
couple of machines with 3c985 gigabit cards and some other stuff
(see below). One of the things I tried yesterday was a beta test
version of a secure ftpd written by Chris Evans which happens to use
sendfile() making it a convenient extra benchmark. I'd already put
net.core.{r,w}mem_max up to 262144 for the sake of gensink and other
benchmarks which raise SO_{SND,RCV}BUF. I hadn't however, tried
raising tcp_wmem as per your suggestion above.

Currently the systems are linked back to back with fibre with jumbo
frames (MTU 9000) on and running pure kernel 2.4.1. I transferred a 300
MByte file repeatedly from the server to the client with an ftp "get"
client-side. The file will have been completely in page cache on the
server (both machines have 512MB RAM) and was written to /dev/null on
the client side. (Yes, I checked the client was doing ordinary
read/write and not throwing it away).

Without the raised tcp_wmem setting I was getting 81 MByte/s.
With tcp_wmem set as above I got 86 MByte/s. Nice increase. Any other
setting I can tweak apart from {r,w}mem_max and tcp_{w,r}mem? The CPU
on the client (350 MHz PII) is the bottleneck: gensink4 maxes out at
69 Mbyte/s pulling TCP from the server and 94 Mbyte/s pushing. (The
other system, 733 MHz PIII pushes >100MByte/s UDP with ttcp but the
client drops most of it).

I'll be following up Dave Miller's "please benchmark zerocopy"
request when I've got some more numbers written down since I've only
just put the zerocopy patch in and haven't rebooted yet.

If anyone wants any other specific benchmarks done (I/O or network)
I may get some time to do them: the PIII system has an 8-port
Escalade card with 8 x 46GB disks (117 MByte/s block writes as
measured by Bonnie on a RAID1/0 mixed RAIDset) and there are also
four dual-port eepro fast ethernet cards, a Cisco 8-port 3508G gigabit
switch and a 24-port 3524 fast ethernet switch (gigastack linked to
the 3508G).  I'm benchmarking and looking into the possibility of a DIY
NAS or SAN-type thing.

--Malcolm

-- 
Malcolm Beattie <[EMAIL PROTECTED]>
Unix Systems Programmer
Oxford University Computing Services





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-30 Thread Bernd Eckenfels

In article <[EMAIL PROTECTED]> you wrote:
> On Tue, Jan 30, 2001 at 02:17:57PM -0800, David S. Miller wrote:

> 8.5MB/sec sounds like half-duplex 100baseT.

> No; I'm 100% its  FD; HD gives 40k/sec TCP because of collisions and
> such like.

> Positive you are running at full duplex all the way to the
> netapp, and if so how many switches sit between you and this
> netapp?

> It's FD all the way (we hardwire everything to 100-FD and never trust
> auto-negotiate); I see no errors or such like anywhere.

> There are ... pause ... 3 switches between four switches in
> between, mostly linked via GE. I'm not sure if latency might be an
> issue here; if it was critical I can imagine 10 km of glass might be
> a problem but it's not _that_ far...

>   --cw




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-30 Thread David S. Miller


Chris Wedgwood writes:
 > There are ... pause ... 3 switches between four switches in
 > between, mostly linked via GE. I'm not sure if latency might be an
 > issue here; if it was critical I can imagine 10 km of glass might be
 > a problem but it's not _that_ far...

Other than this, I don't know what to postulate.  Really,
most reports and my own experimentation (directly connected
Linux knfsd to 2.4.x nfs client) supports the fact that our
client can saturate 100baseT rather fully.

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-30 Thread David S. Miller


Andrew Morton writes:
 > The box has 130 mbyte/sec memory write bandwidth, so saving
 > a copy should save 10% of this.   (Wanders away, scratching
 > head...)

Are you sure your measurement program will account properly
for all system cycles spent in softnet processing?  This is
where the bulk of the cpu cycle savings will occur.

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-30 Thread David S. Miller


Chris Wedgwood writes:
 > What server are you using here? Using NetApp filers I don't see
 > anything like this, probably only 8.5MB/s at most and this number is
 > fairly noisy.

8.5MB/sec sounds like half-duplex 100baseT.  Positive you are running
at full duplex all the way to the netapp, and if so how many switches
sit between you and this netapp?

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-30 Thread David S. Miller


Andrew Morton writes:
 > BTW: can you suggest why I'm not observing any change in NFS client
 > efficiency?

As in "filecopy speed" or "cpu usage while copying a file"?

The current fragmentation code eliminates a full SKB allocation and
data copy on the NFS file data receive path in the client, CPU has to
be saved compared to pre-zerocopy or something is very wrong.

File copy speed, well you should be link speed limited as even without
the zerocopy patches you ought to have enough cpu to keep it busy.

Later,
David S. Miller
[EMAIL PROTECTED]









Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread David S. Miller


The "more expensive" write/send in zerocopy is a known cost of paged
SKBs.  This cost may be decreased a bit with some fine tuning, but not
eliminated entirely.

What do we get for this cost?

Basically, the big win is not that the card checksums the packet.
We could get that for free while copying the data from userspace
into the kernel pages during the sendmsg(), using the combined
"copy+checksum" hand-coded assembly routines we already have.

It is in fact the better use of memory.  Firstly, we use only
single-page allocations.  With linear buffers, SLAB could use
multiple pages, which strains the memory subsystem quite a bit at
times.  Secondly, we fill pages with socket data precisely, whereas
SLAB can only pack as tightly as any general-purpose memory
allocator can.

This, I feel, outweighs the slight performance decrease.  And I would
wager a bet that the better usage of memory will result in better
all around performance.

The problem with microscopic tests is that you do not see the world
around the thing being focused on.  I feel Andrew's and Jamal's tests are
very valuable, but let's keep things in perspective when doing cost
analysis.

Finally, please do some tests on loopback.  It is usually a great
way to get "pure software overhead" measurements of our TCP stack.

Later,
David S. Miller
[EMAIL PROTECTED]



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread Ion Badulescu

On Mon, 29 Jan 2001, jamal wrote:

> > 11.5kBps, quite consistently.
>
> This gige card is really sick. Are you sure? Please double check.

Umm.. the starfire chipset is 100Mbit only. So 11.5MBps (sorry, that was a
typo, it's mega not kilo) is really all I'd expect out of it.

Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread jamal



On Mon, 29 Jan 2001, Ion Badulescu wrote:

> 11.5kBps, quite consistently.

This gige card is really sick. Are you sure? Please double check.

>
> I've tried it, but I'm not really sure what I can report. ttcp's
> measurements are clearly misleading, so I used Andrew's cyclesoak instead.

The ttcp CPU (times()) measurements are misleading, in particular when
doing sendfile. All they say is how much time ttcp spent in kernel space
vs user space. So all CPU measurements I have posted in the past
should be considered bogus. It is interesting to note, however, that
the trend reported by ttcp's CPU measurements as well as by Andrew (and
yourself) is similar ;->
But the point is: CPU is not the only measure that is of interest.
Throughput is definitely one of those that is of extreme importance.
100Mbps is not exciting. You seem to have gigE. I think your 11KB looks
suspiciously wrong. Can you double check please?

cheers,
jamal

PS:- another important parameter is latency, but that might not be as
important in file serving (maybe in short file transfers a la http).





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread Ion Badulescu

On Sat, 27 Jan 2001, jamal wrote:

> > starfire:
> > 2.4.1-pre10+zerocopy, using sendfile():  9.6% CPU
> > 2.4.1-pre10+zerocopy, using read()/write(): 18.3%-29.6% CPU * why so much variance?
> >
>
> What are your throughput numbers?

11.5kBps, quite consistently.

BTW, Andrew's new tool (with 8k reads/writes) has shown the load in the
read/write case to be essentially the lower margin of the intervals I got
in the first mail.

> Could you also, please, test using:
>
> http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz
>
> post both sender and receiver data. Repeat each test about
> 5 times.

I've tried it, but I'm not really sure what I can report. ttcp's
measurements are clearly misleading, so I used Andrew's cyclesoak instead.
The numbers are (with 2.4.1-pre10+zerocopy):

[starfire, hw csum & sg enabled]
sending with sendfile:    10.0-10.2%
sending with send/write:  13.5-13.7%
receiving:                20.0-20.2%

[starfire, hw csum & sg disabled]
sending with sendfile:    18.1-18.3%
sending with send/write:  13.9-14.1%
receiving:                24.3-24.5%

[eepro100, i82559, no hw fancies]
sending with sendfile:    16.2-16.4%
sending with send/write:  12.0-12.2%
receiving:                21.5-21.7%

Same tests, this time with 2.4.1-pre10 vanilla:

[starfire]
sending with sendfile:    18.1-18.3%
sending with send/write:  12.5-12.7%
receiving:                23.0-23.1%

[eepro100, i82559]
sending with sendfile:    16.7-16.9%
sending with send/write:  12.0-12.2%
receiving:                20.8-20.9%


Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread Rick Jones

> I'll give this a shot later. Can you try with the sendfiled-ttcp?
> http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

I guess I need to "leverage" some bits for netperf :)


WRT getting data with links that cannot saturate a system, having
something akin to the netperf service demand measure can help. Nothing
terribly fancy - simply a conversion of the CPU utilization and
throughput into microseconds of CPU per KB of data transferred.

As for CKO and avoiding copies and such, if past experience is any guide
(ftp://ftp.cup.hp.com/dist/networking/briefs/copyavoid.ps) you get a
very nice synergistic effect once the last "access" of data is removed.
CKO gets you say 10%, avoiding the copy gets you say 10%, but doing both
at the same time gets you 30%.

rick jones
http://www.netperf.org/
-- 
ftp://ftp.cup.hp.com/dist/networking/misc/rachel/
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to email, OR post, but please do NOT do BOTH...
my email address is raj in the cup.hp.com domain...



RE: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread Antonin Kral


> > Throughput: 100Mbps is really nothing. Linux never had a problem with
> > 4-500Mbps file serving. So throughput is an important number. so is
> > end to end latency, but in file serving case, latency might 
> > not be a big deal so ignore it.
> 
> If I try to route more than 40mbps (40% line utilization) through a 100mbps
> port (tulip) on a 2.4.0-test kernel running on a pIII 500 (or higher)
> system, not only does the performance drop to nearly 0, the system gets all
> sluggish and unusable.  This is with or without Jamal's FF patches.
> 
> How are you managing to get such high throughput?
> 

I have used 2.2.13 to 2.2.18 and 2.4.0, as a first approach, with no
patches, and with no problems I managed bandwidth of about 200 to 300 Mbps.

Antonin





RE: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-29 Thread Jonathan Earle

> Throughput: 100Mbps is really nothing. Linux never had a problem with
> 4-500Mbps file serving. So throughput is an important number. so is
> end to end latency, but in file serving case, latency might 
> not be a big deal so ignore it.

If I try to route more than 40mbps (40% line utilization) through a 100mbps
port (tulip) on a 2.4.0-test kernel running on a pIII 500 (or higher)
system, not only does the performance drop to nearly 0, the system gets all
sluggish and unusable.  This is with or without Jamal's FF patches.

How are you managing to get such high throughput?

Jon













Choosing Linux NICs (was: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN))

2001-01-28 Thread Felix von Leitner

Thus spake Felix von Leitner ([EMAIL PROTECTED]):
> What is missing here is a good authoritative web resource that tells
> people which NIC to buy.

I started one now.

It's at http://www.fefe.de/linuxeth/, but there is not much content yet.
Please contribute!

Felix



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-28 Thread Gregory Maxwell

On Sun, Jan 28, 2001 at 02:37:48PM +0100, Felix von Leitner wrote:
> Thus spake Andrew Morton ([EMAIL PROTECTED]):
> > Conclusions:
> 
> >   For a NIC which cannot do scatter/gather/checksums, the zerocopy
> >   patch makes no change in throughput in all cases.
> 
> >   For a NIC which can do scatter/gather/checksums, sendfile()
> >   efficiency is improved by 40% and send() efficiency is decreased by
> >   10%.  The increase and decrease caused by the zerocopy patch will in
> >   fact be significantly larger than these two figures, because the
> >   measurements here include a constant base load caused by the device
> >   driver.
> 
> What is missing here is a good authoritative web resource that tells
> people which NIC to buy.
> 
> I have a tulip NIC because a few years ago that apparently was the NIC
> of choice.  It has good multicast (which is important to me), but AFAIK
> it has neither scatter-gather nor hardware checksumming.
> 
> Is there such a web page already?
> If not, I volunteer to create and maintain one.

Additionally, it would be useful to have some boot messages comment on the
abilities of cards. I am sick and tired of dealing with people telling me
that 'Linux performance sucks' when they keep putting Linux on systems with
pci 8139 adaptors. 



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-28 Thread Andrew Morton

jamal wrote:
> 
> PS:- can you try it out with the ttcp testcode i posted?

Yup.  See below.  The numbers are almost the same as
with `zcs' and `zcc'.

The CPU utilisation code which was in `zcc' has been
broken out into a standalone tool, so the new `cyclesoak'
app is a general-purpose system load measurement tool.
It's fascinating to play with, if you're into that sort
of thing.

`cyclesoak' was used to measure ttcp-sf and NFS/UDP client
and server throughput.  The times()-based instrumentation
inside ttcp-sf doesn't (can't) give correct numbers.  2-4% CPU
at 100 mbps?  We wish :)

The zerocopy patch doesn't seem to affect NFS efficiency at
all.  Confused.

Excerpt from the rapidly swelling README:



NFS/UDP client results
======================

Reading a 100 meg file across 100baseT.  The file is fully cached on
the server.  The client is the above machine.  You need to unmount the
server between runs to avoid client-side caching.

The server is mounted with various rsize and wsize options.

  Kernel          rsize  wsize  mbyte/sec  CPU

  2.4.1-pre10+zc   1024   1024     2.4    10.3%
  2.4.1-pre10+zc   2048   2048     3.7    11.4%
  2.4.1-pre10+zc   4096   4096    10.1    29.0%
  2.4.1-pre10+zc   8192   8192    11.9    28.2%
  2.4.1-pre10+zc  16384  16384    11.9    28.2%

  2.4.1-pre10      1024   1024     2.4     9.7%
  2.4.1-pre10      2048   2048     3.7    11.8%
  2.4.1-pre10      4096   4096    10.7    33.6%
  2.4.1-pre10      8192   8192    11.9    29.5%
  2.4.1-pre10     16384  16384    11.9    29.2%

Small diff at 8192.


NFS/UDP server results
======================

Reading a 100 meg file across 100baseT.  The file is fully cached on
the server.  The server is the above machine.

  Kernel          rsize  wsize  mbyte/sec  CPU

  2.4.1-pre10+zc   1024   1024     2.6    19.1%
  2.4.1-pre10+zc   2048   2048     3.9    18.8%
  2.4.1-pre10+zc   4096   4096    10.0    34.5%
  2.4.1-pre10+zc   8192   8192    11.8    28.9%
  2.4.1-pre10+zc  16384  16384    11.8    29.0%

  2.4.1-pre10      1024   1024     2.6    18.5%
  2.4.1-pre10      2048   2048     3.9    18.6%
  2.4.1-pre10      4096   4096    10.9    33.8%
  2.4.1-pre10      8192   8192    11.8    29.0%
  2.4.1-pre10     16384  16384    11.8    29.0%

No diff.


ttcp-sf Results
===============

Jamal Hadi Salim has taught ttcp to use sendfile.  See
http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

Using the same machine as above, and the following commands:

Sender:    ./ttcp-sf -t -c -l 32768 -v receiver_host
Receiver:  ./ttcp-sf -c -r -l 32768 -v sender_host

CPU

2.4.1-pre10-zerocopy, sending with ttcp-sf:    10.5%
2.4.1-pre10-zerocopy, receiving with ttcp-sf:  16.1%

2.4.1-pre10-vanilla, sending with ttcp-sf:     18.5%
2.4.1-pre10-vanilla, receiving with ttcp-sf:   16.0%




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-28 Thread Andi Kleen

On Sun, Jan 28, 2001 at 02:37:48PM +0100, Felix von Leitner wrote:
> What is missing here is a good authoritative web resource that tells
> people which NIC to buy.
> 
> I have a tulip NIC because a few years ago that apparently was the NIC
> of choice.  It has good multicast (which is important to me), but AFAIK
> it has neither scatter-gather nor hardware checksumming.
> 
> Is there such a web page already?
> If not, I volunteer to create and maintain one.

Here's a try for Fast Ethernet. Corrections/additions welcome.

Currently the 3c9xx cards look like the best commonly affordable ones,
at least when you care about zero-copy networking. The newer ones have
all the necessary toys for it.
Don't use them with the 3com vendor driver though; their driver is crap.

eepro100 seems to have mostly the same facilities, but Intel doesn't
document it fully, so they are not usable in the standard Linux driver.
Intel has its own driver available (e100.c) which does more, but it of
course lags behind the normal stack.

I don't know about starfire, but it seems to be hard to even buy them anyway.

Sun HME seems to be on a similar level to 3c9xx, but near impossible to buy,
or very expensive.

Realtek is OK for low cost and being relatively hassle-free, but the cards
lack lots of useful facilities and basically require a copy on RX.

SMC epic/100 is handicapped by not being able to receive to unaligned
addresses, requiring a driver-level copy in Linux (that may change in 2.5
though; zero copy has the necessary infrastructure to copy only the header
in this case, not the whole packet).

AMD pcnet AFAIK doesn't do hardware checksums.

Tulip doesn't do hardware checksums and is a bit constrained by the
required long-word alignment on RX (causing problems with misaligned IP
headers; see above on epic/100).

The advantage of Tulip and AMD is that in my experience they perform much
better on half-duplex Ethernet than other cards, because they use a
modified, patented backoff scheme. Without it, Linux 2.1+ tends to suffer
badly from Ethernet congestion by colliding with its own ACKs, probably
because it sends too fast.

-Andi





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-28 Thread Dan Hollis

On Sun, 28 Jan 2001, Felix von Leitner wrote:
> What is missing here is a good authoritative web resource that tells
> people which NIC to buy.
> I have a tulip NIC because a few years ago that apparently was the NIC
> of choice.  It has good multicast (which is important to me), but AFAIK
> it has neither scatter-gather nor hardware checksumming.
> Is there such a web page already?

http://www.anime.net/~goemon/cardz/

Based on discussions I've had with Donald Becker about chipsets.

For 100bt, 3c905C is the most efficient card at the moment.
I've no idea about gigglebit ethernet.

-Dan




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-28 Thread Felix von Leitner

Thus spake Andrew Morton ([EMAIL PROTECTED]):
> Conclusions:

>   For a NIC which cannot do scatter/gather/checksums, the zerocopy
>   patch makes no change in throughput in any case.

>   For a NIC which can do scatter/gather/checksums, sendfile()
>   efficiency is improved by 40% and send() efficiency is decreased by
>   10%.  The increase and decrease caused by the zerocopy patch will in
>   fact be significantly larger than these two figures, because the
>   measurements here include a constant base load caused by the device
>   driver.

What is missing here is a good authoritative web resource that tells
people which NIC to buy.

I have a tulip NIC because a few years ago that apparently was the NIC
of choice.  It has good multicast (which is important to me), but AFAIK
it has neither scatter-gather nor hardware checksumming.

Is there such a web page already?
If not, I volunteer to create and maintain one.

Felix



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

[EMAIL PROTECTED] wrote:
> 
> Hello!
> 
> > 2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU
> 
> write() on zc card is worse than normal write() by definition.
> It generates split buffers.

yes.  The figures below show this.  Disabling SG+checksums speeds
up write() and send().

> Split buffers are more expensive and we have to pay for this.
> You have paid too much for slow card though. 8)
>
> Do you measure load correctly?

Yes.  Quite confident about this.  Here's the algorithm:

1: Run a cycle-soaker on each CPU on an otherwise unloaded
   system.  See how much "work" they all do per second.

2: Run the cycle-soakers again, but with network traffic happening.
   See how much their "work" is reduced. Deduce networking CPU load
   from this difference.

   The networking code all runs SCHED_FIFO or in interrupt context,
   so the cycle-soakers have no effect upon the network code's access
   to the CPU.

   The "cycle-soakers" just sit there spinning and dirtying 10,000
   cachelines per second.
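A minimal sketch of that two-step idea (illustrative names, not the actual `cyclesoak' source): spin on a fixed unit of cacheline-dirtying work, count units per second, and deduce networking load from the drop in the work rate:

```c
/* Sketch of the "cyclesoak" measurement described above.  All names
 * here are illustrative, not Andrew's actual tool. */
#include <assert.h>
#include <stddef.h>

#define NLINES 10000
static volatile char cachelines[NLINES * 64];  /* one byte per 64-byte line */

/* One unit of "work": dirty ~10,000 cachelines, as the post describes. */
static void soak_once(void)
{
    for (size_t i = 0; i < sizeof(cachelines); i += 64)
        cachelines[i]++;
}

/* Step 2 of the algorithm: the CPU fraction stolen by networking is
 * 1 - (work rate under load / work rate on an idle system). */
static double deduced_load(double idle_rate, double loaded_rate)
{
    return 1.0 - loaded_rate / idle_rate;
}
```

So if the soakers manage only 700 units/sec under traffic versus 1000 idle, networking is charged 30% of the CPU, regardless of which process the kernel accounted it to.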

> > 2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU    * hardware tx checksums disabled
> 
> This is illegal combination of parameters. You force two memory accesses,
> doing this. The fact that it does not add to load is dubious. 8)8)

mm.. Perhaps with read()/write() the data is already in cache?

Anyway, I've tweaked up the tool again so it can do send() or
write() (then I looked at the implementation and wondered why
I'd bothered).  It also does TCP_CORK now.

I ran another set of tests.  The zerocopy patch improves sendfile()
hugely but slows down send()/write() significantly, with a 3c905C:

http://www.uow.edu.au/~andrewm/linux/#zc



The kernels which were tested were 2.4.1-pre10 with and without the
zerocopy patch.  We only look at client load (the TCP sender).

In all tests the link throughput was 11.5 mbytes/sec at all times
(saturated 100baseT) unless otherwise noted.

The client (the thing which sends data) is a dual 500MHz PII with a
3c905C.

For the write() and send() tests, the chunk size was 64 kbytes.

The workload was 63 files with an average length of 350 kbytes.

 CPU

2.4.1-pre10+zerocopy, using sendfile():  9.6%
2.4.1-pre10+zerocopy, using send(): 24.1%
2.4.1-pre10+zerocopy, using write():24.2%

2.4.1-pre10+zerocopy, using sendfile(): 16.2%   * checksums and SG disabled
2.4.1-pre10+zerocopy, using send():     21.5%   * checksums and SG disabled
2.4.1-pre10+zerocopy, using write():    21.5%   * checksums and SG disabled



2.4.1-pre10-vanilla, using sendfile():  17.1%
2.4.1-pre10-vanilla, using send():  21.1%
2.4.1-pre10-vanilla, using write(): 21.1%


Bearing in mind that a large amount of the load is in the device
driver, the zerocopy patch makes a large improvement in sendfile
efficiency.  But read() and send() performance is decreased by 10% -
more than this if you factor out the constant device driver overhead.

TCP_CORK makes no difference.  The files being sent are much larger
than a single frame.

Conclusions:

  For a NIC which cannot do scatter/gather/checksums, the zerocopy
  patch makes no change in throughput in any case.

  For a NIC which can do scatter/gather/checksums, sendfile()
  efficiency is improved by 40% and send() efficiency is decreased by
  10%.  The increase and decrease caused by the zerocopy patch will in
  fact be significantly larger than these two figures, because the
  measurements here include a constant base load caused by the device
  driver.
 





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sun, 28 Jan 2001, Andrew Morton wrote:

> jamal wrote:
> >
> > ..
> > It is also useful to have both client and server stats.
> > BTW, since the laptop (with the 3C card) is the client, the SG
> > shouldnt kick in at all.
>
> The `client' here is doing the sendfiling, so yes, the
> gathering occurs on the client.
>

OK, semantics. Maybe we should stick to "sender" and "receiver"
("server" normally implies the machine serving the files).

> > I'll give this a shot later. Can you try with the sendfiled-ttcp?
> > http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz
>
> hmm..  I didn't bother with TCP_CORK because the files being
> sent are "much" larger than a frame.  Guess I should.

It doesn't make much sense to use sendfile() without TCP_CORK.
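For reference, the corked-sendfile pattern looks roughly like the sketch below: cork the socket, write the (hypothetical) header, sendfile() the body, then uncork so the final partial frame goes out. The helper name is made up, and the setsockopt() calls are deliberately best-effort so the function also runs on non-TCP sockets:

```c
/* Sketch of the TCP_CORK + sendfile() pattern.  Hypothetical helper;
 * setsockopt() failures are ignored on purpose so the function also
 * works on sockets that have no TCP_CORK (e.g. AF_UNIX in tests). */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send `hdr' then `filelen' bytes of `filefd' over `sock' in one
 * corked burst.  Returns total bytes sent, or -1 on error. */
static ssize_t send_corked(int sock, const void *hdr, size_t hdrlen,
                           int filefd, size_t filelen)
{
    int one = 1, zero = 0;
    off_t off = 0;
    ssize_t sent = 0, n;

    setsockopt(sock, IPPROTO_TCP, TCP_CORK, &one, sizeof(one)); /* best effort */
    if (hdrlen && write(sock, hdr, hdrlen) != (ssize_t)hdrlen)
        return -1;
    while ((size_t)off < filelen) {
        n = sendfile(sock, filefd, &off, filelen - (size_t)off);
        if (n <= 0)
            return -1;
        sent += n;
    }
    setsockopt(sock, IPPROTO_TCP, TCP_CORK, &zero, sizeof(zero)); /* flush */
    return sent + (ssize_t)hdrlen;
}
```

Without the cork, the header write can go out as its own sub-MSS frame, which is exactly the waste TCP_CORK avoids.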

> The problem with things like ttcp is the measurement of CPU load.
> If your network is so fast that your machine can't keep up then
> fine, raw throughput is a good measure. But if the link is saturated
> then normal process accounting doesn't cut it.

ttcp's CPU measure is not the best; part of my plan was to change that.
It uses times(), so the measurement is not good, and in fact it is not
very meaningful on SMP. The way to do it there is to break it down
by CPU.
Throughput: 100 Mbps is really nothing; Linux never had a problem with
400-500 Mbps file serving. So throughput is an important number. So is
end-to-end latency, but in the file-serving case latency might not be a
big deal, so ignore it.

> For example, at 100 mbps, `top' says ttcp is chewing 4% CPU. But guess
> what?  A low-priority process running on the same machine is in fact
> slowed down by 30%.  top lies.  Most of the cost of the networking layer
> is being accounted to swapper, and lost.  And who accounts for cache
> eviction, bus utilisation, etc.  We're better off measuring what's
> left behind, rather than measuring what is consumed.
>
> You can in fact do this with ttcp: run it with a super-high priority
> and run a little task in the background (dummyload.c in the above
> tarball does this).  See how much the dummy task is slowed down
> wrt an unloaded system.  It gets tricky on SMP though.
>

The best way to do CPU measurement is via /proc, the way top
does it: you measure it from within your nettest program. This does
measure what is "left behind", since your program is in user space.
Actually, it shouldn't matter whether you do it from your test program or
from dummyload.c; with dummyload you might have to sigkill the program
every time a test terminates.
You should also break down utilisation by CPU.
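A sketch of that /proc-based approach (hypothetical helper names): sample the aggregate `cpu' line of /proc/stat twice and compute utilisation from the deltas of the classic user/nice/system/idle columns:

```c
/* Sketch of /proc/stat-based CPU measurement, as suggested above.
 * Helper names are illustrative.  Only the first four fields of the
 * aggregate "cpu" line are used (user, nice, system, idle); a per-CPU
 * breakdown would read the "cpu0", "cpu1", ... lines the same way. */
#include <assert.h>
#include <stdio.h>

struct cpu_sample { long user, nice, sys, idle; };

/* Read the aggregate "cpu" line.  Returns 0 on success, -1 on error. */
static int read_cpu(struct cpu_sample *s)
{
    FILE *f = fopen("/proc/stat", "r");
    int ok;

    if (!f)
        return -1;
    ok = fscanf(f, "cpu %ld %ld %ld %ld",
                &s->user, &s->nice, &s->sys, &s->idle) == 4;
    fclose(f);
    return ok ? 0 : -1;
}

/* Utilisation between two samples, as a fraction 0.0 .. 1.0. */
static double cpu_util(const struct cpu_sample *a, const struct cpu_sample *b)
{
    long busy  = (b->user - a->user) + (b->nice - a->nice) + (b->sys - a->sys);
    long total = busy + (b->idle - a->idle);

    return total > 0 ? (double)busy / (double)total : 0.0;
}
```

Taking one sample before and one after a test run gives the load for that interval without any in-process instrumentation.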

cheers,
jamal

PS:- can you try it out with the ttcp testcode i posted?




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

jamal wrote:
> 
> ..
> It is also useful to have both client and server stats.
> BTW, since the laptop (with the 3C card) is the client, the SG
> shouldnt kick in at all.

The `client' here is doing the sendfiling, so yes, the
gathering occurs on the client.

> ...
> > The test tool is, of course, documented [ :-)/2 ].  It's at
> >
> >   http://www.uow.edu.au/~andrewm/linux/#zc
> >
> 
> I'll give this a shot later. Can you try with the sendfiled-ttcp?
> http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

hmm..  I didn't bother with TCP_CORK because the files being
sent are "much" larger than a frame.  Guess I should.

The problem with things like ttcp is the measurement of CPU load.
If your network is so fast that your machine can't keep up then
fine, raw throughput is a good measure. But if the link is saturated
then normal process accounting doesn't cut it.

For example, at 100 mbps, `top' says ttcp is chewing 4% CPU. But guess
what?  A low-priority process running on the same machine is in fact
slowed down by 30%.  top lies.  Most of the cost of the networking layer
is being accounted to swapper, and lost.  And who accounts for cache
eviction, bus utilisation, etc.  We're better off measuring what's
left behind, rather than measuring what is consumed.

You can in fact do this with ttcp: run it with a super-high priority
and run a little task in the background (dummyload.c in the above
tarball does this).  See how much the dummy task is slowed down
wrt an unloaded system.  It gets tricky on SMP though.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sat, 27 Jan 2001, Ion Badulescu wrote:

>
> 750MHz PIII, Adaptec Starfire NIC, driver modified to use hardware sg+csum
> (both Tx/Rx), and Intel i82559 (eepro100), no hardware csum support,
> vanilla driver.
>
> The box has 512MB of RAM, and I'm using a 100MB file, so it's entirely cached.
>
> starfire:
> 2.4.1-pre10+zerocopy, using sendfile():      9.6% CPU
> 2.4.1-pre10+zerocopy, using read()/write(): 18.3%-29.6% CPU  * why so much variance?
>

What are your throughput numbers?

Could you also, please, test using:

http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

post both sender and receiver data. Repeat each test about
5 times.

cheers,
jamal





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sat, 27 Jan 2001, Andrew Morton wrote:

> (Please keep netdev copied, else Jamal will grump at you, and
>  you don't want that).
>

Thanks, Andrew ;-> Isn't netdev where networking stuff should be
discussed? I think I give up and will join l-k, RSN ;->

> The kernels which were tested were 2.4.1-pre10 with and without the
> zerocopy patch.  We only look at client load (the TCP sender).
>
> The link throughput was 11.5 mbytes/sec at all times (saturated 100baseT)
>
> 2.4.1-pre10-vanilla, using sendfile():  29.6% CPU
> 2.4.1-pre10-vanilla, using read()/write():  34.5% CPU
>
> 2.4.1-pre10+zerocopy, using sendfile():  18.2% CPU
> 2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU
>
> 2.4.1-pre10+zerocopy, using sendfile():  22.9% CPU    * hardware tx checksums disabled
> 2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU    * hardware tx checksums disabled
>
>
> What can we conclude?
>
> - sendfile is 10% cheaper than read()-then-write() on 2.4.1-pre10.
>
> - sendfile() with the zerocopy patch is 40% cheaper than
>   sendfile() without the zerocopy patch.
>

It is also useful to have both client and server stats.
BTW, since the laptop (with the 3C card) is the client, the SG
shouldnt kick in at all.

> - hardware Tx checksums don't make much difference.  hmm...
>
> Bear in mind that the 3c59x driver uses a one-interrupt-per-packet
> algorithm.  Mitigation reduces this to 0.3 ints/packet.
> So we're absorbing 4,500 interrupts/sec while processing
> 12,000 packets/sec.  gigE NICs do much better mitigation than
> this and the relative benefits of zerocopy will be much higher
> for these.  Hopefully Jamal can do some testing.
>

I don't have my babies right now, but as soon as I can get access to
them...

> BTW: I could not reproduce Jamal's oops when sending large
> files (2 gigs with sendfile()).

Alexey was concerned about this. Good. But maybe it will still
happen with my setup. We'll see.

>
> The test tool is, of course, documented [ :-)/2 ].  It's at
>
>   http://www.uow.edu.au/~andrewm/linux/#zc
>

I'll give this a shot later. Can you try with the sendfiled-ttcp?
http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz
Anyway, you are NIC-challenged ;-> Get GigE; 100 Mbps doesn't give
much information.

cheers,
jamal




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

Ion Badulescu wrote:
> 
> On Sat, 27 Jan 2001 19:19:01 +1100, Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > The figures I quoted for the no-hw-checksum case were still
> > using scatter/gather.  That can be turned off as well and
> > it makes it a tiny bit quicker.
> 
> Hmm. Are you sure the differences are not just noise?

I don't think so.  It's all pretty repeatable.

> Unless you
> modified the zerocopy patch yourself, it won't use SG without
> checksums...

I believe it in fact does use SG when hardware tx checksums are unavailable,
but this capability will be removed RSN because userspace can scribble
on the pagecache after the checksum has been calculated, and before
the frame has hit the wire.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

Ion Badulescu wrote:
> 
> 2.4.1-pre10+zerocopy, using read()/write(): 18.3%-29.6% CPU  * why so much variance?

The variance is presumably because of the naive read/write
implementation.  It sucks in 16 megs and writes it out again.
With a 100 megabyte file you'll get aliasing effects between
the sampling interval and the client's activity.

You will get more repeatable results using smaller files.  I'm
just sending /usr/local/bin/* ten times, with

./zcc -s otherhost -c /usr/local/bin/* -n10 -N2 -S

Maybe that 16 meg buffer should be shorter...  Yes, making it
smaller smooths things out.

Heh, look at this.  It's a simple read-some, send-some loop.
Plot CPU utilisation against the transfer size:

Size    %CPU


  256    31
  512    25
 1024    22
 2048    18
 4096    17
 8192    16
16384    18
32768    19
65536    21
 128k    22
 256k    22.5

8192 bytes is best.
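The loop being tuned is essentially the following sketch (a hypothetical stand-in for zcc's read/send path, with the chunk size as the `-b'-style parameter):

```c
/* Sketch of the read-some/send-some loop whose transfer size is being
 * tuned above.  Hypothetical helper, not the actual zcc source. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Copy `in' to `out' in `chunk'-byte transfers.
 * Returns total bytes copied, or -1 on error. */
static long copy_chunked(int in, int out, size_t chunk)
{
    char *buf = malloc(chunk);
    long total = 0;
    ssize_t r, w, done;

    if (!buf)
        return -1;
    while ((r = read(in, buf, chunk)) > 0) {
        /* write() may be partial; loop until the chunk is drained */
        for (done = 0; done < r; done += w) {
            w = write(out, buf + done, r - done);
            if (w <= 0) {
                free(buf);
                return -1;
            }
        }
        total += r;
    }
    free(buf);
    return r < 0 ? -1 : total;
}
```

The sweet spot around 8 KB in the table above is plausibly where the chunk still fits the CPU cache while amortising the per-syscall cost.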

I've added the `-b' option to zcc to set the transfer size.  Same
URL.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Ion Badulescu

On Sat, 27 Jan 2001 19:19:01 +1100, Andrew Morton <[EMAIL PROTECTED]> wrote:

> The figures I quoted for the no-hw-checksum case were still
> using scatter/gather.  That can be turned off as well and
> it makes it a tiny bit quicker.

Hmm. Are you sure the differences are not just noise? Unless you
modified the zerocopy patch yourself, it won't use SG without
checksums...

In fact it would be interesting to revert that policy and
see how much SG alone helps. Probably not much, since the
CPU checksumming is close to onecopy.

Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.



Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Ion Badulescu

On Sat, 27 Jan 2001 16:45:43 +1100, Andrew Morton <[EMAIL PROTECTED]> wrote:

> The client is a 650 MHz PIII.  The NIC is a 3CCFE575CT Cardbus 3com.
> It supports Scatter/Gather and hardware checksums.  The NIC's interrupt
> is shared with the Cardbus controller, so this will impact throughput
> slightly.
> 
> The kernels which were tested were 2.4.1-pre10 with and without the
> zerocopy patch.  We only look at client load (the TCP sender).
> 
> The link throughput was 11.5 mbytes/sec at all times (saturated 100baseT)
> 
> 2.4.1-pre10-vanilla, using sendfile():  29.6% CPU
> 2.4.1-pre10-vanilla, using read()/write():  34.5% CPU
> 
> 2.4.1-pre10+zerocopy, using sendfile():  18.2% CPU
> 2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU
> 
> 2.4.1-pre10+zerocopy, using sendfile():  22.9% CPU   * hardware tx checksums disabled
> 2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU   * hardware tx checksums disabled

750MHz PIII, Adaptec Starfire NIC, driver modified to use hardware sg+csum
(both Tx/Rx), and Intel i82559 (eepro100), no hardware csum support,
vanilla driver.

The box has 512MB of RAM, and I'm using a 100MB file, so it's entirely cached.

starfire:
2.4.1-pre10+zerocopy, using sendfile():  9.6% CPU
2.4.1-pre10+zerocopy, using read()/write(): 18.3%-29.6% CPU * why so much variance?

2.4.1-pre10+zerocopy, using sendfile(): 17.4% CPU           * hardware csum disabled
2.4.1-pre10+zerocopy, using read()/write(): 16.5%-26.8% CPU * idem, again why so much variance?

2.4.1-pre10-vanilla, using sendfile():  16.5% CPU
2.4.1-pre10-vanilla, using read()/write():  14.5%-24.5% CPU * high variance again

eepro100:
2.4.1-pre10+zerocopy, using sendfile(): 16.0% CPU
2.4.1-pre10+zerocopy, using read()/write(): 15.0%-24.5% CPU * why so much variance?

2.4.1-pre10-vanilla, using sendfile():  16.7% CPU
2.4.1-pre10-vanilla, using read()/write():  14.5%-24.6% CPU * high variance again

The read+write case is really weird. I'm getting results like this:

CPU load: 27.9491
CPU load: 25.4763
CPU load: 15.8544
CPU load: 25.455
CPU load: 25.2072
CPU load: 15.8677
CPU load: 25.4896
CPU load: 25.2791
CPU load: 15.8837

i.e. 2 slow, 1 fast, 2 slow, 1 fast, and so on and so forth.

> What can we conclude?
> 
> - sendfile is 10% cheaper than read()-then-write() on 2.4.1-pre10.

Hard to tell, with such inconclusive results...

> - sendfile() with the zerocopy patch is 40% cheaper than
>   sendfile() without the zerocopy patch.

Indeed. Close to 50% in fact.

> - hardware Tx checksums don't make much difference.  hmm...

Actually it makes all the difference in the world for the starfire.
Interesting...

> Bear in mind that the 3c59x driver uses a one-interrupt-per-packet
> algorithm.  Mitigation reduces this to 0.3 ints/packet.
> So we're absorbing 4,500 interrupts/sec while processing
> 12,000 packets/sec.  gigE NICs do much better mitigation than
> this and the relative benefits of zerocopy will be much higher
> for these.  Hopefully Jamal can do some testing.

Hmm.. the starfire also has quite advanced interrupt mitigation,
but I have not played with it. Maybe tomorrow. So these results
are with one-interrupt-per-packet.

P.S. The starfire still doesn't like tinygrams (skb's with 1-byte
fragments). Fortunately your test program doesn't seem to generate
them. :-)

Ion




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

Ion Badulescu wrote:
 
 On Sat, 27 Jan 2001 19:19:01 +1100, Andrew Morton [EMAIL PROTECTED] wrote:
 
  The figures I quoted for the no-hw-checksum case were still
  using scatter/gather.  That can be turned off as well and
  it makes it a tiny bit quicker.
 
 Hmm. Are you sure the differences are not just noise?

I don't think so.  It's all pretty repeatable.

 Unless you
 modified the zerocopy patch yourself, it won't use SG without
 checksums...

I believe it in fact does use SG when hardware tx checksums are unavailable,
but this capability will be removed RSN because userspace can scribble
on the pagecache after the checksum has been calculated, and before
the frame has hit the wire.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sat, 27 Jan 2001, Andrew Morton wrote:

 (Please keep netdev copied, else Jamal will grump at you, and
  you don't want that).


Thanks, Andrew ;-) Isn't netdev where networking stuff should be
discussed? I think I give up and will join lk, RSN ;-)

 The kernels which were tested were 2.4.1-pre10 with and without the
 zerocopy patch.  We only look at client load (the TCP sender).

 The link throughput was 11.5 mbytes/sec at all times (saturated 100baseT)

 2.4.1-pre10-vanilla, using sendfile():  29.6% CPU
 2.4.1-pre10-vanilla, using read()/write():  34.5% CPU

 2.4.1-pre10+zerocopy, using sendfile():  18.2% CPU
 2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU

 2.4.1-pre10+zerocopy, using sendfile():  22.9% CPU   * hardware tx checksums disabled
 2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU   * hardware tx checksums disabled


 What can we conclude?

 - sendfile is 10% cheaper than read()-then-write() on 2.4.1-pre10.

 - sendfile() with the zerocopy patch is 40% cheaper than
   sendfile() without the zerocopy patch.


It is also useful to have both client and server stats.
BTW, since the laptop (with the 3C card) is the client, the SG
shouldn't kick in at all.

 - hardware Tx checksums don't make much difference.  hmm...

 Bear in mind that the 3c59x driver uses a one-interrupt-per-packet
 algorithm.  Mitigation reduces this to 0.3 ints/packet.
 So we're absorbing 4,500 interrupts/sec while processing
 12,000 packets/sec.  gigE NICs do much better mitigation than
 this and the relative benefits of zerocopy will be much higher
 for these.  Hopefully Jamal can do some testing.


I don't have my babies right now, but as soon as I can get access to
them...

 BTW: I could not reproduce Jamal's oops when sending large
 files (2 gigs with sendfile()).

Alexey was concerned about this. Good. But maybe it will still
happen with my setup. We'll see.


 The test tool is, of course, documented [ :-)/2 ].  It's at

   http://www.uow.edu.au/~andrewm/linux/#zc


I'll give this a shot later. Can you try with the sendfiled-ttcp?
http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz
Anyways, you are NIC-challenged ;-) Get GigE. 100Mbps doesn't give
much information.

cheers,
jamal




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sat, 27 Jan 2001, Ion Badulescu wrote:


 750MHz PIII, Adaptec Starfire NIC, driver modified to use hardware sg+csum
 (both Tx/Rx), and Intel i82559 (eepro100), no hardware csum support,
 vanilla driver.

 The box has 512MB of RAM, and I'm using a 100MB file, so it's entirely cached.

 starfire:
 2.4.1-pre10+zerocopy, using sendfile():    9.6% CPU
 2.4.1-pre10+zerocopy, using read()/write():   18.3%-29.6% CPU * why so much variance?


What are your throughput numbers?

Could you also, please, test using:

http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

post both sender and receiver data. Repeat each test about
5 times.

cheers,
jamal





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

jamal wrote:
 
 ..
 It is also useful to have both client and server stats.
 BTW, since the laptop (with the 3C card) is the client, the SG
 shouldnt kick in at all.

The `client' here is doing the sendfiling, so yes, the
gathering occurs on the client.

 ...
  The test tool is, of course, documented [ :-)/2 ].  It's at
 
http://www.uow.edu.au/~andrewm/linux/#zc
 
 
 I'll give this a shot later. Can you try with the sendfiled-ttcp?
 http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

hmm..  I didn't bother with TCP_CORK because the files being
sent are "much" larger than a frame.  Guess I should.

The problem with things like ttcp is the measurement of CPU load.
If your network is so fast that your machine can't keep up then
fine, raw throughput is a good measure. But if the link is saturated
then normal process accounting doesn't cut it.

For example, at 100 mbps, `top' says ttcp is chewing 4% CPU. But guess
what?  A low-priority process running on the same machine is in fact
slowed down by 30%.  top lies.  Most of the cost of the networking layer
is being accounted to swapper, and lost.  And who accounts for cache
eviction, bus utilisation, etc.  We're better off measuring what's
left behind, rather than measuring what is consumed.

You can in fact do this with ttcp: run it with a super-high priority
and run a little task in the background (dummyload.c in the above
tarball does this).  See how much the dummy task is slowed down
wrt an unloaded system.  It gets tricky on SMP though.




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread jamal



On Sun, 28 Jan 2001, Andrew Morton wrote:

 jamal wrote:
 
  ..
  It is also useful to have both client and server stats.
  BTW, since the laptop (with the 3C card) is the client, the SG
  shouldnt kick in at all.

 The `client' here is doing the sendfiling, so yes, the
 gathering occurs on the client.


OK, semantics. Maybe we should stick to "sender" and "receiver".
("server" normally translates to "serves files")

  I'll give this a shot later. Can you try with the sendfiled-ttcp?
  http://www.cyberus.ca/~hadi/ttcp-sf.tar.gz

 hmm..  I didn't bother with TCP_CORK because the files being
 sent are "much" larger than a frame.  Guess I should.

It doesn't make much sense to use sendfile() without TCP_CORK.
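A hedged sketch of what corked sendfile looks like from userspace (Linux-specific; the helper name is illustrative, not code from ttcp-sf):

```python
import os
import socket

def corked_sendfile(sock: socket.socket, path: str) -> int:
    """Send a file over a connected TCP socket under TCP_CORK (Linux).

    Corking tells TCP to hold back partial frames until the cork is
    popped, so e.g. a header written with send() and the body pushed
    with sendfile() can share a segment instead of producing a
    tinygram.  Returns the number of file bytes queued.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            sent = 0
            while sent < size:
                # sendfile(out_fd, in_fd, offset, count) -- no copy
                # through a userspace buffer
                n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
                if n == 0:
                    break
                sent += n
    finally:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)  # pop the cork
    return sent
```

As the follow-up below notes, corking only matters when writes smaller than a frame precede or follow the file body; for multi-hundred-kilobyte files alone it changes little.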

 The problem with things like ttcp is the measurement of CPU load.
 If your network is so fast that your machine can't keep up then
 fine, raw throughput is a good measure. But if the link is saturated
 then normal process accounting doesn't cut it.

ttcp's CPU measure is not the best; part of my plan was to change that.
It uses times(), so the measurement is not good, and it is in fact not
very reflective on SMP. The way to do it there is to break it down
by CPU.
Throughput: 100Mbps is really nothing. Linux never had a problem with
400-500Mbps file serving. So throughput is an important number; so is
end-to-end latency, but in the file-serving case latency might not be a big
deal, so ignore it.

 For example, at 100 mbps, `top' says ttcp is chewing 4% CPU. But guess
 what?  A low-priority process running on the same machine is in fact
 slowed down by 30%.  top lies.  Most of the cost of the networking layer
 is being accounted to swapper, and lost.  And who accounts for cache
 eviction, bus utilisation, etc.  We're better off measuring what's
 left behind, rather than measuring what is consumed.

 You can in fact do this with ttcp: run it with a super-high priority
 and run a little task in the background (dummyload.c in the above
 tarball does this).  See how much the dummy task is slowed down
 wrt an unloaded system.  It gets tricky on SMP though.


The best way to do CPU measurement is via /proc, the way top
does it: you measure it from within your nettest program. This does
measure what is "left behind", since your proggie is in user space.
Actually, it shouldn't matter whether you do it from your test program or
from dummyload.c. With dummyload you might have to sigkill the program
every time a test terminates.
You also should break down utilization by CPU.
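A rough sketch of the /proc approach (Linux-specific; field layout per proc(5); helper names are made up for illustration):

```python
def read_cpu_times(path="/proc/stat"):
    """Parse per-CPU jiffy counters from /proc/stat (Linux).

    Returns {cpu_name: (busy, total)} for each 'cpuN' line.  Sampling
    this twice and differencing gives utilization per CPU -- the same
    counters top uses, but system-wide, so softirq/interrupt time
    billed to swapper is not lost.
    """
    cpus = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields or not fields[0].startswith("cpu"):
                continue
            if fields[0] == "cpu":       # aggregate line; we want per-CPU
                continue
            times = [int(x) for x in fields[1:]]
            idle = times[3] + (times[4] if len(times) > 4 else 0)  # idle+iowait
            cpus[fields[0]] = (sum(times) - idle, sum(times))
    return cpus

def utilization(before, after):
    """Per-CPU utilization between two samples of read_cpu_times()."""
    out = {}
    for cpu in before:
        busy = after[cpu][0] - before[cpu][0]
        total = after[cpu][1] - before[cpu][1]
        out[cpu] = busy / total if total else 0.0
    return out
```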

cheers,
jamal

PS:- can you try it out with the ttcp testcode i posted?




Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-27 Thread Andrew Morton

[EMAIL PROTECTED] wrote:
 
 Hello!
 
  2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU
 
 write() on zc card is worse than normal write() by definition.
 It generates split buffers.

Yes.  The figures below show this.  Disabling SG+checksums speeds
up write() and send().

 Split buffers are more expensive and we have to pay for this.
 You have paid too much for slow card though. 8)

 Do you measure load correctly?

Yes.  Quite confident about this.  Here's the algorithm:

1: Run a cycle-soaker on each CPU on an otherwise unloaded
   system.  See how much "work" they all do per second.

2: Run the cycle-soakers again, but with network traffic happening.
   See how much their "work" is reduced. Deduce networking CPU load
   from this difference.

   The networking code all runs SCHED_FIFO or in interrupt context,
   so the cycle-soakers have no effect upon the network code's access
   to the CPU.

   The "cycle-soakers" just sit there spinning and dirtying 10,000
   cachelines per second.
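The algorithm above can be sketched like so (an illustrative toy, not the actual cycle-soaker from the tarball; the unit of work and timings are assumptions):

```python
import time

def soak(seconds: float) -> int:
    """Spin for `seconds`, counting iterations of a fixed unit of work.

    Run once on an otherwise idle system to calibrate, then again with
    network traffic happening: the networking cost is the fractional
    drop in work done.  Because the deduction is made from what the
    soaker *lost*, it also captures interrupt time and cache eviction
    that per-process accounting misbills to swapper.
    """
    deadline = time.monotonic() + seconds
    work = 0
    while time.monotonic() < deadline:
        sum(range(1000))                 # the "unit of work"
        work += 1
    return work

def net_cpu_load(idle_work: int, loaded_work: int) -> float:
    """Deduce the CPU fraction consumed by the load: 1 - loaded/idle."""
    return 1.0 - loaded_work / idle_work
```

One soaker must run per CPU (and the network code must preempt them, e.g. by running SCHED_FIFO) for the deduction to hold, which is why SMP makes this tricky.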

  2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU   * hardware tx checksums disabled
 
 This is illegal combination of parameters. You force two memory accesses,
 doing this. The fact that it does not add to load is dubious. 8)8)

mm.. Perhaps with read()/write() the data is already in cache?

Anyway, I've tweaked up the tool again so it can do send() or
write() (then I looked at the implementation and wondered why
I'd bothered).  It also does TCP_CORK now.

I ran another set of tests.  The zerocopy patch improves sendfile()
hugely but slows down send()/write() significantly, with a 3c905C:

http://www.uow.edu.au/~andrewm/linux/#zc



The kernels which were tested were 2.4.1-pre10 with and without the
zerocopy patch.  We only look at client load (the TCP sender).

In all tests the link throughput was 11.5 mbytes/sec at all times
(saturated 100baseT) unless otherwise noted.

The client (the thing which sends data) is a dual 500MHz PII with a
3c905C.

For the write() and send() tests, the chunk size was 64 kbytes.

The workload was 63 files with an average length of 350 kbytes.

 CPU

2.4.1-pre10+zerocopy, using sendfile():  9.6%
2.4.1-pre10+zerocopy, using send(): 24.1%
2.4.1-pre10+zerocopy, using write():24.2%

2.4.1-pre10+zerocopy, using sendfile(): 16.2%   * checksums and SG disabled
2.4.1-pre10+zerocopy, using send():     21.5%   * checksums and SG disabled
2.4.1-pre10+zerocopy, using write():    21.5%   * checksums and SG disabled



2.4.1-pre10-vanilla, using sendfile():  17.1%
2.4.1-pre10-vanilla, using send():  21.1%
2.4.1-pre10-vanilla, using write(): 21.1%


Bearing in mind that a large amount of the load is in the device
driver, the zerocopy patch makes a large improvement in sendfile
efficiency.  But read() and send() performance is decreased by 10% -
more than this if you factor out the constant device driver overhead.

TCP_CORK makes no difference.  The files being sent are much larger
than a single frame.

Conclusions:

  For a NIC which cannot do scatter/gather/checksums, the zerocopy
  patch makes no change in throughput in any case.

  For a NIC which can do scatter/gather/checksums, sendfile()
  efficiency is improved by 40% and send() efficiency is decreased by
  10%.  The increase and decrease caused by the zerocopy patch will in
  fact be significantly larger than these two figures, because the
  measurements here include a constant base load caused by the device
  driver.
 





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-26 Thread Andrew Morton

Aaron Lehmann wrote:
> 
> On Sat, Jan 27, 2001 at 04:45:43PM +1100, Andrew Morton wrote:
> > 2.4.1-pre10-vanilla, using read()/write():  34.5% CPU
> 2.4.1-pre10+zercopy, using read()/write():  38.1% CPU
> 
> Am I right to be bothered by this?
> 
> The majority of Unix network traffic is handled with read()/write().
> Why would zerocopy slow that down?
> 
> If zerocopy is simply unoptimized, that's fine for now. But if the
> problem is inherent in the implementation or design, that might be a
> problem. Any patch which incurs a significant slowdown on traditional
> networking should be controversial.

Good point.

The figures I quoted for the no-hw-checksum case were still
using scatter/gather.  That can be turned off as well and
it makes it a tiny bit quicker.  So the table is now:

2.4.1-pre10-vanilla, using sendfile():  29.6% CPU
2.4.1-pre10-vanilla, using read()/write():  34.5% CPU

2.4.1-pre10+zerocopy, using sendfile():  18.2% CPU
2.4.1-pre10+zerocopy, using read()/write():  38.1% CPU

2.4.1-pre10+zerocopy, using sendfile():  22.9% CPU   * hardware tx checksums disabled
2.4.1-pre10+zerocopy, using read()/write():  39.2% CPU   * hardware tx checksums disabled

2.4.1-pre10+zerocopy, using sendfile():  22.4% CPU   * hardware tx checksums and SG disabled
2.4.1-pre10+zerocopy, using read()/write():  38.5% CPU   * hardware tx checksums and SG disabled

But that's not relevant.

I just retested everything.  Yes, the zerocopy patch does
appear to decrease the efficiency of TCP on non-SG+checksumming
hardware by 5% - 10%.  Others need to test...


With an RTL8139/8139too.  CPU is 500MHz PII Celeron, uniprocessor:

2.4.1-pre10-vanilla, using sendfile():  43.8% CPU
2.4.1-pre10-vanilla, using read()/write():  54.1% CPU

2.4.1-pre10+zerocopy, using sendfile(): 43.1% CPU
2.4.1-pre10+zerocopy, using read()/write(): 55.5% CPU

Note that the 8139 only gets 10.8 Mbytes/sec here.  It randomly
jumps up to 11.5 occasionally, but spends most of its time at
10.8. Hard to know what to make of this.  Of course, if you're
using an 8139 you don't care about performance anyway :)


Contradictory results.  rtl8139 doesn't do Rx checksums,
and I think has an extra copy in the driver, so caching effects
may be obscuring things here.

I can test with eepro100 in a couple of days.





Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

2001-01-26 Thread Aaron Lehmann

On Sat, Jan 27, 2001 at 04:45:43PM +1100, Andrew Morton wrote:
> 2.4.1-pre10-vanilla, using read()/write():  34.5% CPU
> 2.4.1-pre10+zercopy, using read()/write():  38.1% CPU

Am I right to be bothered by this?

The majority of Unix network traffic is handled with read()/write().
Why would zerocopy slow that down?

If zerocopy is simply unoptimized, that's fine for now. But if the
problem is inherent in the implementation or design, that might be a
problem. Any patch which incurs a significant slowdown on traditional
networking should be controversial.

Aaron Lehmann

please ignore me if I don't know what I'm talking about.


