Re: Maximizing File/Network I/O
* nixlists [2010-01-14 08:39]:
> On Wed, Jan 13, 2010 at 11:43 PM, Henning Brauer wrote:
> > * nixlists [2010-01-14 03:21]:
> >> > test results on old P4 are unfortunately pretty much pointless.
> >>
> >> Why?
> >>
> >> cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz
> >>
> >> Isn't 2.52GHz fast enough for gigabit links? I know that's like half
> >> that in P3 cycles, but still... What's the issue?
> >
> > cache
>
> What about it? Please elaborate.

it's very different in P4 and sucks

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
Re: Maximizing File/Network I/O
On 2010-01-06, Stuart Henderson wrote:
> With a quick test with PCIE RTL8111B on a core2 T7200 machine
> and PCI-X BCM5704C on an opteron 146 (both 2GHz), using 1500 MTU and
> D-Link DGS-1224T and SMC GS16-Smart switches between them, I get
> about 540Mb/s with the re(4) transmitting, 920Mb/s with the bge(4)
> transmitting.
>
> Conn: 1 Mbps: 537.855 Peak Mbps: 557.250 Avg Mbps: 537.855
> Conn: 1 Mbps: 923.241 Peak Mbps: 928.758 Avg Mbps: 923.241
>
> (Last time I tried it, enabling jumbos on the bge actually made
> things worse, not better).

Hmm, strange. I just thought to do another test, this time between two
identical BCM5704C bge(4), and was a little surprised at the results:

Conn: 1 Mbps: 325.719 Peak Mbps: 342.655 Avg Mbps: 325.719

guess I should break out a profiled kernel sometime...
Re: Maximizing File/Network I/O
--- On Thu, 1/14/10, Jean-Francois wrote:

> From: Jean-Francois
> Subject: Re: Maximizing File/Network I/O
> To: misc@openbsd.org
> Received: Thursday, January 14, 2010, 12:53 PM
> On Tuesday, January 5, 2010 at 09:04:53, nixlists wrote:
> > On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
> > > Start with mount_nfs options, specifically -r and -w; I assume that
> > > you would have mentioned tweaking those if you had already done so.
> >
> > Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> > couple of MB/s more. Far from 10 MB/s more the network can do ;(
>
> For some reason, when I mount NFS drives with -r=4096 and -w=4096 I reach
> the best transfer rates.

This is possibly because the OS is able to match the request to a single
memory page for your architecture. Other architectures offer larger page
sizes. Not saying that's the case, but a possibility.
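As a point of reference, the options being discussed map onto mount_nfs(8)
like this; the server name and mount point below are made up for
illustration:

    # read/write request sizes matched to a 4 KB page
    mount_nfs -r 4096 -w 4096 fileserver:/export /mnt/backup

    # or the equivalent /etc/fstab entry
    fileserver:/export /mnt/backup nfs rw,-r=4096,-w=4096 0 0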
Re: Maximizing File/Network I/O
On Tuesday, January 5, 2010 at 09:04:53, nixlists wrote:
> On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
> > Start with mount_nfs options, specifically -r and -w; I assume that
> > you would have mentioned tweaking those if you had already done so.
>
> Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> couple of MB/s more. Far from 10 MB/s more the network can do ;(

For some reason, when I mount NFS drives with -r=4096 and -w=4096 I reach
the best transfer rates.
Re: Maximizing File/Network I/O
On Wed, Jan 13, 2010 at 11:43 PM, Henning Brauer wrote:
> * nixlists [2010-01-14 03:21]:
>> > test results on old P4 are unfortunately pretty much pointless.
>>
>> Why?
>>
>> cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz
>>
>> Isn't 2.52GHz fast enough for gigabit links? I know that's like half
>> that in P3 cycles, but still... What's the issue?
>
> cache

What about it? Please elaborate.

Thanks!
Re: Maximizing File/Network I/O
very OT: Is there some tool for inspecting the CPU cache like this one
http://docs.sun.com/app/docs/doc/819-2240/cpustat-1m?l=en&a=view ?
I found memconfig(8) in the man pages, but if I understand it correctly
it's just for setting.

On Thu, Jan 14, 2010 at 5:43 AM, Henning Brauer wrote:
> * nixlists [2010-01-14 03:21]:
>> > test results on old P4 are unfortunately pretty much pointless.
>>
>> Why?
>>
>> cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz
>>
>> Isn't 2.52GHz fast enough for gigabit links? I know that's like half
>> that in P3 cycles, but still... What's the issue?
>
> cache
>
> --
> Henning Brauer, h...@bsws.de, henn...@openbsd.org
> BS Web Services, http://bsws.de
> Full-Service ISP - Secure Hosting, Mail and DNS Services
> Dedicated Servers, Rootservers, Application Hosting
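Not live counters, but the cache geometry itself is printed at boot on
OpenBSD; a sketch of what to look for (the second line's exact format
varies by CPU and is illustrative here):

    $ dmesg | grep ^cpu0
    cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz
    cpu0: 8KB 64b/line 4-way D-cache, 512KB 64b/line 8-way L2 cache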
Re: Maximizing File/Network I/O
What does 'systat vmstat' show during your tests, plus the other "windows"
like mbufs and similar? What does 'vmstat -m' show, and so on? That will
say much more about the actual state of the whole system than tcpbench.

On Thu, Jan 14, 2010 at 12:49 AM, nixlists wrote:
> On Tue, Jan 5, 2010 at 2:32 PM, Henning Brauer wrote:
>> I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.
>
> Update: both machines run -current again this time. I think my initial
> tcpbench results were poor because of running cbq queuing on 4.6. The
> server has an em NIC, the client has msk. Jumbo frames are set to 9000
> on both, but don't make much difference. This is with a $20 D-Link
> switch.
>
> tcpbench results:
>
> pf disabled on both machines: 883 Mb/s
>
> pf enabled on tcpbench server only - simple ruleset like the
> documentation example: 619 Mb/s
>
> pf enabled on both machines - the tcpbench client box has the standard
> -current default install pf.conf: 585 Mb/s
>
> pf enabled on just the tcpbench server: with cbq queuing enabled on
> the internal interface as follows (for tcpbench only, not for real
> network use) - no other queues defined on $int_if:
>
> altq on $int_if cbq bandwidth 1Gb queue { std_in, ssh_im_in, dns_in }
> queue std_in bandwidth 999.9Mb cbq(default,borrow)
>
> 401 Mb/s
>
> Why is that? cbq code overhead? The machine doesn't have enough CPU?
> Or am I missing something? Admittedly it's an old P4.
>
> After a while, during benching, even if pf is disabled on both
> machines the throughput drops to 587 Mbit/s. The only way to bring it
> back up to 883 Mb/s is to reboot the tcpbench client. Anyone know why?
>
> Thanks!
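A minimal way to keep an eye on those views while a benchmark runs (all
commands from the base system; the one-second interval is arbitrary):

    systat vmstat 1     # live VM, interrupt and CPU view
    systat mbufs 1      # the mbuf "window" mentioned above
    vmstat -m           # kernel memory allocator statistics, one snapshot
    netstat -m          # mbuf and cluster usage summary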
Re: Maximizing File/Network I/O
* nixlists [2010-01-14 03:21]:
> > test results on old P4 are unfortunately pretty much pointless.
>
> Why?
>
> cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz
>
> Isn't 2.52GHz fast enough for gigabit links? I know that's like half
> that in P3 cycles, but still... What's the issue?

cache

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
Re: Maximizing File/Network I/O
On Wed, Jan 13, 2010 at 8:39 PM, Henning Brauer wrote:
>> pf enabled on just the tcpbench server: with cbq queuing enabled on
>> the internal interface as follows (for tcpbench only, not for real
>> network use) - no other queues defined on $int_if:
>>
>> altq on $int_if cbq bandwidth 1Gb queue { std_in, ssh_im_in, dns_in }
>> queue std_in bandwidth 999.9Mb cbq(default,borrow)
>>
>> 401 Mb/s
>>
>> Why is that? cbq code overhead? The machine doesn't have enough CPU?
>> Or am I missing something? Admittedly it's an old P4.
>
> test results on old P4 are unfortunately pretty much pointless.

Why?

cpu0: Intel(R) Pentium(R) 4 CPU 2.53GHz ("GenuineIntel" 686-class) 2.52 GHz

Isn't 2.52GHz fast enough for gigabit links? I know that's like half
that in P3 cycles, but still... What's the issue?

Thank you.
Re: Maximizing File/Network I/O
* nixlists [2010-01-14 01:09]:
> On Tue, Jan 5, 2010 at 2:32 PM, Henning Brauer wrote:
> > I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.
>
> Update: both machines run -current again this time. I think my initial
> tcpbench results were poor because of running cbq queuing on 4.6. The
> server has an em NIC, the client has msk. Jumbo frames are set to 9000
> on both, but don't make much difference. This is with a $20 D-Link
> switch.
>
> tcpbench results:
>
> pf disabled on both machines: 883 Mb/s
>
> pf enabled on tcpbench server only - simple ruleset like the
> documentation example: 619 Mb/s
>
> pf enabled on both machines - the tcpbench client box has the standard
> -current default install pf.conf: 585 Mb/s
>
> pf enabled on just the tcpbench server: with cbq queuing enabled on
> the internal interface as follows (for tcpbench only, not for real
> network use) - no other queues defined on $int_if:
>
> altq on $int_if cbq bandwidth 1Gb queue { std_in, ssh_im_in, dns_in }
> queue std_in bandwidth 999.9Mb cbq(default,borrow)
>
> 401 Mb/s
>
> Why is that? cbq code overhead? The machine doesn't have enough CPU?
> Or am I missing something? Admittedly it's an old P4.

test results on old P4 are unfortunately pretty much pointless.

> After a while, during benching, even if pf is disabled on both
> machines the throughput drops to 587 Mbit/s. The only way to bring it
> back up to 883 Mb/s is to reboot the tcpbench client. Anyone know why?

that seems weird. CPU throttling down because it overheated, perhaps?
could be some resource issue in OpenBSD as well, but i've never seen
such behaviour

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
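One quick way to check the throttling theory on OpenBSD, assuming the box
exposes these sysctls (the values shown are made up):

    $ sysctl hw.cpuspeed hw.setperf
    hw.cpuspeed=2527
    hw.setperf=100

If hw.cpuspeed has dropped partway through a run, the CPU clocked itself
down and that alone could explain the lower numbers.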
Re: Maximizing File/Network I/O
On Tue, Jan 5, 2010 at 2:32 PM, Henning Brauer wrote:
> I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.

Update: both machines run -current again this time. I think my initial
tcpbench results were poor because of running cbq queuing on 4.6. The
server has an em NIC, the client has msk. Jumbo frames are set to 9000
on both, but don't make much difference. This is with a $20 D-Link
switch.

tcpbench results:

pf disabled on both machines: 883 Mb/s

pf enabled on tcpbench server only - simple ruleset like the
documentation example: 619 Mb/s

pf enabled on both machines - the tcpbench client box has the standard
-current default install pf.conf: 585 Mb/s

pf enabled on just the tcpbench server, with cbq queuing enabled on
the internal interface as follows (for tcpbench only, not for real
network use) - no other queues defined on $int_if:

altq on $int_if cbq bandwidth 1Gb queue { std_in, ssh_im_in, dns_in }
queue std_in bandwidth 999.9Mb cbq(default,borrow)

401 Mb/s

Why is that? cbq code overhead? The machine doesn't have enough CPU?
Or am I missing something? Admittedly it's an old P4.

After a while, during benching, even if pf is disabled on both
machines the throughput drops to 587 Mbit/s. The only way to bring it
back up to 883 Mb/s is to reboot the tcpbench client. Anyone know why?

Thanks!
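For anyone reproducing these numbers, the benchmark itself is just
tcpbench(1) from base; the address below is an example:

    # on the receiving machine (listens on port 12345 by default)
    tcpbench -s

    # on the sending machine, pointed at the receiver
    tcpbench 192.168.1.10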
Re: Maximizing File/Network I/O
On Fri, Jan 8, 2010 at 10:13 PM, Henning Brauer wrote:
> * nixlists [2010-01-06 09:33]:
>> On Wed, Jan 6, 2010 at 2:31 PM, Henning Brauer wrote:
>> > I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.
>>
>> Disabling pf gives a couple of MB/s more.
>
> really. what a surprise.

Anything wrong with http://everything2.com/title/stating+the+obvious ?
But I guess there's nothing wrong with making fun of it, either...
Re: Maximizing File/Network I/O
* nixlists [2010-01-06 09:33]:
> On Wed, Jan 6, 2010 at 2:31 PM, Henning Brauer wrote:
> > I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.
>
> Disabling pf gives a couple of MB/s more.

really. what a surprise.

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
Re: Maximizing File/Network I/O
* Uwe Werler [2010-01-08 23:38]:
> > I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.
>
> Oh cool! Is this explained a little bit deeper somewhere? Sounds VERY
> interesting.

well, you know, i have been working on pf and general network stack
performance for years. others have improved performance in subsystems
used. i almost always bench my changes. i cannot point my finger to one
change between 4.6 and -current that is the cause for this improvement,
there were a few - and i keep forgetting what made 4.6 and what was
after.

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
Re: Maximizing File/Network I/O
> * Iñigo Ortiz de Urbina tarom...@gmail.com [2010-01-05 11:24]:
>> On Tue, Jan 5, 2010 at 9:13 AM, Tomas Bodzar tomas.bod...@gmail.com
>> wrote:
>>
>> > There is much more to do. You can find some ideas e.g. here
>> > http://www.openbsd.org/papers/tuning-openbsd.ps . It's a good idea to
>> > follow the outputs of systat, vmstat and top for some time to find
>> > bottlenecks.
>>
>> I recall a message in misc (which I am not able to find on the archives)
>> about someone posting here the results of his research on optimizing and
>> improving OpenBSD overall performance (fs, network, etc).
>>
>> Among the links he posted on his comprehensive compilation, he sent
>> tuning-openbsd.ps.
>
> I'm one of the two authors of this paper.
> ignore it. it is obsolete.
>
>> I remember one reply of a developer stating that some of those tuning
>> measures are not needed anymore as OpenBSD has grown quite a bit since
>> that time. Which are the recommended -always working- directions, then,
>> to tune a system for its particular needs?
>
> there isn't really all that much needed these days, defaults are good.
> some very specific situations benefit from some specific things, but
> usually, you are wasting time trying to "tune".
>
>> My point is we all have to be careful and not follow guides or try
>> values on sysctls blindly (although experimenting is welcome and
>> healthy) as we can harm more than we benefit. Still, some environments
>> will need adjustment to push much more traffic than GENERIC can, and
>> this is a really hard task to accomplish unless you are a @henning or
>> @claudio :)
>
> heh :)
>
> I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.

Oh cool! Is this explained a little bit deeper somewhere? Sounds VERY
interesting.
Re: Maximizing File/Network I/O
On 2010-01-05, Aaron Mason wrote:
> With top notch stuff (we're talking HP Procurve/Cisco Catalyst and
> Intel PRO/1000+ cards here) plus tuning for Jumbo frames, you can get
> to the 95MB/sec range.

Things on the computer side (NICs, motherboard, drivers etc) affect
performance much more than switches. Many of the cheaper 'web-managed'
switches have very acceptable performance. I much prefer Procurves but
if the budget isn't there, well-chosen cheaper switches can do pretty
well.

With a quick test with PCIE RTL8111B on a core2 T7200 machine
and PCI-X BCM5704C on an opteron 146 (both 2GHz), using 1500 MTU and
D-Link DGS-1224T and SMC GS16-Smart switches between them, I get
about 540Mb/s with the re(4) transmitting, 920Mb/s with the bge(4)
transmitting.

Conn: 1 Mbps: 537.855 Peak Mbps: 557.250 Avg Mbps: 537.855
Conn: 1 Mbps: 923.241 Peak Mbps: 928.758 Avg Mbps: 923.241

(Last time I tried it, enabling jumbos on the bge actually made
things worse, not better).

I don't know much about the el-cheapo unmanaged switches; I've saved
enough time in tracking down problems through having port error and
traffic stats (via snmp on the dlink) that I don't bother with them
at all any more.
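Even without a managed switch, the host-side counters catch many of the
same problems; from the base system (the interface name is an example):

    # per-interface packet, error and collision counters
    netstat -i

    # confirm the negotiated media type; duplex mismatches show up here
    ifconfig em0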
Re: Maximizing File/Network I/O
On Wed, Jan 6, 2010 at 2:31 PM, Henning Brauer wrote:
> I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.

Disabling pf gives a couple of MB/s more.
Re: Maximizing File/Network I/O
* Iñigo Ortiz de Urbina [2010-01-05 11:24]:
> On Tue, Jan 5, 2010 at 9:13 AM, Tomas Bodzar wrote:
>
> > There is much more to do. You can find some ideas e.g. here
> > http://www.openbsd.org/papers/tuning-openbsd.ps . It's a good idea to
> > follow the outputs of systat, vmstat and top for some time to find
> > bottlenecks.
>
> I recall a message in misc (which I am not able to find on the archives)
> about someone posting here the results of his research on optimizing and
> improving OpenBSD overall performance (fs, network, etc).
>
> Among the links he posted on his comprehensive compilation, he sent
> tuning-openbsd.ps.

I'm one of the two authors of this paper.
ignore it. it is obsolete.

> I remember one reply of a developer stating that some of those tuning
> measures are not needed anymore as OpenBSD has grown quite a bit since
> that time. Which are the recommended -always working- directions, then,
> to tune a system for its particular needs?

there isn't really all that much needed these days, defaults are good.
some very specific situations benefit from some specific things, but
usually, you are wasting time trying to "tune".

> My point is we all have to be careful and not follow guides or try
> values on sysctls blindly (although experimenting is welcome and
> healthy) as we can harm more than we benefit. Still, some environments
> will need adjustment to push much more traffic than GENERIC can, and
> this is a really hard task to accomplish unless you are a @henning or
> @claudio :)

heh :)

I really like the 275 -> 420MBit/s change for 4.6 -> current with pf.

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
Re: Maximizing File/Network I/O
RPC unfortunately is slow.

On Jan 5, 2010, at 2:04, nixlists wrote:

> On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
>> Start with mount_nfs options, specifically -r and -w; I assume that
>> you would have mentioned tweaking those if you had already done so.
>
> Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> couple of MB/s more. Far from 10 MB/s more the network can do ;(
Re: Maximizing File/Network I/O
On Tue, Jan 5, 2010 at 9:13 AM, Tomas Bodzar wrote:
> There is much more to do. You can find some ideas e.g. here
> http://www.openbsd.org/papers/tuning-openbsd.ps . It's a good idea to
> follow the outputs of systat, vmstat and top for some time to find
> bottlenecks.

I recall a message in misc (which I am not able to find on the archives)
about someone posting here the results of his research on optimizing and
improving OpenBSD overall performance (fs, network, etc).

Among the links he posted on his comprehensive compilation, he sent
tuning-openbsd.ps.

I remember one reply of a developer stating that some of those tuning
measures are not needed anymore as OpenBSD has grown quite a bit since
that time. Which are the recommended -always working- directions, then,
to tune a system for its particular needs?

My point is we all have to be careful and not follow guides or try
values on sysctls blindly (although experimenting is welcome and
healthy) as we can harm more than we benefit. Still, some environments
will need adjustment to push much more traffic than GENERIC can, and
this is a really hard task to accomplish unless you are a @henning or
@claudio :)

> On Tue, Jan 5, 2010 at 9:04 AM, nixlists wrote:
> > On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
> > > Start with mount_nfs options, specifically -r and -w; I assume that
> > > you would have mentioned tweaking those if you had already done so.
> >
> > Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> > couple of MB/s more. Far from 10 MB/s more the network can do ;(
>
> --
> http://www.openbsd.org/lyrics.html
Re: Maximizing File/Network I/O
On Tue, Jan 05, 2010 at 03:04:53AM -0500, nixlists wrote:
> On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
> > Start with mount_nfs options, specifically -r and -w; I assume that
> > you would have mentioned tweaking those if you had already done so.
>
> Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> couple of MB/s more. Far from 10 MB/s more the network can do ;(

Jumbo frames on em(4) will not gain you that much since the driver is
already very efficient. msk(4) is a bit of a different story; one
problem for high speed, low delay links is the interrupt mitigation in
those cards. msk(4) delays packets a lot more than em(4). Plus you
should run -current on msk(4) systems (a few things got fixed at f2k9).

--
:wq Claudio
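For anyone who still wants to experiment with jumbos, the knob is just the
interface MTU; both ends and the switch must support it, and the interface
name below is an example:

    # one-off
    ifconfig em0 mtu 9000

    # persistent across reboots; hostname.if(5) lines are handed to ifconfig
    echo 'mtu 9000' >> /etc/hostname.em0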
Re: Maximizing File/Network I/O
There is much more to do. You can find some ideas e.g. here
http://www.openbsd.org/papers/tuning-openbsd.ps . It's a good idea to
follow the outputs of systat, vmstat and top for some time to find
bottlenecks.

On Tue, Jan 5, 2010 at 9:04 AM, nixlists wrote:
> On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
>> Start with mount_nfs options, specifically -r and -w; I assume that
>> you would have mentioned tweaking those if you had already done so.
>
> Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
> couple of MB/s more. Far from 10 MB/s more the network can do ;(

--
http://www.openbsd.org/lyrics.html
Re: Maximizing File/Network I/O
On Tue, Jan 5, 2010 at 1:45 AM, Bret S. Lambert wrote:
> Start with mount_nfs options, specifically -r and -w; I assume that
> you would have mentioned tweaking those if you had already done so.

Setting -r and -w to 16384, and jumbo frames to 9000 yields just a
couple of MB/s more. Far from 10 MB/s more the network can do ;(
Re: Maximizing File/Network I/O
On Tue, Jan 05, 2010 at 01:02:08AM -0500, nixlists wrote:
> On Tue, Jan 5, 2010 at 12:40 AM, Aaron Mason wrote:
> > It would be best put this way - if you go for the lowest bidder, in
> > most cases you get what you pay for. Your results aren't too bad
> > considering what's in use.
>
> Thanks. Where could I find more info on tuning jumbo frames? Both
> cards support it...

Start with mount_nfs options, specifically -r and -w; I assume that
you would have mentioned tweaking those if you had already done so.

> Update: after upgrading the other machine to -current, tcpbench
> performs around 420 Mbit/s now :D
>
> One of the machines is using pf...
Re: Maximizing File/Network I/O
On Tue, Jan 5, 2010 at 12:40 AM, Aaron Mason wrote:
> It would be best put this way - if you go for the lowest bidder, in
> most cases you get what you pay for. Your results aren't too bad
> considering what's in use.

Thanks. Where could I find more info on tuning jumbo frames? Both
cards support it...

Update: after upgrading the other machine to -current, tcpbench
performs around 420 Mbit/s now :D

One of the machines is using pf...
Re: Maximizing File/Network I/O
On Tue, Jan 5, 2010 at 2:05 PM, nixlists wrote:
> Hi.
>
> I have two machines, one running 4.6, the other running a recent
> snapshot of -current. tcpbench reports maximum throughput of 275 Mbit -
> that's around 34 MB/s between them over a gig-E link. What should one
> expect with an el-cheapo gig-E switch, an 'em' Intel NIC and a msk
> NIC? Is that reasonable or too slow?
>
> The 4.6 machine has a softraid mirror and can read off it at around 55
> MB/s as shown by 'dd', and the -current machine has an eSATA enclosure
> mounted async for the purpose of quickly backing up to it, that I can
> write to at around 45 MB/s as shown by 'dd'. However, copying over the
> network to it through NFS I can only get around 15 MB/s. Where is the
> bottleneck? How to fix?
>
> Copying with rsync over ssh is even slower due to rsync and ssh eating
> quite a bit of CPU - but that's to be expected.
>
> Thanks a bunch.

It would be best put this way - if you go for the lowest bidder, in
most cases you get what you pay for. Your results aren't too bad
considering what's in use.

With top notch stuff (we're talking HP Procurve/Cisco Catalyst and
Intel PRO/1000+ cards here) plus tuning for jumbo frames, you can get
to the 95MB/sec range.

And it's not just CPU usage that slows rsync over ssh - the transfer
rate only counts the data that gets pushed through; it doesn't account
for the encryption overhead, which makes the stream something like
30-40% bigger than the original.

--
Aaron Mason - Programmer, open source addict
I've taken my software vows - for beta or for worse
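For reference, local disk numbers like the 55 MB/s and 45 MB/s quoted
above are typically taken with something along these lines (paths and
sizes are illustrative):

    # sequential read test, 1 GB off the softraid mirror
    dd if=/mnt/raid/bigfile of=/dev/null bs=64k count=16384

    # sequential write test to the async-mounted eSATA enclosure
    dd if=/dev/zero of=/mnt/esata/testfile bs=64k count=16384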