Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
-Original message-
To: sunray-users@filibeto.org
From: Kevin J Kelly kevinjmke...@verizon.net
Sent: Thu 27-09-2012 02:43
Subject: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB

blocks of lines across the screen when I play local video files.

I am running Ubuntu 11.04 with the Sun Ray Software 5.2 bundle.
Firmware version=4.3_146928-01_2011.06.03.14.41, revision=2
The Sun Ray server connection is 1 Gb/s; the clients are primarily Sun Ray 2s.

Essentially, I had an unmanaged 100 Mb/s Netgear switch that worked great; the user experience was fantastic even while videos were playing (avi, mp4, etc.). When I changed to a managed 1 Gb/s Netgear switch, performance deteriorated considerably. I can't play videos; the experience is simply horrible, with just lines across the screen when the Sun Ray is under load, such as when playing videos.

I saw some posts regarding similar issues, and it looked like the outcome was either the switch or a potential bug?

utcapture indicates:
With the 100 Mb/s switch in place: no packet loss or latency
With the 1 Gb/s switch in place: packet loss around 10% (it got to 16% at one point), latency over 1.0

Any info would be great. It looks like I may need to dumb down the switch?

___
SunRay-Users mailing list
SunRay-Users@filibeto.org
http://www.filibeto.org/mailman/listinfo/sunray-users

Hi Kevin,

we had a lot of problems with performance, too. As Craig already mentioned, it is heavily related to poor handling of UDP traffic by the switches. All DTUs prior to the Sun Ray 3 operate at 100 Mb/s or less. If the server interface operates at 1 Gb/s, the switch has to buffer UDP packets until the DTU is able to receive more packets. If the switch isn't capable of buffering the UDP stream, you will get lost packets.

We finally added a dedicated Ethernet interface to the servers for the Sun Ray interconnect. This interface is limited to 100 Mb/s (we reduced the speed at the switch port, as we had difficulty reducing the speed reliably on the Linux side). Since we set it up like this, the performance issues have (mostly) gone away. As most servers are equipped with two (or more) NICs nowadays, it shouldn't be a big deal.

Carsten
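As a rough illustration of the buffering problem Carsten describes, the sketch below estimates how long a switch buffer lasts when the server bursts at 1 Gb/s towards a DTU that drains at 100 Mb/s. The buffer size is a made-up figure chosen only for illustration (real per-port buffers vary by switch and are not given anywhere in this thread), and the function name is hypothetical.

```python
def seconds_until_overflow(buffer_bits, ingress_bps, egress_bps):
    """Return how long a continuous burst can last before the buffer is full.

    The buffer fills at the difference between the ingress and egress rates.
    If the egress side keeps up, the buffer never overflows.
    """
    if ingress_bps <= egress_bps:
        return float("inf")
    return buffer_bits / (ingress_bps - egress_bps)


# ASSUMPTION: a 1 Mbit buffer, picked purely for illustration.
ASSUMED_BUFFER_BITS = 1_000_000

fill_time = seconds_until_overflow(
    ASSUMED_BUFFER_BITS,
    ingress_bps=1_000_000_000,  # server port at 1 Gb/s
    egress_bps=100_000_000,     # DTU port at 100 Mb/s
)
print(f"buffer full after about {fill_time * 1000:.2f} ms of sustained burst")
```

With matched port speeds (100 Mb/s in, 100 Mb/s out) the same function returns infinity, which is one way to read why the dedicated 100 Mb/s interface helped.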
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
To follow up on this, I suffered the same problem myself. I found that disabling flow control on the switch ports improved performance dramatically.

Hope this is useful.

Cheers
Cj

On 27 Sep 2012, at 04:10, Craig Bender craig.ben...@oracle.com wrote:

Very few (enterprise) switches handle UDP buffering well. This limits the rate to prevent overrunning the switch. In essence, it's granting a lower rate. At a higher rate, switches tend to drop the UDP stream. Ironically, the more expensive the switch, the more apt this is to happen. Cheap switches just forward stuff on; they never buffer.

On 9/26/12 7:50 PM, David Bullock wrote:

On 27 September 2012 11:30, Craig Bender craig.ben...@oracle.com wrote:

Try adding set hires_tick = 1 to /etc/system and reboot

Hi Craig,

you seem to be referring to lore written up in section 18.11.4.1 of http://docs.oracle.com/html/E22661_15/Troubleshooting-Performance.html where it mentions "The X server is allowed to send at a certain specific rate granted by the Sun Ray Client."

So, does setting hires_tick on the server ultimately cause the Sun Ray Client to 'grant' a higher rate? Or does it affect only the server, so that it delivers data in a smoother (less bursty) fashion (fill, drain, fill, drain instead of fill, fill, drain, drain, where the switch can only take so many un-drained fills before dropping a packet)?

Assuming the latter, is it preferable to have a switch which can handle the buffering, or to set the hires timer?

Thanks,
David.
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
On 09/26/2012 08:30 PM, Craig Bender wrote:

Try adding set hires_tick = 1 to /etc/system and reboot

I thought this is only for solaris

Karl
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
Hi folks,

I was (rightfully) questioned about using the term "lower rate". I'll attempt to clarify, as that's incorrect.

What the hires_tick setting does is reduce the interval at which data is sent. Same amount of data, just sent in smaller chunks at shorter intervals. While the following numbers are for illustration only, a server that was trying to send 200 Mb each second would now send 20 Mb every tenth of a second with hires_tick set. The end result is actually a *higher* effective rate, since it's sustainable and doesn't fill up the switch's buffer.

Switches only have a fixed amount of memory to buffer with. So in the above scenario, if your switch can only buffer 100 Mb, you're in trouble from the start because you've sent more than it can handle at once. With hires_tick set, the buffer won't be overrun, since each chunk is well within the switch's capacity to buffer. When the buffer is overrun, packet loss can occur, which makes everything worse as things try to back off and negotiate a rate where it doesn't happen.

Flow control can theoretically help by telling the sender to wait, but it's not always implemented/honored correctly from vendor to vendor, nor is it always enabled by default, usually because it's not always implemented/honored correctly ;)

A switch with a 1 Gb/s ingress and a 100 Mb/s egress will always buffer; hires_tick reduces the size of what it has to buffer. 100 Mb/s ingress to 100 Mb/s egress: no buffering. Thus my statement about enterprise switches was misleading at best.

Regardless, hires_tick is an easy tunable that doesn't require you to mess with switches, clients, etc. Give it a try.
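The effect Craig describes (same data, smaller chunks, shorter intervals) can be sketched with a toy model. This is only an illustration under stated assumptions: each chunk is treated as arriving at the switch all at once, the buffer then drains at the egress rate until the next chunk, and the load, buffer size, and function name are all hypothetical values chosen so that the offered load stays below the egress rate.

```python
def dropped_megabits(load_mb_per_s, sends_per_s, buffer_mb, egress_mb_per_s, seconds=1):
    """Toy model: total megabits dropped by a switch buffer over `seconds`.

    Each send deposits one chunk into the buffer instantly (a simplification),
    anything beyond the buffer capacity is dropped, and the buffer then drains
    at the egress rate until the next send.
    """
    chunk = load_mb_per_s / sends_per_s      # size of each burst
    drain = egress_mb_per_s / sends_per_s    # what leaves between bursts
    queued = 0.0
    dropped = 0.0
    for _ in range(seconds * sends_per_s):
        queued += chunk
        if queued > buffer_mb:
            dropped += queued - buffer_mb
            queued = buffer_mb
        queued = max(0.0, queued - drain)
    return dropped


# ASSUMED figures: 80 Mb offered per second, 50 Mb of buffer, 100 Mb/s egress.
coarse = dropped_megabits(80, sends_per_s=1, buffer_mb=50, egress_mb_per_s=100)
fine = dropped_megabits(80, sends_per_s=10, buffer_mb=50, egress_mb_per_s=100)
print(f"one burst per second: {coarse} Mb dropped")
print(f"ten bursts per second: {fine} Mb dropped")
```

In this model the single large burst overruns the buffer while the ten smaller ones never do, even though the total data per second is identical. It says nothing about how any particular switch actually manages its memory.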
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
And that's where I missed the mention of Ubuntu. Sorry.

You can check what it's running at by doing:

cat /boot/config-kernel | grep HZ

where kernel is the version you are running.

Examples:

Ubuntu Jaunty
$ cat /boot/config-2.6.28-11-generic | grep HZ
# CONFIG_HZ_1000 is not set
# CONFIG_HZ_300 is not set
CONFIG_MACHZ_WDT=m
CONFIG_NO_HZ=y
CONFIG_HZ=250
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y

Oracle EL 5
$ cat /boot/config-2.6.18-164.el5 | grep CONFIG_HZ
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000

For Ubuntu, you'll want to investigate make menuconfig to compile the kernel with a different timer frequency if it is not set to 1000. The relevant option is "Timer frequency" under "Processor type and features".

There are also other settings on modern kernels that you can look into that may yield what you want, such as CONFIG_HIGH_RES_TIMERS, without having to change the frequency of the clock. There's even an option to go tick-less with CONFIG_NO_HZ. I don't have experience using either of these, but they sound interesting based on a little research.
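For anyone who prefers to script the check Craig describes, here is a minimal sketch that pulls the CONFIG_HZ value out of kernel config text and converts it to a tick interval. The function name is hypothetical, and the sample text is simply the Ubuntu Jaunty output quoted in the message above.

```python
import re


def timer_hz(config_text):
    """Return the CONFIG_HZ value from kernel config text, or None if absent.

    Only the exact `CONFIG_HZ=<number>` line is matched, so lines such as
    `CONFIG_HZ_250=y` or `CONFIG_NO_HZ=y` are ignored.
    """
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Sample taken from the Ubuntu Jaunty example in this thread.
JAUNTY_SAMPLE = """\
# CONFIG_HZ_1000 is not set
# CONFIG_HZ_300 is not set
CONFIG_MACHZ_WDT=m
CONFIG_NO_HZ=y
CONFIG_HZ=250
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
"""

hz = timer_hz(JAUNTY_SAMPLE)
tick_ms = 1000 / hz
print(f"CONFIG_HZ={hz}, one tick every {tick_ms} ms")
```

On a live system the text could be read from the /boot/config file for the running kernel instead of the embedded sample, assuming the distribution installs that file. At 250 Hz a tick is 4 ms; at 1000 Hz it is 1 ms.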
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
Dear Craig and All,

We here at Aberystwyth have also suffered fairly bad Sun Ray performance for a long time. [By the way, in the end I abandoned using Sun Ray with Solaris 11: while I got it working, there were lots of niggling problems and time ran out...]

I have hires_tick set. I have the two servers [two T5140s] connected to ports on exactly the same switch as our classroom of 40 Sun Ray 2 units. [I also have a handful of Sun Rays connected via another linked switch, such as the one I am using as I type.] I allow the servers to auto-negotiate with the switch, but I tell the switch to offer only 100 Mb/s full duplex as part of the auto-negotiation, and indeed that's how the two ports connecting to the T5140 servers are shown when I look at the current state on the switch or on the servers. I have wire-frame window moves set. The switch is a Dell PowerConnect 5448.

Screen interactivity can get quite poor when we have more than 20ish students using the system. The two T5140s often still show as 90%+ idle on all CPUs, even though clicking windows in tools like NetBeans or Oracle Studio, or moving/clicking tabs in Firefox, can be very slow. We had a student earlier using a PC with Xming, coming in via the other ports on the T5140s, who reported massively better performance than when using the Sun Rays.

Does anyone use a Dell PowerConnect 5448? Does anyone have a list of known good/bad switches to use in a Sun Ray network? Any other bright ideas/suggestions?

Thanks,
Dave Price

--
Dave Price, Email: d...@aber.ac.uk
PHONE: +44 1970 622428 FAX: +44 1970 628536
Post: Computer Science, Aberystwyth University, Penglais, Aberystwyth, WALES, UK, SY23 3DB.
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
We use Cisco Catalyst 2970 switches for the servers, though in the client areas 2960s and 2970s are used, too. There was a point in time where performance on the 2970s was really bad (video and everything else was very 'glitchy') unless forced compression on the DTUs was turned on. The servers connected to the 2970s are forced into 1 Gb/s mode, and everything else downstream is set to auto-negotiate. No problem existed when the servers were connected to 2960s.

When we first ran into the issue, I opened a ticket with Sun and Cisco about it (probably on this list, too). Sun gave a bunch of help... Cisco said, ALP?--NOT US! We basically did everything described in this thread: hires_tick = 1, turn off flow control, check switch logs for UDP buffer usage (ours weren't buffering UDP at all), using sunray gather, etc. The solution on our end just came down to leaving forced compression on, since nothing seemed obviously wrong on the servers or switches.

Over time, we did the usual Sun Ray server and firmware updates, network ops did their switch IOS updates, and this problem just went away. Forced compression on or off, performance is just fine.

Matt
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
When dealing with the interaction of Sun and Cisco equipment, ensure the Sun-connected ports are set to Host mode (i.e., disable arbitration for channeling and trunking). The arbitration timeouts caused by the Cisco equipment attempting to arbitrate protocols that the Suns know nothing about cause nothing but headaches.

Note: Cisco ships all equipment with all ports set to arbitrate everything.
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
Hi Kevin,

The most likely cause is that the new switch is not successfully negotiating full duplex on the Ethernet link. That really kills throughput and ... potentially leads to collisions (which gets you packet loss). You should force both ends of the relevant links to full duplex.

David Bullock
Machaira Enterprises Pty Ltd
PO Box 31 Canowindra NSW 2804
02 9043 7200
http://machaira.com.au/
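Since Kevin's server runs Ubuntu, one way to check the server side of David's duplex theory is to read the negotiated speed and duplex that the Linux kernel exposes under /sys/class/net. The sketch below is a minimal example under stated assumptions: the interface name eth0 and both function names are hypothetical, and the demonstration runs against a fake directory tree so it works on any machine. It does not check the switch end of the link.

```python
import tempfile
from pathlib import Path


def link_state(interface, sysfs_root="/sys/class/net"):
    """Read the negotiated speed (Mb/s) and duplex mode for an interface."""
    base = Path(sysfs_root) / interface
    speed = int((base / "speed").read_text().strip())
    duplex = (base / "duplex").read_text().strip()
    return speed, duplex


def looks_suspect(speed_mbps, duplex, expected_mbps):
    """Flag a link that is not full duplex or not at the expected speed."""
    return duplex != "full" or speed_mbps != expected_mbps


# Demonstration against a fake sysfs tree (the values are invented).
with tempfile.TemporaryDirectory() as fake_root:
    fake_iface = Path(fake_root) / "eth0"
    fake_iface.mkdir()
    (fake_iface / "speed").write_text("1000\n")
    (fake_iface / "duplex").write_text("half\n")
    demo_speed, demo_duplex = link_state("eth0", sysfs_root=fake_root)

print(demo_speed, demo_duplex, looks_suspect(demo_speed, demo_duplex, 1000))
```

On a real server, calling link_state with the actual interface name and the default sysfs_root reads the live values; the read can fail if the link is down.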
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
Try adding set hires_tick = 1 to /etc/system and reboot
Re: [SunRay-Users] Performance tanks after upgrading to 1GB switch, had 100MB
On 27 September 2012 11:30, Craig Bender craig.ben...@oracle.com wrote:

Try adding set hires_tick = 1 to /etc/system and reboot

Hi Craig,

you seem to be referring to lore written up in section 18.11.4.1 of http://docs.oracle.com/html/E22661_15/Troubleshooting-Performance.html where it mentions "The X server is allowed to send at a certain specific rate granted by the Sun Ray Client."

So, does setting hires_tick on the server ultimately cause the Sun Ray Client to 'grant' a higher rate? Or does it affect only the server, so that it delivers data in a smoother (less bursty) fashion (fill, drain, fill, drain instead of fill, fill, drain, drain, where the switch can only take so many un-drained fills before dropping a packet)?

Assuming the latter, is it preferable to have a switch which can handle the buffering, or to set the hires timer?

Thanks,
David.