Re: Performance Issues (Was Re: RB server upgrade from 1.6.1 to 1.7.4)
On Fri, Mar 7, 2014 at 3:36 PM, Ze Lin Xiao <ilacknormal...@gmail.com> wrote:

Sorry for the late response. I missed this reply.

For Apache settings, the worker and prefork configurations are exactly the same between the two VMs:

    # prefork MPM
    # StartServers: number of server processes to start
    # MinSpareServers: minimum number of server processes which are kept spare
    # MaxSpareServers: maximum number of server processes which are kept spare
    # ServerLimit: maximum value for MaxClients for the lifetime of the server
    # MaxClients: maximum number of server processes allowed to start
    # MaxRequestsPerChild: maximum number of requests a server process serves
    <IfModule prefork.c>
    StartServers          8
    MinSpareServers       5
    MaxSpareServers      20
    ServerLimit         256
    MaxClients          256
    MaxRequestsPerChild 100
    </IfModule>

    # worker MPM
    # StartServers: initial number of server processes to start
    # MaxClients: maximum number of simultaneous client connections
    # MinSpareThreads: minimum number of worker threads which are kept spare
    # MaxSpareThreads: maximum number of worker threads which are kept spare
    # ThreadsPerChild: constant number of worker threads in each server process
    # MaxRequestsPerChild: maximum number of requests a server process serves
    <IfModule worker.c>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
    </IfModule>

*Symptoms from our old production vm:*

Previously, only about 2-3 times a day at random times, we would get a buildup of Apache processes that would hit the server at the same time, which resulted in the load average in top going up to 100-200 in the worst-case scenario. During this time, any operations done on the website were extremely slow, and users would often report not receiving an email after a publish. Since then, we've increased the sendmail Queue and Refuse limits from their default values of 12 and 15 to 20 and 150, respectively.
The stats on this vm were:

- 10 GB RAM
- 23 GB Swap
- 4 Cores
- RHEL 5.3 - Red Hat Enterprise Linux Server release 5.3 (Tikanga)
- Server version: Apache/2.2.3

*Symptoms on our new production vm:*

We moved production to RHEL 6.4, as recommended by our IT team, and have since noticed that we get these stalled processes more often and that users notice the performance hits much more. Another odd thing we noticed from our performance monitoring tool, Zenoss, is that write IO spikes every 5 minutes, which was not the case previously (screenshots attached).

The stats on this vm are:

- 16 GB RAM
- 4 GB Swap
- 4 Cores
- Red Hat Enterprise Linux Server release 6.4 (Santiago)
- Server version: Apache/2.2.15 (Unix)

We're trying a lot of different things on our end, but if you have any ideas, or if anyone has seen this issue, it would help.

https://lh4.googleusercontent.com/-L5GVIinnSbo/UxpXxkdzJDI/Ckg/Aq9DYWW6AXs/s1600/Screen+Shot+2014-03-07+at+3.35.13+PM.png
https://lh6.googleusercontent.com/-7FD3HifikjY/UxpXjy3QUwI/CkY/BFO-tGDUwoE/s1600/Screen+Shot+2014-03-07+at+3.33.09+PM.png

Ze
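One rough way to reason about the prefork numbers in this message: with MaxClients 256, the worst case is 256 Apache children resident at once. Below is a minimal sketch of how that compares to the RAM on each VM. The per-process size and the reserved amount are purely hypothetical placeholders, since the thread gives no measurement; on the real server they would come from something like ps or top.

```python
# Rough worst-case memory estimate for the prefork MPM settings quoted above.
# ASSUMPTION: rss_mb_per_child and reserved_mb are made-up placeholders, not
# measurements from this thread. Replace them with observed values.


def worst_case_mb(max_clients, rss_mb_per_child):
    """Memory needed if every allowed child process is alive at once."""
    return max_clients * rss_mb_per_child


def max_clients_that_fit(ram_mb, rss_mb_per_child, reserved_mb=0):
    """Largest MaxClients that stays within RAM after reserving some memory
    for the OS, the database, memcached, and so on."""
    usable = ram_mb - reserved_mb
    return max(usable // rss_mb_per_child, 0)


if __name__ == "__main__":
    rss = 100        # hypothetical MB per httpd child
    reserved = 4096  # hypothetical MB kept for everything else
    for name, ram_gb in (("old vm", 10), ("new vm", 16)):
        ram_mb = ram_gb * 1024
        need = worst_case_mb(256, rss)
        fit = max_clients_that_fit(ram_mb, rss, reserved_mb=reserved)
        print("%s: worst case %d MB of %d MB RAM; about %d children fit"
              % (name, need, ram_mb, fit))
```

If the measured per-child size turns out to be anywhere near the placeholder, 256 children would not fit in RAM on either VM, and the new VM has much less swap (4 GB vs 23 GB) to absorb the overflow. That is only a possibility to check, not a diagnosis.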
Re: Performance Issues (Was Re: RB server upgrade from 1.6.1 to 1.7.4)
Hi Ze,

The spikes every 5 minutes are interesting. Sounds like a cronjob or something, perhaps? Are you using search indexing? What are you using for the database? Remind me what version of RB you guys are using?

- Christian

--
Christian Hammond - chip...@chipx86.com
Review Board - http://www.reviewboard.org
Beanbag, Inc. - http://www.beanbaginc.com
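One way to follow up on the cronjob theory is to list every crontab entry whose minute field fires exactly every 5 minutes. The sketch below assumes standard five-field user crontab syntax. The sample entries and command paths are invented for illustration; on a real host the input would be the output of `crontab -l` for each user. For files under /etc/cron.d, which carry an extra user field, the user name will show up at the start of the reported command.

```python
# Find crontab entries that run every 5 minutes.
# Only the minute field is checked; hour/day restrictions are ignored.

EVERY_FIVE = set(range(0, 60, 5))


def expand_minutes(field):
    """Expand a crontab minute field such as '*/5' or '0,5,10-20/5' to a set."""
    minutes = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            start, end = 0, 59
        elif "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        minutes.update(range(start, end + 1, step))
    return minutes


def five_minute_jobs(crontab_text):
    """Return the commands whose minute field is exactly 0,5,10,...,55."""
    jobs = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not a schedule line, e.g. MAILTO=root
        try:
            minutes = expand_minutes(fields[0])
        except ValueError:
            continue  # e.g. '@reboot' or a variable assignment
        if minutes == EVERY_FIVE:
            jobs.append(fields[5])
    return jobs


# Hypothetical entries, for illustration only.
SAMPLE = """\
MAILTO=root
*/5 * * * * /usr/local/bin/hypothetical-index-update
0 3 * * * /usr/local/bin/hypothetical-nightly-backup
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/bin/hypothetical-poller --quiet
"""

if __name__ == "__main__":
    for command in five_minute_jobs(SAMPLE):
        print(command)
```

Any command this reports is only a candidate. Whether it actually causes the write spikes would still need to be confirmed, for example by disabling it briefly and watching the IO graph.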
Performance Issues (Was Re: RB server upgrade from 1.6.1 to 1.7.4)
On Thu, Mar 6, 2014 at 5:20 PM, Christian Hammond <chip...@chipx86.com> wrote:

Hi Ze,

Those warnings are probably unrelated. I want to get a better sense of the performance problems.

First thing I want to check is that your server is properly accessing and using memcached. If you log into the admin UI, do you see any stats on memcached, and any keys stored in the cache?

Christian

--
Christian Hammond - chip...@chipx86.com
Review Board - http://www.reviewboard.org
Beanbag, Inc. - http://www.beanbaginc.com

On Thu, Mar 6, 2014 at 4:52 PM, Ze Lin Xiao <ilacknormal...@gmail.com> wrote:

Hi Christian,

We're facing some pretty bad performance issues on our production system after we moved our application to a different VM with RHEL 6.4. We notice that our performance issues occur especially when the log shows this:

[Fri Mar 07 00:18:19 2014] [error] /opt/software/lib/python2.7/site-packages/pycrypto-2.6.1-py2.7-linux-x86_64.egg/Crypto/Util/number.py:57: PowmInsecureWarning: Not using mpz_powm_sec. You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.

However, it is important to note that we've seen these warnings for the last 1.5 years, so I doubt they are related. Nonetheless, do you know what specific operations would trigger this warning? I'm trying to see if I can reproduce the performance spikes.

Thanks,
Ze

On Wednesday, February 6, 2013 12:22:49 AM UTC-8, Christian Hammond wrote:

Hi Chuck,

Sorry for failing to respond to the previous e-mail. Missed it.

I haven't seen that particular warning before. It'll probably have a log entry any time pycrypto is imported. What distro/version are you using? Sounds like maybe it's an older one? You may need to hand-upgrade libgmp, I'm not sure.

From your previous e-mail:

Doing a site backup never hurts, but generally isn't important. Review Board won't delete any files. At most, it'd add some new directories and tell you to change permissions, but I don't think we've done that since 1.5.
We have provided instructions on other sorts of manual updates that need to be made, though.

We don't have any documentation right now on p4python's SSL support. This is only needed if you're using SSL-backed Perforce repositories. It's unfortunately not something we can automate well right now, but essentially, you'd have to install OpenSSL 1.0.1 on your distro and install its development package (I don't know if newer versions work -- hopefully other 1.0.x releases do). You'd then need to manually compile/install p4python. Yes, it's a pain, but it's something Perforce will need to make easier for us.

From the e-mail you just posted while I was replying to this, you'd need to check the reviewboard.log file and see what error it's reporting before I can say what happened.

Christian

--
Christian Hammond - chi...@chipx86.com
Review Board - http://www.reviewboard.org
VMware, Inc. - http://www.vmware.com

On Feb 6, 2013, at 12:10 AM, chuck j <cjerr...@gmail.com> wrote:

Hi Christian,

I would like to thank you for your response about the upgrade. I went through your comments and was able to bring my server to 1.7.4.

I also want to bring to your notice the warning below, which I got while upgrading my site:

/usr/local/lib/python2.7/site-packages/pycrypto-2.6-py2.7-linux-x86_64.egg/Crypto/Util/number.py:57: PowmInsecureWarning: Not using mpz_powm_sec. You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.

How do I resolve this? Do I need to rebuild libgmp as the message suggests, and will doing so make the RB server report more issues?

Thanks,
-Chuck

On Fri, Feb 1, 2013 at 6:58 PM, chuck j <cjerr...@gmail.com> wrote:

Thanks, Christian, for the response.

Good to hear that an upgrade from RB 1.6.1 to 1.7.4 is possible. Apart from the database backup, is there anything else we need to take care of that could disturb our production setup? In case of any issue, we should be able to go back to our original state. If you could point us to the action items, it would be really great.
Few queries though:

1. How does the upgrade take place? Does it replace files one by one (I mean Python scripts etc.), apart from the db?

2. The release notes for 1.7.2 mention the following:

However, this requires that p4python is specially compiled with OpenSSL support, and that the system has development headers for OpenSSL 1.0.1. P4PythonInstaller doesn’t do this, so users who need this feature will currently have to compile p4python manually, providing the path to the SSL directory using --ssl

Do we have any tech note for the above steps that the end user needs to perform?

Cheers,
Chuck

On Thu, Jan 31, 2013 at 2:50 PM, Christian Hammond <chi...@chipx86.com> wrote:

Hi Chuck,

I always recommend backing up your database first, but you should be able to upgrade from 1.6.1 to 1.7.4 without any real problems.

There is a bug that some people hit a while back in older versions that introduced some stale upgrade data in the database. I meant to get a
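To check whether the PowmInsecureWarning entries really line up with the slow periods, one option is to count them per minute in the Apache error log and compare the busiest minutes against the load graph. This is a minimal sketch, assuming the log uses the timestamp format of the entry quoted earlier in this message. The sample lines are made up apart from that format, and the function name is just for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the leading '[Fri Mar 07 00:18:19 2014]' timestamp of an error log line.
STAMP = re.compile(r"^\[(\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4})\]")


def warnings_per_minute(lines, needle="PowmInsecureWarning"):
    """Count log lines containing `needle`, bucketed by minute."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = STAMP.match(line)
        if not match:
            continue  # line without a recognizable timestamp
        when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        counts[when.replace(second=0)] += 1
    return counts


# Made-up sample lines in the same format as the real log entry.
SAMPLE = [
    "[Fri Mar 07 00:18:19 2014] [error] number.py:57: PowmInsecureWarning: Not using mpz_powm_sec.",
    "[Fri Mar 07 00:18:42 2014] [error] number.py:57: PowmInsecureWarning: Not using mpz_powm_sec.",
    "[Fri Mar 07 00:19:03 2014] [error] some unrelated message",
    "[Fri Mar 07 00:23:01 2014] [error] number.py:57: PowmInsecureWarning: Not using mpz_powm_sec.",
]

if __name__ == "__main__":
    for minute, count in sorted(warnings_per_minute(SAMPLE).items()):
        print("%s %d" % (minute.strftime("%Y-%m-%d %H:%M"), count))
```

If Christian's guess is right that the warning is logged whenever pycrypto is imported, a burst of warnings would mark a burst of freshly started processes rather than a cause of the slowdown. On a real server the input would be the open error log file instead of SAMPLE.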
Re: Performance Issues (Was Re: RB server upgrade from 1.6.1 to 1.7.4)
On Thursday, March 6, 2014, Ze Xiao <ilacknormal...@gmail.com> wrote:

Thanks for the quick reply.

Yes, memcached is running. Here is what I see from the Admin Server Cache page. I've got it running on two different VMs, which I've obfuscated as VM1 and VM2.

SERVER CACHE

Cache backend: django.core.cache.backends.memcached.CacheClass

vm1
    Memory usage: 1.8 GB
    Keys in cache: 61079 of 257077
    Cache hits: 5289571 of 5458860: 96%
    Cache misses: 169289 of 5458860: 3%
    Cache evictions: 139881
    Cache traffic: 10.2 GB in, 27.9 GB out
    Uptime: 3683047 seconds

vm2
    Memory usage: 1.8 GB
    Keys in cache: 54978 of 401980
    Cache hits: 5999634 of 6277198: 95%
    Cache misses: 277564 of 6277198: 4%
    Cache evictions: 307751
    Cache traffic: 16.8 GB in, 26.2 GB out
    Uptime: 938019 seconds
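The percentages on the admin page appear to be truncated to whole numbers, and the raw eviction counts are hard to compare because the two servers have very different uptimes. The short sketch below recomputes the ratios and normalizes evictions per hour using the numbers from the cache page above. The dictionary layout and function name are just for illustration.

```python
# Recompute memcached ratios from the figures posted in the message above.

STATS = {
    "vm1": {"hits": 5289571, "misses": 169289,
            "evictions": 139881, "uptime_s": 3683047},
    "vm2": {"hits": 5999634, "misses": 277564,
            "evictions": 307751, "uptime_s": 938019},
}


def summarize(stats):
    """Return hit/miss percentages and evictions per hour of uptime."""
    total = stats["hits"] + stats["misses"]
    hours = stats["uptime_s"] / 3600.0
    return {
        "hit_pct": 100.0 * stats["hits"] / total,
        "miss_pct": 100.0 * stats["misses"] / total,
        "evictions_per_hour": stats["evictions"] / hours,
    }


if __name__ == "__main__":
    for name in sorted(STATS):
        s = summarize(STATS[name])
        print("%s: %.2f%% hits, %.2f%% misses, %.1f evictions/hour"
              % (name, s["hit_pct"], s["miss_pct"], s["evictions_per_hour"]))
```

On these numbers the hit rates are close, but vm2 evicts several times as often per hour as vm1. That may be worth a look, for example whether the cache is simply too small for vm2's workload, though this is an inference from the arithmetic and not something established in the thread.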
Re: Performance Issues (Was Re: RB server upgrade from 1.6.1 to 1.7.4)
On Thursday, March 6, 2014 8:21:00 PM UTC-8, Christian Hammond wrote:

Okay, well, I was hoping it'd be simple :)

Can you give me some examples of operations that are very slow, and operations that remain fast? Or does everything basically slow to a grind?

How do the Apache settings (worker vs prefork, and their config) compare between installs?

Christian

--
Get the Review Board Power Pack at http://www.reviewboard.org/powerpack/
---
Sign up for Review Board hosting at RBCommons: https://rbcommons.com/
---
Happy user? Let us know at http://www.reviewboard.org/users/
---
You received this message because you are subscribed to the Google Groups reviewboard group.
To unsubscribe from this group and stop receiving emails from it, send an email to