Re: [PERFORM] Postgres performance
Hi, there are several performance-related factors, so it's rather difficult to answer your question briefly. You have to keep in mind not only Postgres itself; hardware is also an important factor. Do you have specific performance problems that you can describe in more detail? Regards, GERD. On 28.09.2009 at 08:13, std pik wrote: Hi all.. please, how can I tune Postgres performance? Thanks. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
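Since the original question was left open in the thread, a minimal sketch of the usual first tuning knobs may help. These are illustrative starting points for a dedicated machine, not recommendations; the right values depend entirely on RAM, workload, and PostgreSQL version:

```ini
# postgresql.conf -- illustrative starting points for a dedicated 8GB server
shared_buffers = 2GB             # commonly ~25% of RAM on a dedicated box
effective_cache_size = 6GB       # planner hint: RAM the OS can use for caching
work_mem = 32MB                  # per-sort/hash memory; multiplies per connection
checkpoint_segments = 16         # fewer, larger checkpoints for write-heavy loads
maintenance_work_mem = 256MB     # speeds up VACUUM and index builds
```

After any change, measure with EXPLAIN ANALYZE on the actual slow queries before reaching for more hardware.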
Re: [PERFORM] Confusion on shared buffer
On Sun, Oct 4, 2009 at 6:32 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Oct 3, 2009 at 2:11 AM, S Arvind arvindw...@gmail.com wrote: Thanks Robert. So for our scenario, what is the most important factor to note for performance? Tough to say without benchmarking, but if you have a lot of small databases that easily fit in RAM, and a lot of concurrent connections, I would think you'd want to spend your hardware $ on maximizing the number of cores. But there are many in this forum who have much more experience with these things than me, so take that with a grain of salt... (You might also want to look at consolidating some of those databases - maybe use one database with multiple schemas - that would probably help performance significantly.) I am not sure I understand the reasoning behind it! As long as they are different objects, how would it help performance if tables are stored in separate schemas rather than in separate databases? Or are you referring to the difference in size of the system tables, and the performance improvement resulting from keeping all metadata in a single catalog? Best regards, -- Let's call it Postgres gurjeet[.sin...@enterprisedb.com EnterpriseDB http://www.enterprisedb.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
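A hedged sketch of the consolidation Robert suggests - folding many small per-customer databases into one database with one schema each (all names here are made up for illustration):

```sql
-- One database, one schema per former database
CREATE SCHEMA customer_a;
CREATE SCHEMA customer_b;

CREATE TABLE customer_a.orders (id serial PRIMARY KEY, total numeric);
CREATE TABLE customer_b.orders (id serial PRIMARY KEY, total numeric);

-- Each connection picks its schema; application queries stay unqualified
SET search_path TO customer_a;
SELECT count(*) FROM orders;   -- resolves to customer_a.orders
```

Beyond smaller combined catalogs, the practical win is that one connection pool can serve every customer, instead of holding idle connections open per database.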
Re: [PERFORM] Best suiting OS
On 10/01/2009 03:44 PM, Denis Lussier wrote: I'm a BSD license fan, but I don't know much about *BSD otherwise (except that many advocates say it runs PG very nicely). On the Linux side, unless you're a dweeb, go with a newer, popular, well-supported release for production. IMHO, that's RHEL 5.x or CentOS 5.x. Of course the latest SLES / Ubuntu schtuff are also fine. In other words, unless you've got a really good reason for it, stay away from Fedora / OpenSUSE for production usage. Lots of conflicting opinions and results in this thread. Also, a lot of hand waving and speculation. :-) RHEL and CentOS are particularly bad *right now*. See here: http://en.wikipedia.org/wiki/RHEL http://en.wikipedia.org/wiki/CentOS For RHEL, look down to Release History and RHEL 5.3 based on Linux-2.6.18, released March 2007. On the CentOS page you'll see it is dated April 2007. CentOS is identical to RHEL on purpose, but always 1 to 6 months behind RHEL, since they take the RHEL source, re-build it, and then re-test it. Linux is up to Linux-2.6.31.1 right now: http://www.kernel.org/ So any comparisons between operating system *distributions* should be fair. Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL / CentOS are basically out of the running right now, because they are so old. However, RHEL 6.0 should be out in January or February 2010, at which point it will be relevant again. Personally, I use Fedora, and my servers have been quite stable. One of our main web servers running Fedora: [m...@bambi]~% uptime 09:45:41 up 236 days, 10:07, 1 user, load average: 0.02, 0.04, 0.08 It was last rebooted as a scheduled reboot, not a crash. This isn't to say Fedora will be stable for you - however, having used both RHEL and Fedora, in many different configurations, I find RHEL being 2+ years behind in terms of being patch-current means that it is NOT as stable on modern hardware.
Most recently, I installed RHEL on a 10-machine cluster of HP nodes, and RHEL would not detect the SATA controller out of the box; it reverted to some base PIO mode yielding 2 MByte/s disk speed. Fedora was able to achieve 112 MByte/s on the same disks. Some twiddling of grub.conf allowed RHEL to achieve the same speed, but the point is that there are two ways to de-stabilize a kernel. One is to use the leading edge; the other is to use the trailing edge. Using an operating system designed for 2007 on hardware designed in 2009 is a bad idea. Using an operating system designed for 2009 on 2007 hardware might also be a bad idea. Using a modern operating system on the modern hardware it was designed for will give you the best performance potential. In this case, Fedora 11 with Linux 2.6.30.8 is almost guaranteed to out-perform RHEL 5.3 with Linux 2.6.18. Where Fedora is within 1 to 6 months of the leading edge, Ubuntu is within 3 to 9 months of the leading edge, so Ubuntu will perform more similarly to Fedora than to RHEL. I've given up on the OS war. People use what they are comfortable with. Comfort lasts until the operating system screws a person over, at which point they hate it and switch to something else. It's about passion - not about actual metrics, capability, reliability, or any of these other supposed criteria. In my case, I'm comfortable with Red Hat, because I've used it since the '90s, and because I've seen how they hire some of the best open source developers and contribute quality releases back to the community, specifically including our very own Tom Lane and, until recently, Alan Cox. Many of their employees have high fame in the open source / Linux arena. As a result, my passion is for Red Hat-based releases. As I described earlier, RHEL is too old right now - and I am looking forward to RHEL 6.0 catching up to the rest of the world again. I've found Fedora to be a great alternative when I do need to be on the leading edge.
So - use what you want - but try not to pretend this isn't about passion. Even between BSD and Linux, I understand they re-use drivers, or at least knowledge. It's software that runs your computer. If one OS can give you +2% performance over another for a 3-month period - big deal - the available hardware can get a lot better than that +2% by spending a tiny bit more money, or just waiting 3 months. In the case of RHEL, waiting about 4 months will give you RHEL 6.0, which will be within 3 to 9 months of the leading edge. If you are planning a new deployment - this might be something to consider. I suggest not going with RHEL 5 at this time... Cheers, mark -- Mark Mielke m...@mielke.cc
Re: [PERFORM] Best suiting OS
On Sun, 2009-10-04 at 10:05 -0400, Mark Mielke wrote: RHEL and CentOS are particularly bad *right now*. See here: http://en.wikipedia.org/wiki/RHEL http://en.wikipedia.org/wiki/CentOS For RHEL, look down to Release History and RHEL 5.3 based on Linux-2.6.18, released March 2007. On the CentOS page you'll see it is dated April 2007. CentOS is identical to RHEL on purpose, but always 1 to 6 months behind RHEL, since they take the RHEL source, re-build it, and then re-test it. Linux is up to Linux-2.6.31.1 right now: http://www.kernel.org/ So any comparisons between operating system *distributions* should be fair. Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL / CentOS are basically out of the running right now, because they are so old. Some people call this stability. -- Devrim GÜNDÜZ, RHCE Command Prompt - http://www.CommandPrompt.com devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [PERFORM] Best suiting OS
On Sun, 4 Oct 2009, Devrim GÜNDÜZ wrote: On Sun, 2009-10-04 at 10:05 -0400, Mark Mielke wrote: RHEL and CentOS are particularly bad *right now*. See here: http://en.wikipedia.org/wiki/RHEL http://en.wikipedia.org/wiki/CentOS For RHEL, look down to Release History and RHEL 5.3 based on Linux-2.6.18, released March 2007. On the CentOS page you'll see it is dated April 2007. CentOS is identical to RHEL on purpose, but always 1 to 6 months behind RHEL, since they take the RHEL source, re-build it, and then re-test it. Linux is up to Linux-2.6.31.1 right now: http://www.kernel.org/ So any comparisons between operating system *distributions* should be fair. Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL / CentOS are basically out of the running right now, because they are so old. Some people call this stability. Stability can mean many things to people; in this case it does _not_ mean 'will it crash' type of stability. If you do not have a corporate standard distro, and your sysadmins are equally comfortable (or uncomfortable, and will get training) with all distros, then the next question to decide is what your support requirements are. Some companies insist on having a commercial support contract for anything that they run. If that is the case then you will run RHEL, SuSE Enterprise, Ubuntu (probably the long-term-support version), or _possibly_ Debian with a support contract from a consultant shop. The next layer (and I believe the more common case) is to not require a commercial support contract, but to require that any software you run be supported by the project/distro with security patches. In this case you need to look at the support timeframe and the release cycle. With Fedora, the support timeframe is 12 months with a release every 6 months. Since you cannot (and should not) upgrade all your production systems the day a new release comes out, this means that to constantly run a supported version you must upgrade every 6 months.
With Ubuntu, the support timeframe is 18 months with a release every 6 months. This requires that you upgrade every 12 months to stay supported. With the enterprise/long-term-support versions, the support timeframe is 5 years with a new release every 2-3 years. For these you will need to upgrade every new release, but you can wait a year or two after a release has been made before doing the upgrade. For Debian stable, the support timeframe is 1 year after the next release (which has historically had an unpredictable schedule; they are trying to shift to a 2-year cycle, but they haven't actually done it yet). This allows you to wait several months after a release before having to do an upgrade. Another question you have to answer in terms of 'support' is what the limitations are on replacing software that came with the distro with a newer version yourself. With the commercial support options the answer is usually 'if you upgrade something you void your support contract'. This can be a significant problem if the distro cycle is long and you need a new feature. Note that for the kernel a 'new feature' may be support for new hardware, so this can limit what hardware you can buy. If you insist on buying your hardware from HP/Dell/IBM/etc. this may not be too big a problem, as those hardware vendors also take a significant amount of time to 'certify' new things. For example, I have been looking for a new hardware vendor and just discovered that I cannot buy a system from HP/Dell/IBM that includes an SSD yet. In my case I run Debian stable on my servers, but identify 'important' packages that I will compile myself and keep up to date with the upstream project rather than running what Debian ships. The kernel is one such package (I don't necessarily run the latest kernel, but I watch closely for the vulnerabilities discovered, and if any are relevant to the configuration I have, I upgrade).
For dedicated database boxes Postgres is another such package (if a system is primarily used for something else and that package just needs a SQL database, I may stick with the distro default). David Lang
[PERFORM] Speed / Server
All: We have a web application which is growing ... fast. We're currently running on (1) quad-core Xeon 2.0GHz with a RAID-1 setup and 8GB of RAM. Our application collects a lot of sensor data, which means that we have 1 table which has about 8 million rows, and we're adding about 2.5 million rows per month. The problem is, this next year we're anticipating significant growth, where we may be adding more like 20 million rows per month (roughly 15GB of data). A row of data might have: The system identifier (int) Date/Time read (timestamp) Sensor identifier (int) Data Type (int) Data Value (double) The nasty part of this problem is that the data needs to be readily available for reports, and we cannot consolidate the data for reporting purposes. We generate real-time graphs from this data, usually running reports across multiple date/time ranges for any given system. Reports and graphs do not span more than 1 system, and we have indexes on the applicable columns. I know we need a LOT of RAM (as much as we can afford), and we're looking at a couple of Nehalem systems w/ a large, and fast, RAID-10 disk setup. So far, we're seeing some slowness in reading from our table - queries are in the seconds range. No issues, yet, with inserting volumes of data. Two questions: 1. Other than partitioning (by system, and/or date), and splitting up the data into multiple tables (by data type), what could be done within PostgreSQL to help with this type of setup (1 large table)? 2. Before going out and buying a beast of a system, we'd like to get some idea of performance on a high-end system. We may need to split this up, or move into some other type of architecture. Do you know of anyone who would let us play with a couple of systems to see what would be an applicable purchase? Thanks! -- Anthony
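For question 1, the standard approach on the PostgreSQL 8.x of this era is inheritance-based partitioning with CHECK constraints, so constraint_exclusion lets the planner skip months a query can't touch. A rough sketch under assumed names (sensor_data and its columns are invented to match the row description above):

```sql
CREATE TABLE sensor_data (
    system_id  int,
    read_at    timestamp,
    sensor_id  int,
    data_type  int,
    value      double precision
);

-- One child table per month; the CHECK constraint is what enables pruning
CREATE TABLE sensor_data_2009_10 (
    CHECK (read_at >= '2009-10-01' AND read_at < '2009-11-01')
) INHERITS (sensor_data);

CREATE INDEX sensor_data_2009_10_sys_time_idx
    ON sensor_data_2009_10 (system_id, read_at);

-- With constraint_exclusion = on, this scans only the October child
SET constraint_exclusion = on;
SELECT * FROM sensor_data
 WHERE system_id = 42
   AND read_at >= '2009-10-03' AND read_at < '2009-10-04';
```

Inserts have to be routed to the correct child, either by the application or by a trigger on the parent; old months can then be dropped or archived with a single DROP TABLE instead of a massive DELETE.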
Re: [PERFORM] Speed / Server
On Sun, Oct 4, 2009 at 4:45 PM, anth...@resolution.com wrote: All: We have a web-application which is growing ... fast. We're currently running on (1) quad-core Xeon 2.0Ghz with a RAID-1 setup, and 8GB of RAM. Our application collects a lot of sensor data, which means that we have 1 table which has about 8 million rows, and we're adding about 2.5 million rows per month. The problem is, this next year we're anticipating significant growth, where we may be adding more like 20 million rows per month (roughly 15GB of data). A row of data might have: The system identifier (int) Date/Time read (timestamp) Sensor identifier (int) Data Type (int) Data Value (double) The nasty part of this problem is that the data needs to be readily available for reports, and we cannot consolidate the data for reporting purposes. We generate real time graphs from this data, usually running reports across multiple date/time ranges for any given system. Reports and graphs do not span more than 1 system, and we have indexes on the applicable columns. I know we need a LOT of RAM (as much as we can afford), and we're looking at a couple of Nehalem systems w/ a large, and fast, RAID-10 disk set up. So far, we're seeing some slowness in reading from our table - queries are in the seconds range. No issues, yet, with inserting volumes of data. Two questions: 1. Other than partitioning (by system, and/or date), and splitting up the data into multiple tables (by data type), what could be done within Postgresql to help with this type of set up (1 large table)? 2. Before going out and buying a beast of a system, we'd like to get some idea of performance on a high-end system. We may need to split this up, or move into some other type of architecture. Do you know of anyone who would let us play with a couple of systems to see what would be an applicable purchase? Most of the producers of big bad database servers have a trial period you can try stuff out for. My supplier has something like a 30 day trial. 
I'm sure the bigger the system, the more they'd need to charge you for playing on it and then returning it. But you should plan on partitioning to multiple db servers up front and save the pain of conversion later on. A dual-socket motherboard with 16 to 32 SAS drives and a fast RAID controller is WAY cheaper than a similar machine with 4 to 8 sockets is gonna be. And if you gotta go there anyway, might as well spend your money on other stuff.
Re: [PERFORM] Best suiting OS
On Sun, Oct 4, 2009 at 8:05 AM, Mark Mielke m...@mark.mielke.cc wrote: On 10/01/2009 03:44 PM, Denis Lussier wrote: I'm a BSD license fan, but I don't know much about *BSD otherwise (except that many advocates say it runs PG very nicely). On the Linux side, unless you're a dweeb, go with a newer, popular, well-supported release for production. IMHO, that's RHEL 5.x or CentOS 5.x. Of course the latest SLES / Ubuntu schtuff are also fine. In other words, unless you've got a really good reason for it, stay away from Fedora / OpenSUSE for production usage. Lots of conflicting opinions and results in this thread. Also, a lot of hand waving and speculation. :-) RHEL and CentOS are particularly bad *right now*. See here: http://en.wikipedia.org/wiki/RHEL http://en.wikipedia.org/wiki/CentOS For RHEL, look down to Release History and RHEL 5.3 based on Linux-2.6.18, released March 2007. On the CentOS page you'll see it is dated April 2007. CentOS is identical to RHEL on purpose, but always 1 to 6 months behind RHEL, since they take the RHEL source, re-build it, and then re-test it. Linux is up to Linux-2.6.31.1 right now: http://www.kernel.org/ So any comparisons between operating system *distributions* should be fair. Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL / CentOS are basically out of the running right now, because they are so old. However, RHEL 6.0 should be out in January or February 2010, at which point it will be relevant again. It's completely fair. I have a CentOS 5.2 machine with 430 or so days of uptime. I put it online, tested it, and had it ready 430 days ago, and it's crashed / hung exactly zero times since. You're right. It's completely unfair to compare some brand-new kernel with unknown bugginess and stability issues to my 5.2 machine. Oh wait, you're saying CentOS is out of the running because it's old? That's 110% backwards from the way a DBA should be thinking. First make it stable, THEN look for ways to make it perform.
A DB server with stability issues is completely useless in a production environment. Personally, I use Fedora, and my servers have been quite stable. One of our main web servers running Fedora: It's not that there can't be stable releases of FC; it's that stability is not the focus of that project. So, if you get lucky, great! I can't imagine running a production DB on FC, with its short supported lifespan and focus on development rather than stability. [m...@bambi]~% uptime 09:45:41 up 236 days, 10:07, 1 user, load average: 0.02, 0.04, 0.08 It was last rebooted as a scheduled reboot, not a crash. This isn't to say Fedora will be stable for you - however, having used both RHEL and Fedora, in many different configurations, I find RHEL being 2+ years behind in terms of being patch-current means that it is NOT as stable on modern hardware. It is NOT 2+ years behind on patches. Any security issues or bugs that show up get patched. Performance-enhancing architectural changes get to wait for the next version (RHEL 6). Most recently, I installed RHEL on a 10-machine cluster of HP nodes, and RHEL would not detect the SATA controller out of the box and reverted to some base PIO mode yielding 2 MByte/s disk speed. Yes, again, RHEL is focused on making stable, already production-capable hardware stay up, and stay supported for 5 to 7 years. It's a different focus. Fedora was able to achieve 112 MByte/s on the same disks. Some twiddling of grub.conf allowed RHEL to achieve the same speed, but the point is that there are two ways to de-stabilize a kernel. One is to use the leading edge; the other is to use the trailing edge. Sorry, but your argument does not support this point. RHEL took twiddling to work with the SATA ports. Using an operating system designed for 2007 on hardware designed in 2009 is a bad idea. Depends on whether or not it's using the latest and greatest, or if RHEL has back-patched support for newer hardware.
Using RHEL on hardware that isn't officially supported is a bad idea. I agree about halfway here, but not really. I have brand new machines running Centos 5.2 with no problem. Using an operating system designed for 2009 on 2007 hardware might also be a bad idea. I think you have to go further back. Unlike Vista, Linux kernels tend to support older hardware for a very long time. Using a modern operating system on modern hardware that the operating system was designed for will give you the best performance potential. True. But you have to test it hard and prove it's reliable first, cause it really doesn't matter how fast it crashes. In this case, Fedora 11 with Linux 2.6.30.8 is almost guaranteed to out-perform RHEL 5.3 with Linux 2.6.18. Where Fedora is within 1 to 6 months of the leading edge, Ubuntu is within 3 to 9 months of the leading edge, so Ubuntu will perform more similar to Fedora than RHEL. And on more exotic hardware
Re: [PERFORM] Bad performance of SELECT ... where id IN (...)
Hi Xia, Try this patch: http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch It's a hack, but it works for us. I think you're probably spending most of your query time planning, and this patch helps speed things up 10x over here. Regards, Omar On Sun, Sep 27, 2009 at 5:13 PM, Xia Qingran qingran@gmail.com wrote: On Sat, Sep 26, 2009 at 10:59 PM, Craig James craig_ja...@emolecules.com wrote: If your user_id is always in a narrow range like this, or even in any range that is a small fraction of the total, then add a range condition, like this: select * from event where user_id <= 500 and user_id >= 0 and user_id in (...) I did this exact same thing in my application and it worked well. Craig It is a good idea. But in my application, most of the queries' user_id values are random and difficult to bound by a range. Thanks anyway. -- 夏清然 Xia Qingran qingran@gmail.com Sent from Beijing, 11, China Charles de Gaulle - The better I get to know men, the more I find myself loving dogs. - http://www.brainyquote.com/quotes/authors/c/charles_de_gaulle.html
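Reading Craig's suggestion with the comparison operators restored, the point is to bracket the IN list with a range predicate so the planner can anchor an index range scan rather than costing each listed value separately. A sketch, assuming an index on event(user_id):

```sql
-- The enclosing range gives the planner a cheap index range to start from;
-- the IN list then filters within it.
SELECT *
  FROM event
 WHERE user_id >= 0
   AND user_id <= 500
   AND user_id IN (7, 42, 101, 250, 499);
```

As Xia notes, this only pays off when the listed ids cluster into a narrow range; if they are uniformly random, the bracketing range spans the whole table and adds nothing.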
Maybe OT, not sure Re: [PERFORM] Best suiting OS
This is kind of OT, unless somebody really is concerned with understanding the + and - of distributions, and is willing to believe the content of this thread as being accurate and objective... :-) On 10/04/2009 08:42 PM, Scott Marlowe wrote: On Sun, Oct 4, 2009 at 8:05 AM, Mark Mielke m...@mielke.cc wrote: So any comparisons between operating system *distributions* should be fair. Comparing a 2007 release to a 2009 release, for example, is not fair. RHEL / CentOS are basically out of the running right now, because they are so old. However, RHEL 6.0 should be out in January or February 2010, at which point it will be relevant again. It's completely fair. I have a CentOS 5.2 machine with 430 or so days of uptime. I put it online, tested it, and had it ready 430 days ago, and it's crashed / hung exactly zero times since. You're right. It's completely unfair to compare some brand-new kernel with unknown bugginess and stability issues to my 5.2 machine. Oh wait, you're saying CentOS is out of the running because it's old? That's 110% backwards from the way a DBA should be thinking. First make it stable, THEN look for ways to make it perform. A DB server with stability issues is completely useless in a production environment. Maybe - if the only thing the server is running is PostgreSQL. Show of hands - how many users ONLY install PostgreSQL, on a bare-minimum OS install, choosing not to run any other software? Now, how many people ALSO run things like PHP, and require software more up to date than 3 years old? Personally, I use Fedora, and my servers have been quite stable. One of our main web servers running Fedora: It's not that there can't be stable releases of FC; it's that stability is not the focus of that project. So, if you get lucky, great! I can't imagine running a production DB on FC, with its short supported lifespan and focus on development rather than stability. Depends on requirements. If the application is frozen in time and doesn't change - sure.
If the application keeps evolving and benefits from new base software, then requiring an upgrade every 12 months or more often out of *application* requirements (not even counting OS support requirements) may not be unusual. In any case - I'm not telling you to use Fedora. I'm telling you that I use Fedora, and that RHEL 5 is too old from an application-software perspective for anybody with a requirement on more than a handful of base OS packages. [m...@bambi]~% uptime 09:45:41 up 236 days, 10:07, 1 user, load average: 0.02, 0.04, 0.08 It was last rebooted as a scheduled reboot, not a crash. This isn't to say Fedora will be stable for you - however, having used both RHEL and Fedora, in many different configurations, I find RHEL being 2+ years behind in terms of being patch-current means that it is NOT as stable on modern hardware. It is NOT 2+ years behind on patches. Any security issues or bugs that show up get patched. Performance-enhancing architectural changes get to wait for the next version (RHEL 6). Not true. They only backport things considered important. BIND might be updated. Firefox might be updated. Most packages see no updates. Some packages see many updates. As I said - the kernel has something like 3000 patches applied against it (although that's a small subset of all of the changes made to the upstream kernel). It is not true that any security issue or bug that shows up gets patched. *Some* security issues or bugs that show up get patched. If they backported everything, they wouldn't have a stable release. Also, they cross this line - performance-enhancing architectural changes *are* made, but only if there is sufficient demand from the customer base. XFS, EXT4, and FUSE made it into RHEL 5.4. Even so, plenty of open source software is difficult or impossible to compile for RHEL 5 without re-compiling base packages or bringing them in from another source.
Try compiling Subversion 1.6.5 with GNOME keyring support on RHEL 5.3 - that was the last one that busted us. In fact, this one is still open for us. Most recently, I installed RHEL on a 10 machine cluster of HP nodes, and RHEL would not detect the SATA controller out of the box and reverted to some base PIO mode yielding 2 Mbyte/s disk speed. Yes, again, RHEL is focused on making stable, already production capable hardware stay up, and stay supported for 5 to 7 years. It's a different focus. Yes, exactly. Which is why a new deployment should align against the RHEL release. Deploying RHEL 5 today, when RHEL 5 is 3 years old, and RHEL 6 is coming out in 4 months, means that your RHEL 5 install is not going to have 5 to 7 years of life starting today. Half the support life has almost elapsed at this point. Fedora was able to achieve 112 Mbyte/s on the same disks. Some twiddling of grub.conf allowed RHEL to achieve the same speed, but the point is that there are two ways