Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Carlo Stonebanks wrote:
>> effective_cache_size should be set to how much memory is leftover for
>> disk caching after taking into account what's used by the operating
>> system, dedicated PostgreSQL memory, and other applications. If it's
>> set too low, indexes may not be used for executing queries the way
>> you'd expect. Setting effective_cache_size to 1/2 of total memory
>> would be a normal conservative setting. You might find a better
>> estimate by looking at your operating system's statistics. On
>> UNIX-like systems, add the free+cached numbers from free or top. On
>> Windows see the "System Cache" in the Windows Task Manager's
>> Performance tab.
>
> Are these values to look at BEFORE starting PG? If so, how do I relate
> the values returned to setting the effective_cache_size values?

After starting the database. You can set effective_cache_size to a size
in megabytes, so basically you'd look at the amount of free cache, maybe
round down a bit, and set effective_cache_size to exactly that. It's not
super important to get the number right. The point is that the default
is going to be a tiny number way smaller than the RAM in your system,
and even getting it within a factor of 2 or 3 of reality will radically
change some types of query plans.

> PS Loved your 1995 era pages. Being a musician, it was great to read
> your recommendations on how to buy these things called CD's. I Googled
> the term, and they appear to be some ancient precursor to MP3s which
> people actually PAID for. What kind of stone were they engraved on? ;-D

They're plastic, just like the iPod, iPhone, iToilet, or whatever other
white plastic Apple products people listen to music on in this new era.
Since both my CD collection and the stereo I listen to them on are each
individually worth more than my car, it's really tough to sell me on all
the terrible-sounding MP3s I hear nowadays. I'm the guy who can tell you
how the LP, regular CD, gold CD, and SACD/DVD-A for albums I like all
compare, so dropping below CD quality is right out.

If you ever find yourself going "hey, I wish I had six different
versions of 'Dark Side of the Moon' around so I could compare the subtle
differences in the mastering and mix on each of them," I'm your guy.

-- 
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
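[Editor's note: the free+cached arithmetic described above can be sketched as follows. The helper name and the sample numbers are illustrative, not taken from Carlo's server.]

```python
# Estimate effective_cache_size from OS memory statistics, per the
# advice above: add the "free" and "cached" figures, then round down
# a bit. Sample numbers below are illustrative only.

def effective_cache_size_mb(free_mb, cached_mb, round_to_mb=256):
    """Return a conservative effective_cache_size, in megabytes."""
    total = free_mb + cached_mb
    # Round down to a multiple of round_to_mb ("maybe round down a bit").
    return (total // round_to_mb) * round_to_mb

# e.g. a 48 GB host with ~2 GB free and ~40 GB in the page cache:
setting = effective_cache_size_mb(free_mb=2048, cached_mb=40960)
print(f"effective_cache_size = {setting}MB")  # -> effective_cache_size = 43008MB
```

As Greg notes, precision doesn't matter much here; getting within a factor of 2 or 3 of reality is the point.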
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Hi Greg,

As a follow up to this suggestion:

> I don't see effective_cache_size listed there. If that's at the
> default, I wouldn't be surprised that you're seeing sequential scans
> instead of indexed ones far too often.

I found an article written by you
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and
thought this was pretty useful, and especially this comment:

> effective_cache_size should be set to how much memory is leftover for
> disk caching after taking into account what's used by the operating
> system, dedicated PostgreSQL memory, and other applications. If it's
> set too low, indexes may not be used for executing queries the way
> you'd expect. Setting effective_cache_size to 1/2 of total memory
> would be a normal conservative setting. You might find a better
> estimate by looking at your operating system's statistics. On
> UNIX-like systems, add the free+cached numbers from free or top. On
> Windows see the "System Cache" in the Windows Task Manager's
> Performance tab.

Are these values to look at BEFORE starting PG? If so, how do I relate
the values returned to setting the effective_cache_size values?

Carlo

PS Loved your 1995 era pages. Being a musician, it was great to read
your recommendations on how to buy these things called CD's. I Googled
the term, and they appear to be some ancient precursor to MP3s which
people actually PAID for. What kind of stone were they engraved on? ;-D
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Scott Marlowe escribió:
> On Thu, Jan 14, 2010 at 12:17 PM, Carlo Stonebanks
> stonec.regis...@sympatico.ca wrote:
>> 4) Is this the right PG version for our needs?
>
> 8.3 is very stable. Update to the latest. 8.4 seems good, but I've
> had, and still am having, problems with it crashing in production.
> Not often, maybe once every couple of months, but just enough that
> I'm not ready to try and use it there yet. And I can't force the same
> failure in testing, at least not yet.

Uh. Is there a report of the crash somewhere with details, say stack
traces and such?

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, Jan 21, 2010 at 8:51 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
> Uh. Is there a report of the crash somewhere with details, say stack
> traces and such?

No, the only server that does this is in production as our stats db and
when it happens it usually gets restarted immediately. It does this
about once every two months. Do the PGDG releases have debugging symbols
and what not? I'll see about having a stack trace ready to run for the
next time it does this.
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Scott Marlowe escribió:
> No, the only server that does this is in production as our stats db
> and when it happens it usually gets restarted immediately. It does
> this about once every two months. Do the PGDG releases have debugging
> symbols and what not? I'll see about having a stack trace ready to
> run for the next time it does this.

You mean the RPMs? Yes, I think Devrim publishes debuginfo packages
which you need to install separately.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, Jan 21, 2010 at 9:44 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
> You mean the RPMs? Yes, I think Devrim publishes debuginfo packages
> which you need to install separately.

Well crap, this one was built from source, and not with debugging. Gimme
a day or so and I'll have it rebuilt with debug and can run a useful
backtrace on it.
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, 2010-01-21 at 13:44 -0300, Alvaro Herrera wrote:
> I think Devrim publishes debuginfo packages which you need to install
> separately.

Right.

-- 
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org   Twitter: http://twitter.com/devrimgunduz
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
> yeah, the values are at the end. Sounds like your vacuum settings are
> too non-aggressive. Generally this is the vacuum cost delay being too
> high.

Of course, I have to ask: what's the down side?

> Yes! You can run vacuum verbose against the regular old postgres
> database (or just create one for testing with nothing in it) and
> you'll still get the fsm usage numbers from that! So, no need to run
> it against the big db. However, if regular vacuum verbose couldn't
> finish in a week, then you've likely got vacuum and autovacuum set to
> be too timid in their operation, and may be getting pretty bloated as
> we speak.

Once the fsm gets too blown out of the water, it's quicker to dump and
reload the whole DB than to try and fix it. My client reports this is
what they actually do on a monthly basis.

And the numbers are in:

NOTICE: number of page slots needed (4090224) exceeds max_fsm_pages (204800)
HINT: Consider increasing the configuration parameter "max_fsm_pages" to
a value over 4090224.

Gee, only off by a factor of 20. What happens if I go for this number
(once again, what's the down side)?

Carlo
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Carlo Stonebanks stonec.regis...@sympatico.ca wrote:
>> yeah, the values are at the end. Sounds like your vacuum settings
>> are too non-aggressive. Generally this is the vacuum cost delay
>> being too high.
>
> Of course, I have to ask: what's the down side?

If you make it too aggressive, it could impact throughput or response
time. Odds are that the bloat from having it not aggressive enough is
currently having a worse impact.

> Once the fsm gets too blown out of the water, it's quicker to dump
> and reload the whole DB than to try and fix it. My client reports
> this is what they actually do on a monthly basis.

They probably won't need to do that with proper configuration and
vacuum policies.

> NOTICE: number of page slots needed (4090224) exceeds max_fsm_pages
> (204800)
> HINT: Consider increasing the configuration parameter "max_fsm_pages"
> to a value over 4090224.
>
> Gee, only off by a factor of 20. What happens if I go for this number
> (once again, what's the down side)?

It costs six bytes of shared memory per entry.
http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-FSM

-Kevin
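[Editor's note: at six bytes of shared memory per page slot, the cost of the suggested increase is easy to work out. The helper below is an illustrative sketch, not part of the original exchange.]

```python
# Shared-memory cost of max_fsm_pages on 8.3, at six bytes per page
# slot (per the documentation link Kevin cites above).
BYTES_PER_SLOT = 6

def fsm_shared_memory_mb(max_fsm_pages):
    """Shared memory consumed by the free space map, in megabytes."""
    return max_fsm_pages * BYTES_PER_SLOT / (1024 * 1024)

# Carlo's current setting vs. the value the HINT asks for:
print(round(fsm_shared_memory_mb(204800), 1))   # -> 1.2  (current setting)
print(round(fsm_shared_memory_mb(4090224), 1))  # -> 23.4 (value from the HINT)
```

So the "down side" of honoring the HINT is on the order of 23 MB of shared memory, which is small next to the 500 MB already allocated to shared_buffers in this configuration.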
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Wed, Jan 20, 2010 at 3:03 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
> Once the fsm gets too blown out of the water, it's quicker to dump
> and reload the whole DB than to try and fix it. My client reports
> this is what they actually do on a monthly basis.

Something is deeply wrong with your client's vacuuming policies.

...Robert
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Hi Scott,

Sorry for the very late reply on this post, but I'd like to follow up.
The reason that I took so long to reply was due to this suggestion:

> Run vacuum verbose to see if you're overrunning the max_fsm_pages
> settings or the max_fsm_relations.

My first thought was, does he mean against the entire DB? That would
take a week! But, since it was recommended, I decided to see what would
happen. So, I just ran VACUUM VERBOSE. After five days, it was still
vacuuming and the server admin said they needed to bounce the server,
which means the command never completed (I kept the log of the progress
so far, but don't know if the values you needed would appear at the
end). I confess I have no idea how to relate the INFO and DETAIL data
coming back with regards to the max_fsm_pages settings or the
max_fsm_relations.

So, now my questions are:

1) Did you really mean you wanted VACUUM VERBOSE to run against the
entire DB?
2) Given my previous comments on the size of the DB (and my thinking
that this is an exceptionally large and busy DB) were you expecting it
to take this long?
3) I took no exceptional measures before running it, I didn't stop the
automated import processes, I didn't turn off autovacuum. Would this
have accounted for the time it is taking to THAT degree?
4) Any other way to get max_fsm_pages settings and max_fsm_relations?

Carlo
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Tue, Jan 19, 2010 at 2:09 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
> My first thought was, does he mean against the entire DB? That would
> take a week! But, since it was recommended, I decided to see what
> would happen. So, I just ran VACUUM VERBOSE. After five days, it was
> still vacuuming and the server admin said they needed to bounce the
> server, which means the command never completed. I confess I have no
> idea how to relate the INFO and DETAIL data coming back with regards
> to the max_fsm_pages settings or the max_fsm_relations.

yeah, the values are at the end. Sounds like your vacuum settings are
too non-aggressive. Generally this is the vacuum cost delay being too
high.

> So, now my questions are:
> 1) Did you really mean you wanted VACUUM VERBOSE to run against the
> entire DB?

Yes. A whole db at least. However...

> 2) Given my previous comments on the size of the DB (and my thinking
> that this is an exceptionally large and busy DB) were you expecting
> it to take this long?

Yes, I was figuring it would be a while. However...

> 3) I took no exceptional measures before running it, I didn't stop
> the automated import processes, I didn't turn off autovacuum. Would
> this have accounted for the time it is taking to THAT degree?

Nah, not really. However...

> 4) Any other way to get max_fsm_pages settings and max_fsm_relations?

Yes! You can run vacuum verbose against the regular old postgres
database (or just create one for testing with nothing in it) and you'll
still get the fsm usage numbers from that! So, no need to run it against
the big db.

However, if regular vacuum verbose couldn't finish in a week, then
you've likely got vacuum and autovacuum set to be too timid in their
operation, and may be getting pretty bloated as we speak. Once the fsm
gets too blown out of the water, it's quicker to dump and reload the
whole DB than to try and fix it.
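[Editor's note: Scott's trick can be sketched in a couple of lines. This assumes an 8.3-era cluster, where the free space map is a single cluster-wide structure in shared memory.]

```sql
-- Run this connected to any small database in the same cluster, e.g.
-- the built-in "postgres" database. On 8.3 the free space map is
-- cluster-wide, so the trailing NOTICE/HINT lines report totals for
-- the whole cluster without having to vacuum the big database:
VACUUM VERBOSE;
```

On 8.4 and later the FSM moved on-disk and max_fsm_pages no longer exists, so this check only applies to the versions discussed in this thread.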
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
On Thu, 14 Jan 2010 16:35:53 -0600 Dave Crooke dcro...@gmail.com wrote:
> For any given database engine, regardless of the marketing and
> support stance, there is only one true primary enterprise OS platform
> that most big mission critical sites use, and is the best supported
> and most stable platform for that RDBMS. For Oracle, that's HP-UX
> (but 10 years ago, it was Solaris). For PostgreSQL, it's Linux.

I am interested in this response and am wondering if this is just Dave's
opinion or some sort of official PostgreSQL policy. I am learning
PostgreSQL by running it on FreeBSD 8.0-STABLE. So far I have found no
problems and have even read a few posts that are critical of Linux's
handling of fsync.

I really don't want to start a Linux vs FreeBSD flame war (I like Linux
and use that too, though not for database use); I am just intrigued by
the claim that Linux is somehow the natural OS for running PostgreSQL. I
think if Dave had said "for PostgreSQL, it's a variant of Unix" I
wouldn't have been puzzled. So I suppose the question is: what is it
about Linux specifically (as contrasted with other Unix-like OSes,
especially Open Source ones) that makes it particularly suitable for
running PostgreSQL?

Best,
Tony
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
On Fri, Jan 15, 2010 at 8:10 AM, Tony McC af...@btinternet.com wrote:
>> most stable platform for that RDBMS. For Oracle, that's HP-UX (but
>> 10 years ago, it was Solaris). For PostgreSQL, it's Linux.
>
> I am interested in this response and am wondering if this is just
> Dave's opinion or some sort of official PostgreSQL policy. I really
> don't want to start a Linux vs FreeBSD flame war (I like Linux and
> use that too, though not for database use), I am just intrigued by
> the claim that Linux is somehow the natural OS for running
> PostgreSQL.

I would wager that this response is a tad flame-bait-ish.

-- 
Regards,
Richard Broersma Jr.
Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
On Fri, Jan 15, 2010 at 11:10 AM, Tony McC af...@btinternet.com wrote:
> what is it about Linux specifically (as contrasted with other
> Unix-like OSes, especially Open Source ones) that makes it
> particularly suitable for running PostgreSQL?

Nothing that I know of.

...Robert
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, Jan 14, 2010 at 8:17 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:
> . 48 GB RAM
> 2) Which Windows OS would you recommend? (currently 2008 x64 Server)

There is no 64-bit Windows build right now - you would be limited to
shared_buffers of about a gigabyte. Choose Linux.

Greetings
Marcin Mańk
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
Richard Broersma richard.broer...@gmail.com writes:
> On Fri, Jan 15, 2010 at 8:10 AM, Tony McC af...@btinternet.com wrote:
>> I am interested in this response and am wondering if this is just
>> Dave's opinion or some sort of official PostgreSQL policy. I really
>> don't want to start a Linux vs FreeBSD flame war (I like Linux and
>> use that too, though not for database use), I am just intrigued by
>> the claim that Linux is somehow the natural OS for running
>> PostgreSQL.
>
> I would wager that this response is a tad flame-bait-ish.

Indeed. It's certainly not project policy. Given the Linux kernel
hackers' apparent disinterest in fixing their OOM kill policy or making
write barriers work well (or at all, with LVM), I think arguing that
Linux is the best database platform requires a certain amount of
suspension of disbelief.

regards, tom lane
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
Tom Lane wrote:
> Given the Linux kernel hackers' apparent disinterest in fixing their
> OOM kill policy or making write barriers work well (or at all, with
> LVM), I think arguing that Linux is the best database platform
> requires a certain amount of suspension of disbelief.

Don't forget the general hostility toward how the database allocates
shared memory on that list too. I was suggesting Linux as being the best
in the context of consistently having up to date packages that install
easily if you can use the PGDG yum repo, since that was a specific
request. The idea that Linux is somehow the preferred platform from
PostgreSQL is pretty weird; it's just a popular one, and has plenty of
drawbacks.

I think it's certainly the case that you have to enter into using
PostgreSQL with Linux with the understanding that you only use the most
basic and well understood parts of the OS. Filesystem other than ext3?
Probably buggy, may get corrupted. Using the latest write-barrier code
rather than the most basic fsync approach? Probably buggy, may get
corrupted. Using LVM instead of simple partitions? Probably going to
perform badly, maybe buggy and get corrupted too. Assuming software RAID
can replace a hardware solution with a battery-backed write cache? Never
happen. There's a narrow Linux setup for PostgreSQL that works well for
a lot of people, but some days it does feel like that's in spite of the
priorities of the people working on the Linux kernel.

-- 
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.com
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
On Fri, Jan 15, 2010 at 11:28 AM, Greg Smith g...@2ndquadrant.com wrote:
> There's a narrow Linux setup for PostgreSQL that works well for a lot
> of people, but some days it does feel like that's in spite of the
> priorities of the people working on the Linux kernel.

As someone who uses Linux to run postgresql dbs, I tend to agree. It's
not quite alchemy or anything, but there are very real caveats to be
aware of when using linux as the OS for postgresql to run on top of. I
will say that XFS seems to be a very stable file system, and we use it
for some of our databases with no problems at all. But most of our stuff
sits on ext3 because it's stable and reliable and fast enough.

Each OS has some warts when it comes to running pg on it. Could be a
narrower selection of hardware drivers, buggy locale support, or iffy
kernel behaviour when lots of memory is allocated. And most have a way
to work around those issues as long as you're careful what you're doing.
If you're familiar with one OS and its warts, you're more likely to be
bitten by the warts of another OS that's new to you, no matter how good
it is. And as always, test the crap outta your setup, cause the time to
find problems is before you put a machine into production.
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
Scott Marlowe scott.marl...@gmail.com wrote:
> I will say that XFS seems to be a very stable file system, and we use
> it for some of our databases with no problems at all. But most of our
> stuff sits on ext3 because it's stable and reliable and fast enough.

Our PostgreSQL data directories are all on xfs, with everything else
(OS, etc.) on ext3. We've been happy with it, as long as we turn off
write barriers, which is only safe with a RAID controller with BBU
cache.

> And as always, test the crap outta your setup, cause the time to find
> problems is before you put a machine into production.

Absolutely.

-Kevin
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
This is the second time I've heard that PG shared buffer on Windows
doesn't matter ... I'd like to understand the reasoning behind that
claim, and why it differs from other DB servers.

> though that's much less important for Pg than for most other things,
> as Pg uses a one-process-per-connection model and lets the OS handle
> much of the caching. So long as the OS can use all that RAM for
> caching, Pg will benefit, and it's unlikely you need 2GB for any
> given client connection or for the postmaster.

Any DB software would benefit from the OS buffer cache, but there is
still the overhead of copying that data into the shared buffer area, and
as necessary unpacking it into in-memory format. Oracle uses a more or
less identical process and memory model to PG, and for sure you can't
have too much SGA with it.

> It's nice to have the flexibility to push up shared_buffers, and it'd
> be good to avoid any overheads in running 32-bit code on win64.
> However, it's not that unreasonable to run a 32-bit Pg on a 64-bit OS
> and expect good performance.

My reasoning goes like this:

a. there is a significant performance benefit to using a large
proportion of memory as in-process DB server cache instead of OS level
block / filesystem cache
b. the only way to do so on modern hardware (i.e. more than 4GB) is
with a 64-bit binary
c. therefore, a 64-bit binary is essential

You're the second person that's said a. is only a "nice to have" with
PG ... what makes the difference?

Cheers
Dave
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
Dave Crooke wrote:
> My reasoning goes like this:
> a. there is a significant performance benefit to using a large
> proportion of memory as in-process DB server cache instead of OS
> level block / filesystem cache
> b. the only way to do so on modern hardware (i.e. more than 4GB) is
> with a 64-bit binary
> c. therefore, a 64-bit binary is essential
>
> You're the second person that's said a. is only a "nice to have" with
> PG ... what makes the difference?

The PostgreSQL model presumes that it's going to be cooperating with the
operating system cache. In a default config, all reads and writes go
through the OS cache. You can get the WAL writes to be written in a way
that bypasses the OS cache, but even that isn't the default. This makes
PostgreSQL's effective cache size equal to shared_buffers *plus* the OS
cache. This is why Windows can perform OK even without having a giant
amount of dedicated RAM; it just leans on the OS more heavily instead.
That's not as efficient, because you're shuffling more things between
shared_buffers and the OS than you would on a UNIX system, but it's
still way faster than going all the way to disk for something.

On, say, a system with 16GB of RAM, you can set up Windows to use 256MB
of shared_buffers, and expect that you'll find at least another 14GB or
so of data cached by the OS. The reasons why Windows is particularly
unappreciative of being allocated memory directly aren't well
understood. But the basic property that shared_buffers is not the only
source, or even the largest source, of caching is not unique to that
platform.

> Oracle uses a more or less identical process and memory model to PG,
> and for sure you can't have too much SGA with it.

The way data goes in and out of Oracle's SGA is often via direct I/O
instead of even touching the OS read/write cache. That's why the
situation is so different there. If you're on an Oracle system, and you
need to re-read a block that was recently evicted from the SGA, it's
probably going to be read from disk. In the same situation with
PostgreSQL, it's likely you'll find it's still in the OS cache.

-- 
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.com
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
Dave Crooke dcro...@gmail.com writes:
> This is the second time I've heard that PG shared buffer on Windows
> doesn't matter ... I'd like to understand the reasoning behind that
> claim, and why it differs from other DB servers.

AFAIK we don't really understand why, but the experimental evidence is
that increasing shared_buffers to really large values doesn't help much
on Windows. You can probably find more in the archives. I'm not sure
that this has been retested recently, so it might be obsolete
information, but it's what we've got.

regards, tom lane
[PERFORM] New server to improve performance on our large and busy DB - advice?
My client just informed me that new hardware is available for our DB server.

. Intel Core 2 Quads Quad
. 48 GB RAM
. 4 Disk RAID drive (RAID level TBD)

I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are fewer than 10 users, as the server is devoted to the KB production system.

My questions:

1) Which RAID level would you recommend?
2) Which Windows OS would you recommend? (currently 2008 x64 Server)
3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!)
4) Is this the right PG version for our needs?

Thanks, Carlo

The details of our use:

. The DB hosts a data warehouse and a knowledgebase (KB) tracking the professional information of 1.3M individuals.
. The KB tables related to these 130M individuals are naturally also large
. The DB is in a perpetual state of serving TCL-scripted Extract, Transform and Load (ETL) processes
. These ETL processes typically run 10 at-a-time (i.e. in parallel)
. We would like to run more, but the server appears to be the bottleneck
. The ETL write processes are 99% single-row UPDATEs or INSERTs.
. There are few, if any, DELETEs
. The ETL source data are import tables
. The import tables are permanently kept in the data warehouse so that we can trace the original source of any information.
. There are 6000+ and counting
. The import tables number from dozens to hundreds of thousands of rows. They rarely require more than a pkey index.
. Linking the KB to the source import date requires an audit table of 500M rows, and counting.
. The size of the audit table makes it very difficult to manage, especially if we need to modify the design.
. Because we query the audit table in different ways to audit the ETL processes' decisions, almost every column in the audit table is indexed.
. The maximum number of physical users is 10, and these users RARELY perform any kind of write
. By contrast, the 10+ ETL processes are writing constantly
. We find that internal stats drift, for whatever reason, causing row seq scans instead of index scans.
. So far, we have never seen a situation where a seq scan has improved performance, which I would attribute to the size of the tables
. We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice.

Current non-default conf settings are:

autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 250
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 500
bgwriter_lru_maxpages = 100
checkpoint_segments = 64
checkpoint_warning = 290
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
log_destination = 'stderr'
log_line_prefix = '%t '
logging_collector = on
maintenance_work_mem = 16MB
max_connections = 200
max_fsm_pages = 204800
max_locks_per_transaction = 128
port = 5432
shared_buffers = 500MB
vacuum_cost_delay = 100
work_mem = 512MB
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, 14 Jan 2010 14:17:13 -0500, Carlo Stonebanks stonec.regis...@sympatico.ca wrote: My client just informed me that new hardware is available for our DB server. . Intel Core 2 Quads Quad . 48 GB RAM . 4 Disk RAID drive (RAID level TBD) I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are less than 10 users, as the server is devoted to the KB production system. My questions: 1) Which RAID level would you recommend 10 2) Which Windows OS would you recommend? (currently 2008 x64 Server) If you have to run Windows... that works. 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) Community driven: Debian Stable, CentOS 5. Commercial: Ubuntu LTS, RHEL 5. 4) Is this the right PG version for our needs? You want to run at least the latest stable 8.3 series, which I believe is 8.3.9. With the imminent release of 8.5 (6 months out), it may be time to move to 8.4.2 instead. Joshua D. Drake -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
On Thu, Jan 14, 2010 at 12:17 PM, Carlo Stonebanks stonec.regis...@sympatico.ca wrote: My client just informed me that new hardware is available for our DB server. . Intel Core 2 Quads Quad . 48 GB RAM . 4 Disk RAID drive (RAID level TBD) I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are less than 10 users, as the server is devoted to the KB production system. My questions: 1) Which RAID level would you recommend RAID-10 with a battery backed hardware caching controller. 2) Which Windows OS would you recommend? (currently 2008 x64 Server) That's probably the most stable choice out there for Windows. 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) I'd parrot what Joshua Drake said here. Centos / RHEL / Debian / Ubuntu 4) Is this the right PG version for our needs? 8.3 is very stable. Update to the latest. 8.4 seems good, but I've had, and still am having, problems with it crashing in production. Not often, maybe once every couple of months, but just enough that I'm not ready to try and use it there yet. And I can't force the same failure in testing, at least not yet. The details of our use: . These ETL processes typically run 10 at-a-time (i.e. in parallel) . We would like to run more, but the server appears to be the bottleneck . The ETL write processes are 99% single row UPDATEs or INSERTs. Can you run the ETL processes in such a way that they can do many inserts and updates at once? That would certainly speed things up a bit. . The size of the audit table makes it very difficult to manage, especially if we need to modify the design. You might want to look into partitioning / inheritance if that would help. . 
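Scott's suggestion to make the ETL processes do many inserts at once can be sketched as batching single rows into multi-row INSERT statements, so each round-trip to the server does more work. The table and column names below are hypothetical, and real code should use parameterized queries (e.g. a driver's batch facilities) rather than the naive string interpolation shown here:

```python
# Sketch: group single-row INSERTs into multi-row statements.
# Hypothetical table/column names; repr()-based quoting is for
# illustration only -- use parameterized queries in production.

def batch_inserts(table, columns, rows, batch_size=100):
    """Yield multi-row INSERT statements covering all rows."""
    col_list = ", ".join(columns)
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        values = ", ".join(
            "(" + ", ".join(repr(v) for v in row) + ")" for row in chunk
        )
        yield "INSERT INTO %s (%s) VALUES %s;" % (table, col_list, values)

rows = [(n, "name-%d" % n) for n in range(250)]
stmts = list(batch_inserts("import_batch", ("id", "label"), rows))
print(len(stmts))  # 250 rows in batches of 100 -> 3 statements
```

Wrapping each batch (or several batches) in a single transaction also avoids paying a commit per row.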
Because we query the audit table different ways to audit the ETL processes decisions, almost every column in the audit table is indexed. This may or may not help. If you're querying it and the part in the where clause referencing this column isn't very selective, an index won't be chosen anyway. If you've got multiple columns in your where clause, the more selective ones will use an index and the rest will get filtered out instead of using an index. Look in pg_stat_user_indexes for indexes that don't get used and drop them unless, of course, they're unique indexes. . The maximum number of physical users is 10 and these users RARELY perform any kind of write . By contrast, the 10+ ETL processes are writing constantly You may be well served by having two servers, one to write to, and a slave that is used by the actual users. Our slony slaves have a much easier time writing out their data than our master database does. . We find that internal stats drift, for whatever reason, causing row seq scans instead of index scans. Yeah, this is a known problem on heavily updated tables and recent entries. Cranking up autovacuum a bit can help, but often it requires special treatment, either by adjusting the autovac analyze threshold values for just those tables, or running manual analyzes every couple of minutes. . So far, we have never seen a situation where a seq scan has improved performance, which I would attribute to the size of the tables Not so much the size of the tables as the size of the request. If you were running aggregates across whole large tables, a seq scan would definitely be the way to go. If you're asking for one row, an index scan should win. Somewhere between those two, when you get up to hitting some decent percentage of the rows, the switch from index scan to seq scan makes sense, and it's likely happening too early for you. Look at random_page_cost and effective_cache_size for starters.
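The crossover Scott describes can be illustrated with back-of-envelope planner arithmetic. This is a deliberately crude model (it ignores caching, CPU costs, and index correlation), and the table size and rows-per-page figures are made up, but it shows why lowering random_page_cost moves the index-to-seq-scan switch toward larger fractions of the table:

```python
# Crude cost model: compare a full sequential scan against an index scan
# that pays one random page read per fetched row (worst case).

seq_page_cost = 1.0
table_pages = 100_000            # hypothetical table size, ~100 rows/page

def seq_scan_cost():
    return table_pages * seq_page_cost

def index_scan_cost(fraction_of_rows, random_page_cost):
    rows_fetched = table_pages * 100 * fraction_of_rows
    return rows_fetched * random_page_cost

# With the default random_page_cost = 4, the index loses early; lowering
# it (sensible when data is mostly cached) delays the crossover.
for rpc in (4.0, 1.5):
    frac = 0.0001
    while index_scan_cost(frac, rpc) < seq_scan_cost():
        frac *= 2
    print("random_page_cost=%.1f: seq scan wins above ~%.2f%% of rows"
          % (rpc, frac * 100))
```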
We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice. See previous comment I made up there ^^^ It's not about always using indexes, it's about giving the planner the information it needs to make the right choice. Current non-default conf settings are: autovacuum = on autovacuum_analyze_scale_factor = 0.1 You might wanna lower the analyze scale factor if you're having problems with bad query plans on fresh data.
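One way to act on the analyze-scale-factor comment above is to make autovacuum analyze churning tables sooner. An illustrative postgresql.conf fragment (values are examples only; note that on 8.3 the per-table equivalents live in the pg_autovacuum system catalog, while 8.4+ accepts them as ALTER TABLE storage parameters):

```ini
# Illustrative only -- analyze sooner on heavily-updated tables so fresh
# rows get statistics before queries hit them.
autovacuum_analyze_scale_factor = 0.02   # default 0.1; re-analyze after ~2% churn
autovacuum_analyze_threshold = 250
autovacuum_naptime = 15s                 # check tables more often than the 1min default
```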
[PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
My client just informed me that new hardware is available for our DB server. . Intel Core 2 Quads Quad . 48 GB RAM . 4 Disk RAID drive (RAID level TBD) I have put the ugly details of what we do with our DB below, as well as the postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB with very large tables and the server is always busy serving a constant stream of single-row UPDATEs and INSERTs from parallel automated processes. There are fewer than 10 users, as the server is devoted to the KB production system. My questions: 1) Which RAID level would you recommend 2) Which Windows OS would you recommend? (currently 2008 x64 Server) 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) 4) Is this the right PG version for our needs? Thanks, Carlo The details of our use: . The DB hosts a data warehouse and a knowledgebase (KB) tracking the professional information of 1.3M individuals. . The KB tables related to these 130M individuals are naturally also large . The DB is in a perpetual state of serving TCL-scripted Extract, Transform and Load (ETL) processes . These ETL processes typically run 10 at-a-time (i.e. in parallel) . We would like to run more, but the server appears to be the bottleneck . The ETL write processes are 99% single row UPDATEs or INSERTs. . There are few, if any DELETEs . The ETL source data are import tables . The import tables are permanently kept in the data warehouse so that we can trace the original source of any information. . There are 6000+ and counting . The import tables number from dozens to hundreds of thousands of rows. They rarely require more than a pkey index. . Linking the KB to the source import date requires an audit table of 500M rows, and counting. . The size of the audit table makes it very difficult to manage, especially if we need to modify the design. . 
Because we query the audit table different ways to audit the ETL processes decisions, almost every column in the audit table is indexed. . The maximum number of physical users is 10 and these users RARELY perform any kind of write . By contrast, the 10+ ETL processes are writing constantly . We find that internal stats drift, for whatever reason, causing row seq scans instead of index scans. . So far, we have never seen a situation where a seq scan has improved performance, which I would attribute to the size of the tables . We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice. Current non-default conf settings are: autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 250 autovacuum_naptime = 1min autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 500 bgwriter_lru_maxpages = 100 checkpoint_segments = 64 checkpoint_warning = 290 datestyle = 'iso, mdy' default_text_search_config = 'pg_catalog.english' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' log_destination = 'stderr' log_line_prefix = '%t ' logging_collector = on maintenance_work_mem = 16MB max_connections = 200 max_fsm_pages = 204800 max_locks_per_transaction = 128 port = 5432 shared_buffers = 500MB vacuum_cost_delay = 100 work_mem = 512MB
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
I'll bite 1. In general, RAID-10 is the only suitable RAID configuration for a database. The decision making comes in how many drives, and splitting stuff up into LUNs (like putting pg_xlog on its own LUN). 2. None of the above - you're asking the wrong question, really. PostgreSQL is open source, and is developed on Unix. The Windows version is a pretty good port, as Windows ports of OSS stuff go, but it's just that, a port. Your server is being dedicated to running Postgres, so the right question to ask is "What is the best OS for running Postgres?" For any given database engine, regardless of the marketing and support stance, there is only one true primary enterprise OS platform that most big mission critical sites use, and is the best supported and most stable platform for that RDBMS. For Oracle, that's HP-UX (but 10 years ago, it was Solaris). For PostgreSQL, it's Linux. The biggest problem with Postgres on Windows is that it only comes in 32-bit. RAM is the ultimate performance tweak for an RDBMS, and to make proper use of modern amounts of RAM, you need a 64-bit executable. 3. The two choices I'd consider are both Linux: - for the conservative / supported approach, get Red Hat and buy support from them and (e.g.) Enterprise DB - if you plan to keep pretty current and are happy actively managing versions and running locally compiled builds, go with Ubuntu 4. The general wisdom is that there are a lot of improvements from 8.3 to 8.4, but how much benefit you'll see in your environment is another question. If you're building a new system and have to migrate anyway, it seems like a good opportunity to upgrade. Cheers Dave
Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)
On 15/01/2010 6:35 AM, Dave Crooke wrote: I'll bite 1. In general, RAID-10 is the only suitable RAID configuration for a database. The decision making comes in how many drives, and splitting stuff up into LUNs (like putting pg_xlog on its own LUN). The biggest problem with Postgres on Windows is that it only comes in 32-bit. RAM is the ultimate performance tweak for an RDBMS, and to make proper use of modern amounts of RAM, you need a 64-bit executable. though that's much less important for Pg than for most other things, as Pg uses a one-process-per-connection model and lets the OS handle much of the caching. So long as the OS can use all that RAM for caching, Pg will benefit, and it's unlikely you need >2GB for any given client connection or for the postmaster. It's nice to have the flexibility to push up shared_buffers, and it'd be good to avoid any overheads in running 32-bit code on win64. However, it's not that unreasonable to run a 32-bit Pg on a 64-bit OS and expect good performance. You can always go 64-bit once 8.5/9.0 hits and has stabilized, anyway. -- Craig Ringer
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Carlo Stonebanks wrote: 1) Which RAID level would you recommend It looks like you stepped over a critical step, which is "will the server have a good-performing RAID card?" Your whole upgrade could underperform if you make a bad mistake on that part. It's really important to nail that down, and to benchmark to prove you got what you expected from your hardware vendor. 3) If we were to port to a *NIX flavour, which would you recommend? (which support trouble-free PG builds/makes please!) The only platform I consider close to trouble-free as far as the PG builds working without issues is RHEL/CentOS, due to the maturity of the PGDG yum repository and how up to date it's kept. Every time I wander onto another platform I find the lag and care taken in packaging PostgreSQL to be at least a small step down from there. 4) Is this the right PG version for our needs? 8.4 removes the FSM, which takes away a common source of unexpected performance issues when you overflow max_fsm_pages one day. If you're going to deploy 8.3, you need to be more careful to monitor the whole VACUUM process; it's easier to ignore in 8.4 and still get by OK. As far as general code stability goes, I think it's a wash at this point. You might discover a bug in 8.4 that causes a regression, but I think you're just as likely to run into a situation that 8.3 handles badly that's improved in 8.4. Hard to say which will work out better in a really general way. . We believe our requirements are exceptional, and we would benefit immensely from setting up the PG planner to always favour index-oriented decisions - which seems to contradict everything that PG advice suggests as best practice. Pretty much everyone thinks their requirements are exceptional. It's funny how infrequently that's actually true. The techniques that favor index use aren't that unique: collect better stats, set basic parameters correctly, adjust random_page_cost, investigate plans that don't do what you want to figure out why. 
It's easy to say there's something special about your data rather than follow fundamentals here; I'd urge you to avoid doing that. It's more likely than most people think that the real issue is you're feeding the optimizer bad data, which brings us to: Current non-default conf settings are: I don't see effective_cache_size listed there. If that's at the default, I wouldn't be surprised that you're seeing sequential scans instead of indexed ones far too often. max_connections = 200 work_mem = 512MB This is a frightening combination, by the way. -- Greg Smith, 2ndQuadrant, Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com
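To see why Greg calls that combination frightening: work_mem is a per-sort / per-hash allowance, not a server-wide cap, so the theoretical worst case scales with the connection count and with how many such allocations each query makes. A quick illustrative calculation using the posted settings:

```python
# Worst-case memory exposure from the posted config: each backend may use
# up to work_mem per sort/hash operation, and there can be many backends.

work_mem_mb = 512
max_connections = 200
allocations_per_query = 1   # a complex query can use several at once

worst_case_gb = work_mem_mb * max_connections * allocations_per_query / 1024
print("%.0f GB" % worst_case_gb)  # 100 GB -- more than twice the 48 GB of RAM
```

Even if most connections sit idle, the ceiling is well past physical memory, which is why a small work_mem with a high max_connections (or vice versa) is the usual advice.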
Re: [PERFORM] New server to improve performance on our large and busy DB - advice?
Pretty much everyone thinks their requirements are exceptional. It's funny how infrequently that's actually true. The techniques that favor index use aren't that unique: collect better stats, set basic parameters correctly, adjust random_page_cost, investigate plans that don't do what you want to figure out why. It's easy to say there's something special about your data rather than follow fundamentals here; I'd urge you to avoid doing that. It's more likely than most people think that the real issue is you're feeding the optimizer bad data, which brings us to: I understand that. And the answer is usually to go and do an ANALYZE manually (if it isn't this, it will be some dependency on a set-returning stored function we wrote before we could specify the rows and cost). My question is really - why do I need this constant intervention? When we rarely do aggregates, when our queries are (nearly) always single-row queries (and very rarely more than 50 rows) out of tables that have hundreds of thousands to millions of rows, what does it take to NOT have to intervene? Which brings me to your next point: I don't see effective_cache_size listed there. If that's at the default, I wouldn't be surprised that you're seeing sequential scans instead of indexed ones far too often. Nice to know - I suspect someone has been messing around with stuff they don't understand. I do know that after some screwing around they got the server to the point that it wouldn't restart, and they tried to back out the changes until it would. max_connections = 200 work_mem = 512MB This is a frightening combination, by the way. Looks like it's connected to the above issue. The real max connection value is 1/10th of that. Thanks Greg! Carlo