Re: [PERFORM] which Xeon processors don't have the context switching problem
Guillaume Smet wrote:
> On 2/23/07, Geoffrey <[EMAIL PROTECTED]> wrote:
>> As I've heard. We're headed for 8 as soon as possible, but until we get
>> our code ready, we're on 7.4.16.
>
> You should move to at least 8.1 and possibly 8.2. It's not a good idea to
> upgrade only to 8 IMHO.

When I said 8, I meant whatever the latest greatest 8 is. Right now, that
looks like 8.2.3.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. - Benjamin Franklin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose
an index scan if your joining column's datatypes do not match
Re: [PERFORM] pg_trgm performance
Hi Steinar,

On 2/24/07, Steinar H. Gunderson <[EMAIL PROTECTED]> wrote:
> I'm sorry, I can no longer remember where I needed pg_trgm. Simple testing
> of your patch seems to indicate that the GiN version is about 65% _slower_
> (18ms vs. 30ms) for a test data set I found lying around, but I remember
> that on the data set I needed it, the GIST version was a lot slower than
> that (think 300-400ms). The 18 vs. 30ms test is a random Amarok database,
> on 8.2.3 (Debian).

Could you post EXPLAIN ANALYZE for both queries (after 2 or 3 runs)? And if
you can provide EXPLAIN ANALYZE for a couple of searches (short, medium and
long) in both cases, that would be nice too.

The GiN version is currently not selective enough compared to GiST. It
generally finds the matching rows faster, but it has a slower recheck cond,
so it's sometimes interesting (in my case) and sometimes not that
interesting (as seems to be your case).

Thanks.

--
Guillaume
Re: [PERFORM] which Xeon processors don't have the context switching problem
Alvaro Herrera wrote:
> Steinar H. Gunderson wrote:
>> On Fri, Feb 23, 2007 at 02:05:57PM -0500, Geoffrey wrote:
>>> In searching the archives, I can't find any specific info identifying
>>> which Xeon processors don't have this problem.
>>
>> AFAIK the cut-off point is at the Woodcrests. They are overall much
>> better suited to PostgreSQL than the older Xeons were.
>>
>> It's slightly unfortunate that AMD and Intel cling to the Opteron and
>> Xeon names even though they're making significant architecture changes,
>> but that's life, I guess.
>
> AFAIR Intel has been calling their server processors Xeon since Pentium
> Pro's, at least.

Almost. Xeon was the new name for the "Pro" series. Instead of Pentium II
Pro, we got Pentium II Xeon. The whole Pentium Pro line was a server line,
which is why initial Pentium II CPUs were significantly slower for server
apps than the much older PPro (which still runs pg at a reasonable speed if
you have enough of them and a low budget, btw).

//Magnus
Re: [PERFORM] pg_trgm performance
On Sat, Feb 24, 2007 at 12:09:41AM +0100, Guillaume Smet wrote:
> Could you try to see if the GIN implementation of pg_trgm is faster in
> your cases?

I'm sorry, I can no longer remember where I needed pg_trgm. Simple testing
of your patch seems to indicate that the GiN version is about 65% _slower_
(18ms vs. 30ms) for a test data set I found lying around, but I remember
that on the data set I needed it, the GIST version was a lot slower than
that (think 300-400ms). The 18 vs. 30ms test is a random Amarok database,
on 8.2.3 (Debian).

Sorry I couldn't be of more help.

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] pg_trgm performance
Florian, Steinar,

Could you try to see if the GIN implementation of pg_trgm is faster in your
cases? Florian, instead of using WHERE similarity(...) > 0.4, you should
use set_limit (SELECT set_limit(0.4);).

I posted it on -patches and it is available here:
http://people.openwide.fr/~gsmet/postgresql/pg_trgm_gin3.diff . The patch
is against HEAD, but it applies on 8.2 without any problem.

After applying this patch and installing pg_trgm.sql (you should uninstall
pg_trgm before compiling, using the old uninstall script), you can create:
CREATE INDEX idx_table_word ON table USING gin(word gin_trgm_ops);

17 characters is quite long, so I'm not sure it will help you: due to the
GIN implementation it usually has to recheck a high number of rows. But I'd
like to know if it's faster or slower in this case.

If your data are not private and you don't have the time to test it, I can
test it here without any problem.

Thanks.

--
Guillaume
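For readers following along, the workflow Guillaume describes looks roughly
like this. A sketch only: `words`/`word` are placeholder table and column
names, and while `set_limit()` and the `%` similarity operator are standard
pg_trgm, the `gin_trgm_ops` opclass here comes from his patch:

```sql
-- Placeholder names; assumes the patched pg_trgm is installed.
CREATE INDEX idx_words_trgm ON words USING gin (word gin_trgm_ops);

-- Set the similarity threshold instead of filtering with
-- WHERE similarity(word, '...') > 0.4, so the index can be used:
SELECT set_limit(0.4);

-- The % operator returns rows whose similarity exceeds the limit,
-- and is the form that can use the trigram index:
SELECT word FROM words WHERE word % 'postgres';
```

Running EXPLAIN ANALYZE on the last query with the GiST and the GIN index
in turn is the comparison being asked for.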
Re: [PERFORM] long checkpoint_timeout
On Fri, Feb 23, 2007 at 12:23:08PM -0800, Jeff Davis wrote:
> On Fri, 2007-02-23 at 14:02 -0600, Jim C. Nasby wrote:
>>> say that checkpoints cause extra disk I/O. Is there a good way to
>>> measure how much extra I/O (and WAL volume) is caused by the
>>> checkpoints? Also, it would be good to know how much total I/O is
>>> caused by a checkpoint so that I know if bgwriter is doing its job.
>>
>> There's a patch someone just came up with that provides additional
>> debug info about both bgwriter operation and checkpoints. I know it
>> will at least tell you how much was written out by a checkpoint.
>
> Excellent, that would answer a lot of my questions. I did some brief
> searching and nothing turned up. Do you have a link to the discussion or
> the patch?

http://archives.postgresql.org/pgsql-hackers/2007-02/msg01083.php

--
Jim Nasby                    [EMAIL PROTECTED]
EnterpriseDB                 http://enterprisedb.com   512.569.9461 (cell)
Re: [PERFORM] which Xeon processors don't have the context switching problem
On 2/23/07, Geoffrey <[EMAIL PROTECTED]> wrote:
> As I've heard. We're headed for 8 as soon as possible, but until we get
> our code ready, we're on 7.4.16.

You should move to at least 8.1 and possibly 8.2. It's not a good idea to
upgrade only to 8 IMHO.

--
Guillaume
Re: [PERFORM] which Xeon processors don't have the context switching problem
On 2/23/07, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> Also isn't it pretty much *not* a problem with current versions of
> PostgreSQL?

We had a really *big* scalability problem with a quad Xeon MP 2.2 and
PostgreSQL 7.4. The problem is mostly gone since we upgraded to 8.1 a year
ago.

Woodcrest seems to perform really well with PostgreSQL according to what I
can read on the Internet, so we will probably change the server for a dual
Woodcrest in a few months.

--
Guillaume
Re: [PERFORM] which Xeon processors don't have the context switching problem
Joshua D. Drake wrote:
> Josh Berkus wrote:
>> Geoffrey,
>>
>>> I recall a reference on the list indicating that newer Xeon processors
>>> don't suffer from the context switching problem reported last year.
>>
>> Just to be clear, it's a software problem which affects all
>> architectures, including AMD and Sparc. It's just *worse* on the PIII
>> and P4 generation Xeons.
>
> Also isn't it pretty much *not* a problem with current versions of
> PostgreSQL?

As I've heard. We're headed for 8 as soon as possible, but until we get our
code ready, we're on 7.4.16.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. - Benjamin Franklin
Re: [PERFORM] which Xeon processors don't have the context switching problem
Josh Berkus wrote:
> Geoffrey,
>
>> I recall a reference on the list indicating that newer Xeon processors
>> don't suffer from the context switching problem reported last year.
>
> Just to be clear, it's a software problem which affects all
> architectures, including AMD and Sparc. It's just *worse* on the PIII
> and P4 generation Xeons.

Also isn't it pretty much *not* a problem with current versions of
PostgreSQL?

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [PERFORM] which Xeon processors don't have the context switching problem
Josh Berkus wrote:
> Geoffrey,
>
>> I recall a reference on the list indicating that newer Xeon processors
>> don't suffer from the context switching problem reported last year.
>
> Just to be clear, it's a software problem which affects all
> architectures, including AMD and Sparc. It's just *worse* on the PIII
> and P4 generation Xeons.

Thanks, that's what I need to hear. They've since cut a deal for Opteron
based hardware, so the point is now moot.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. - Benjamin Franklin
Re: [PERFORM] which Xeon processors don't have the context switching problem
Geoffrey,

> I recall a reference on the list indicating that newer Xeon processors
> don't suffer from the context switching problem reported last year.

Just to be clear, it's a software problem which affects all architectures,
including AMD and Sparc. It's just *worse* on the PIII and P4 generation
Xeons.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] long checkpoint_timeout
On Fri, 2007-02-23 at 14:02 -0600, Jim C. Nasby wrote:
>> say that checkpoints cause extra disk I/O. Is there a good way to
>> measure how much extra I/O (and WAL volume) is caused by the
>> checkpoints? Also, it would be good to know how much total I/O is
>> caused by a checkpoint so that I know if bgwriter is doing its job.
>
> There's a patch someone just came up with that provides additional debug
> info about both bgwriter operation and checkpoints. I know it will at
> least tell you how much was written out by a checkpoint.

Excellent, that would answer a lot of my questions. I did some brief
searching and nothing turned up. Do you have a link to the discussion or
the patch?

Regards,
Jeff Davis
Re: [PERFORM] long checkpoint_timeout
On Fri, Feb 23, 2007 at 10:14:29AM -0800, Jeff Davis wrote:
> The postgresql.conf says that the maximum checkpoint_timeout is 1 hour.
> However, the following messages seem to suggest that it may be useful to
> set the value significantly higher to reduce unnecessary WAL volume:
>
> http://archives.postgresql.org/pgsql-hackers/2006-10/msg00527.php
> http://archives.postgresql.org/pgsql-hackers/2006-08/msg01190.php
>
> Is there a reason for the hour-long limit on checkpoint_timeout? Is
> there a cost to doing so, aside from potentially longer recovery time?
>
> As I understand it, the background writer keeps the I/O more balanced
> anyway, avoiding I/O spikes at checkpoint.
>
> I don't need the checkpoint time to be higher than 1 hour, but I'm
> trying to understand the reasoning behind the limit and the implications
> of a longer checkpoint_timeout.
>
> The docs here:
>
> http://www.postgresql.org/docs/current/static/wal-configuration.html
>
> say that checkpoints cause extra disk I/O. Is there a good way to
> measure how much extra I/O (and WAL volume) is caused by the
> checkpoints? Also, it would be good to know how much total I/O is caused
> by a checkpoint so that I know if bgwriter is doing its job.

There's a patch someone just came up with that provides additional debug
info about both bgwriter operation and checkpoints. I know it will at
least tell you how much was written out by a checkpoint.

--
Jim Nasby                    [EMAIL PROTECTED]
EnterpriseDB                 http://enterprisedb.com   512.569.9461 (cell)
Re: [PERFORM] which Xeon processors don't have the context switching problem
On Fri, Feb 23, 2007 at 04:53:18PM -0300, Alvaro Herrera wrote:
>> It's slightly unfortunate that AMD and Intel cling to the Opteron and
>> Xeon names even though they're making significant architecture changes,
>> but that's life, I guess.
>
> AFAIR Intel has been calling their server processors Xeon since Pentium
> Pro's, at least.

Yes, that was sort of my point. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] which Xeon processors don't have the context switching problem
Steinar H. Gunderson wrote:
> On Fri, Feb 23, 2007 at 02:05:57PM -0500, Geoffrey wrote:
>> In searching the archives, I can't find any specific info identifying
>> which Xeon processors don't have this problem.
>
> AFAIK the cut-off point is at the Woodcrests. They are overall much
> better suited to PostgreSQL than the older Xeons were.
>
> It's slightly unfortunate that AMD and Intel cling to the Opteron and
> Xeon names even though they're making significant architecture changes,
> but that's life, I guess.

AFAIR Intel has been calling their server processors Xeon since Pentium
Pro's, at least.

--
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] which Xeon processors don't have the context switching problem
On Fri, Feb 23, 2007 at 02:05:57PM -0500, Geoffrey wrote:
> In searching the archives, I can't find any specific info identifying
> which Xeon processors don't have this problem.

AFAIK the cut-off point is at the Woodcrests. They are overall much better
suited to PostgreSQL than the older Xeons were.

It's slightly unfortunate that AMD and Intel cling to the Opteron and Xeon
names even though they're making significant architecture changes, but
that's life, I guess.

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] which Xeon processors don't have the context switching problem
> I recall a reference on the list indicating that newer Xeon processors
> don't suffer from the context switching problem reported last year. In
> searching the archives, I can't find any specific info identifying which
> Xeon processors don't have this problem. Anyone point me to a reference?

We recently migrated to a Woodcrest @ 3 GHz from a 2 GHz Opteron. The
Woodcrest seems to be enjoying doing db-related work. I don't have numbers
other than load is much lower now.

> Is this in any way related to the version of Postgresql one is running?
> We're headed for 8, but have a bit of work before we can get there. We
> are currently on 7.4.16.

We are at 7.4.14, which works fine atm.

regards
Claus
[PERFORM] which Xeon processors don't have the context switching problem
I recall a reference on the list indicating that newer Xeon processors
don't suffer from the context switching problem reported last year. In
searching the archives, I can't find any specific info identifying which
Xeon processors don't have this problem. Anyone point me to a reference?

Is this in any way related to the version of Postgresql one is running?
We're headed for 8, but have a bit of work before we can get there. We are
currently on 7.4.16.

Thanks for any info.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. - Benjamin Franklin
[PERFORM] long checkpoint_timeout
The postgresql.conf says that the maximum checkpoint_timeout is 1 hour.
However, the following messages seem to suggest that it may be useful to
set the value significantly higher to reduce unnecessary WAL volume:

http://archives.postgresql.org/pgsql-hackers/2006-10/msg00527.php
http://archives.postgresql.org/pgsql-hackers/2006-08/msg01190.php

Is there a reason for the hour-long limit on checkpoint_timeout? Is there a
cost to doing so, aside from potentially longer recovery time?

As I understand it, the background writer keeps the I/O more balanced
anyway, avoiding I/O spikes at checkpoint.

I don't need the checkpoint time to be higher than 1 hour, but I'm trying
to understand the reasoning behind the limit and the implications of a
longer checkpoint_timeout.

The docs here:

http://www.postgresql.org/docs/current/static/wal-configuration.html

say that checkpoints cause extra disk I/O. Is there a good way to measure
how much extra I/O (and WAL volume) is caused by the checkpoints? Also, it
would be good to know how much total I/O is caused by a checkpoint so that
I know if bgwriter is doing its job.

Regards,
Jeff Davis
Re: [PERFORM] Recommended Initial Settings
In response to "Campbell, Lance" <[EMAIL PROTECTED]>:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating-system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that since
> Linux would be caching the data files for the postgres database it would
> be redundant to have a large shared_buffers. Did I understand you
> correctly?

Keep in mind that keeping the data in the kernel's buffer requires Postgres
to make a syscall to read a file, which the kernel then realizes is cached
in memory. The kernel then has to make that data available to the Postgres
(userland) process.

If the data is in Postgres' buffers, Postgres can fetch it directly, thus
avoiding the overhead of the syscalls and the kernel activity. You still
have to make sysvshm calls, though.

So, it depends on which is able to manage the memory better. Is the kernel
so much more efficient that it makes up for the overhead of the syscalls?
My understanding is that in recent versions of Postgres, this is not the
case, and large shared_buffers improve performance. I've yet to do any
in-depth testing on this, though.

--
Bill Moran
Collaborative Fusion Inc.
Re: [PERFORM] Very slow bytea data extraction
> are you getting the data from the local box or from a remote site?

Everything is on the local box.

> also explain analyze is showing nothing slow but you did not post the
> entire output. also, try the \timing switch in psql.

Actually a line was missing: Total runtime: 0.337 ms.

Massimo
Re: [PERFORM] Very slow bytea data extraction
> If you look at the "actual time" it's completing very quickly indeed. So
> - it must be something to do with either:
> 1. Fetching/formatting the data
> 2. Transferring the data to the client.

I do agree.

> What happens if you only select half the rows? Does the time to run the
> select halve?

Yes, it does. Using pgAdmin, the time to get all 36 rows is about 67500ms
while it's 24235ms to get only 18 rows.

Massimo
Re: [PERFORM] Recommended Initial Settings
If you're doing much updating at all you'll also want to bump up
checkpoint_segments. I like setting checkpoint_warning just a bit under
checkpoint_timeout as a way to monitor how often you're checkpointing due
to running out of segments.

With a large shared_buffers you'll likely need to make the bgwriter more
aggressive as well (increase the max_pages numbers), though how important
that is depends on how much updating you're doing. If you see periodic
spikes in I/O corresponding to checkpoints, that's an indication bgwriter
isn't doing a good enough job.

If everything ends up in memory, it might be good to decrease
random_page_cost to 1 or something close to it, though the database should
just rely on effective_cache_size to figure out that everything's in
memory.

If you're on pre-8.2, you'll want to cut all the autovacuum parameters in
half, if you're using it.

--
Jim Nasby                    [EMAIL PROTECTED]
EnterpriseDB                 http://enterprisedb.com   512.569.9461 (cell)
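In postgresql.conf terms, Jim's suggestions look roughly like the sketch
below. The parameter names are the real 8.1/8.2-era GUCs, but every value
is an illustrative starting point, not a recommendation for any particular
workload:

```
# Illustrative values only -- tune against your own workload.
checkpoint_segments   = 32    # more WAL between checkpoints (default 3)
checkpoint_timeout    = 900   # seconds
checkpoint_warning    = 870   # just under checkpoint_timeout, so a log
                              # warning appears when segments run out first
bgwriter_lru_maxpages = 100   # make the bgwriter more aggressive
bgwriter_all_maxpages = 600
random_page_cost      = 1.5   # only if the database truly fits in RAM
```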
Re: [PERFORM] Using the 8.2 autovacuum values with 8.1
On Fri, Feb 23, 2007 at 10:13:31AM +0100, Csaba Nagy wrote:
> You likely don't need the nightly full vacuum run... we also do here a
> nightly vacuum beside autovacuum, but not a full one, only for tables
> which are big enough that we don't want autovacuum to touch them in high
> business time but have enough change that we want a vacuum on them
> frequently enough. I discover them by checking the stats, for example:

Something else I like doing is a periodic vacuumdb -av, capturing the
output. It's a good way to keep an eye on FSM utilization. Once you've got
vacuuming under control you can probably just do that once a month or so.

--
Jim Nasby                    [EMAIL PROTECTED]
EnterpriseDB                 http://enterprisedb.com   512.569.9461 (cell)
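A minimal sketch of that capture-and-check routine. The `vacuumdb -av`
invocation is the real command; the log path and the sample output lines
are made up for illustration, since the FSM summary numbers depend entirely
on your cluster:

```shell
# In practice you would capture the real output, e.g.:
#   vacuumdb -av > /tmp/vacuum.log 2>&1
# Here we fake a log tail with illustrative numbers so the check can be shown.
cat > /tmp/vacuum.log <<'EOF'
INFO:  free space map contains 114668 pages in 382 relations
DETAIL:  A total of 120000 page slots are in use (including overhead).
EOF

# The FSM summary is what you watch over time; if pages in use approach
# max_fsm_pages, the free space map settings need raising.
grep -i "free space map" /tmp/vacuum.log
```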
Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> Richard,
> Thanks for your reply.
>
> You said:
> "Your operating-system should be doing the caching for you."
>
> My understanding is that as long as Linux has memory available it will
> cache files. Then from your comment I get the impression that since
> Linux would be caching the data files for the postgres database it would
> be redundant to have a large shared_buffers. Did I understand you
> correctly?

That's right - PG works with the O.S. This means it *might* not be a big
advantage to have a large shared_buffers. On older versions of PG, the
buffer management code wasn't great with large shared_buffers values
either.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Recommended Initial Settings
Richard,
Thanks for your reply.

You said:
"Your operating-system should be doing the caching for you."

My understanding is that as long as Linux has memory available it will
cache files. Then from your comment I get the impression that since Linux
would be caching the data files for the postgres database it would be
redundant to have a large shared_buffers. Did I understand you correctly?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: Richard Huxton [mailto:[EMAIL PROTECTED]]
Sent: Friday, February 23, 2007 10:29 AM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Recommended Initial Settings

Campbell, Lance wrote:
> I would like to get someone's recommendations on the best initial
> settings for a dedicated PostgreSQL server. I do realize that there are
> a lot of factors that influence how one should configure a database. I
> am just looking for a good starting point. Ideally I would like the
> database to reside as much as possible in memory with no disk access.
> The current database size of my 7.x version of PostgreSQL generates a 6
> Gig file when doing a database dump.

Your operating-system should be doing the caching for you.

> Dedicated PostgreSQL 8.2 Server
> Redhat Linux 4.x AS 64 bit version (EM64T)
> 4 Intel Xeon Processors

If these are older Xeons, check the mailing list archives for "xeon
context switch".

> 20 Gig Memory
> Current PostgreSQL database is 6 Gig file when doing a database dump

OK, so it's plausible the whole thing will fit in RAM (as a rule of thumb
I assume headers, indexes etc. triple or quadruple the size). To know
better, check the actual disk usage of $PGDATA.

> /etc/sysctl.conf file settings:
>
> # 11 Gig
> kernel.shmmax = 11811160064

Hmm - that's a lot of shared RAM. See shared_buffers below.

> kernel.sem = 250 32000 100 128
> net.ipv4.ip_local_port_range = 1024 65000
> net.core.rmem_default = 262144
> net.core.rmem_max = 262144
> net.core.wmem_default = 262144
> net.core.wmem_max = 262144

> postgresql.conf file settings (if not listed then I used the defaults):
>
> max_connections = 300

How many connections do you expect typically/peak? It doesn't cost much
to have max_connections set high, but your workload is the most important
thing missing from your question.

> shared_buffers = 10240MB

For 7.x that's probably way too big, but 8.x organises its buffers
better. I'd still be tempted to start at 1 or 2GB and work up - see where
it stops buying you an improvement.

> work_mem = 10MB

If you have large queries doing big sorts, I'd increase this. Don't
forget it's per-sort, so if you have about 300 connections live at any
one time that could be 300*10MB*N if they're all doing something
complicated. If you only have one connection live, you can increase this
quite substantially.

> effective_cache_size = 512MB

This isn't setting PG's memory usage, it's telling PG how much data your
operating-system is caching. Check "free" and see what it says. For you,
I'd expect 10GB+.

> maintenance_work_mem = 100MB

This is for admin-related tasks, so you could probably increase it.

Workload workload workload - we need to know what you're doing with it.
One connection summarising the entire database will want larger numbers
than 100 connections running many small queries.

HTH
--
Richard Huxton
Archonet Ltd
Re: R: [PERFORM] Very slow bytea data extraction
On 2/23/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Thanks for your reply,
>
>> Is it in executing the query (what does EXPLAIN ANALYSE show)?
>
> Here is the output of explain analyze SELECT * FROM "FILE"
>
> "Seq Scan on "FILE" (cost=0.00..1.36 rows=36 width=235) (actual
> time=0.023..0.107 rows=36 loops=1)"
>
>> How are you accessing the database: odbc,jdbc,other?
>> Does it do this with psql too?
>
> The problem is the same when I access the db with jdbc, pgAdmin and even
> psql

are you getting the data from the local box or from a remote site? also,
explain analyze is showing nothing slow, but you did not post the entire
output. also, try the \timing switch in psql.

merlin
Re: [PERFORM] Recommended Initial Settings
Campbell, Lance wrote:
> I would like to get someone's recommendations on the best initial
> settings for a dedicated PostgreSQL server. I do realize that there are
> a lot of factors that influence how one should configure a database. I
> am just looking for a good starting point. Ideally I would like the
> database to reside as much as possible in memory with no disk access.
> The current database size of my 7.x version of PostgreSQL generates a 6
> Gig file when doing a database dump.

Your operating-system should be doing the caching for you.

> Dedicated PostgreSQL 8.2 Server
> Redhat Linux 4.x AS 64 bit version (EM64T)
> 4 Intel Xeon Processors

If these are older Xeons, check the mailing list archives for "xeon
context switch".

> 20 Gig Memory
> Current PostgreSQL database is 6 Gig file when doing a database dump

OK, so it's plausible the whole thing will fit in RAM (as a rule of thumb I
assume headers, indexes etc. triple or quadruple the size). To know better,
check the actual disk usage of $PGDATA.

> /etc/sysctl.conf file settings:
>
> # 11 Gig
> kernel.shmmax = 11811160064

Hmm - that's a lot of shared RAM. See shared_buffers below.

> kernel.sem = 250 32000 100 128
> net.ipv4.ip_local_port_range = 1024 65000
> net.core.rmem_default = 262144
> net.core.rmem_max = 262144
> net.core.wmem_default = 262144
> net.core.wmem_max = 262144

> postgresql.conf file settings (if not listed then I used the defaults):
>
> max_connections = 300

How many connections do you expect typically/peak? It doesn't cost much to
have max_connections set high, but your workload is the most important
thing missing from your question.

> shared_buffers = 10240MB

For 7.x that's probably way too big, but 8.x organises its buffers better.
I'd still be tempted to start at 1 or 2GB and work up - see where it stops
buying you an improvement.

> work_mem = 10MB

If you have large queries doing big sorts, I'd increase this. Don't forget
it's per-sort, so if you have about 300 connections live at any one time
that could be 300*10MB*N if they're all doing something complicated. If you
only have one connection live, you can increase this quite substantially.

> effective_cache_size = 512MB

This isn't setting PG's memory usage, it's telling PG how much data your
operating-system is caching. Check "free" and see what it says. For you,
I'd expect 10GB+.

> maintenance_work_mem = 100MB

This is for admin-related tasks, so you could probably increase it.

Workload workload workload - we need to know what you're doing with it. One
connection summarising the entire database will want larger numbers than
100 connections running many small queries.

HTH
--
Richard Huxton
Archonet Ltd
[PERFORM] Recommended Initial Settings
I would like to get someone's recommendations on the best initial settings
for a dedicated PostgreSQL server. I do realize that there are a lot of
factors that influence how one should configure a database. I am just
looking for a good starting point. Ideally I would like the database to
reside as much as possible in memory with no disk access. The current
database size of my 7.x version of PostgreSQL generates a 6 Gig file when
doing a database dump.

Dedicated PostgreSQL 8.2 Server
Redhat Linux 4.x AS 64 bit version (EM64T)
4 Intel Xeon Processors
20 Gig Memory
Current PostgreSQL database is 6 Gig file when doing a database dump

/etc/sysctl.conf file settings:

# 11 Gig
kernel.shmmax = 11811160064
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144

postgresql.conf file settings (if not listed then I used the defaults):

max_connections = 300
shared_buffers = 10240MB
work_mem = 10MB
effective_cache_size = 512MB
maintenance_work_mem = 100MB

Any suggestions would be appreciated!

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
Re: R: [PERFORM] Very slow bytea data extraction
[EMAIL PROTECTED] wrote:
> Thanks for your reply,
>
> > Is it in executing the query (what does EXPLAIN ANALYSE show)?
>
> Here is the output of EXPLAIN ANALYZE SELECT * FROM "FILE":
>
> "Seq Scan on "FILE" (cost=0.00..1.36 rows=36 width=235) (actual time=0.023..0.107 rows=36 loops=1)"

If you look at the "actual time", it's completing very quickly indeed. So it must be something to do with either:
1. Fetching/formatting the data
2. Transferring the data to the client.

What happens if you only select half the rows? Does the time to run the select halve?

--
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
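The half-the-rows test suggested above could be run from psql along these lines (a sketch; LIMIT is used here as a simple way to fetch roughly half of the 36 rows):

```sql
-- In psql, enable client-side timing first:
-- \timing
SELECT * FROM "FILE";            -- all 36 rows
SELECT * FROM "FILE" LIMIT 18;   -- roughly half the rows
-- If the second query takes about half as long, the time is going into
-- transferring/formatting the bytea data, not into executing the query.
```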
R: [PERFORM] Very slow bytea data extraction
Thanks for your reply,

> Is it in executing the query (what does EXPLAIN ANALYSE show)?

Here is the output of EXPLAIN ANALYZE SELECT * FROM "FILE":

"Seq Scan on "FILE" (cost=0.00..1.36 rows=36 width=235) (actual time=0.023..0.107 rows=36 loops=1)"

> How are you accessing the database: odbc, jdbc, other?
> Does it do this with psql too?

The problem is the same when I access the db with jdbc, pgAdmin and even psql.

Massimo

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Very slow bytea data extraction
[EMAIL PROTECTED] wrote:
> Hi all, I'm using Postgresql 8.2.3 on a Windows XP system. I need to write and retrieve bytea data from a table. The problem is that, while data insertion is quite fast, bytea extraction is very slow. I'm trying to store a 250KB image into the bytea field. A simple select query on a 36-row table takes more than one minute to execute.

Where is the problem?
- Is it in executing the query (what does EXPLAIN ANALYSE show)?
- Is it in fetching/formatting the data (what does the equivalent COUNT(*) show)?
- How are you accessing the database: odbc, jdbc, other?
- Does it do this with psql too?

--
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 6: explain analyze is your friend
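One quick way to see how much bytea data the select actually has to ship to the client (a sketch - "data" stands in for the real name of the bytea column, which isn't given in the thread):

```sql
-- Total payload the query must transfer; 36 rows x ~250KB is ~9MB,
-- which is trivial to scan but can be slow to encode and transfer.
SELECT count(*)                                        AS rows,
       pg_size_pretty(sum(octet_length(data))::bigint) AS total_bytea
FROM "FILE";
```

If the total is in the megabytes, the minute-long runtime points at client-side escaping/transfer rather than at the query itself.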
[PERFORM] Very slow bytea data extraction
Hi all, I'm using Postgresql 8.2.3 on a Windows XP system. I need to write and retrieve bytea data from a table. The problem is that, while data insertion is quite fast, bytea extraction is very slow. I'm trying to store a 250KB image into the bytea field. A simple select query on a 36-row table takes more than one minute to execute. Any help would be very appreciated Thanks in advance Massimo ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Using the 8.2 autovacuum values with 8.1
On Thu, 2007-02-22 at 22:53, Mark Stosberg wrote: > Thanks to everyone for the feedback about vacuuming. It's been very > useful. The pointers to the pgstattuple and Pgfouine tools were also > helpful. > > I'm now considering the following plan for trying Autovacuuming again > with 8.1. I'd like any peer review you have to offer of the following: > > 1. First, I'll move the settings to match the defaults in 8.2. The ones > I noticed in particular were: > > autovacuum_vacuum_threshold changes: 1000 -> 500 > autovacuum_analyze_threshold changes: 500 -> 250 > autovacuum_vacuum_scale_factor changes: .4 -> .2 > autovacuum_analyze_scale_factor changes: .2 -> .1 > > 2. Try the vacuum cost delay feature, starting with a 20ms value: > > autovacuum_vacuum_cost_delay = 20 > > 3. Immediately add a row to pg_autovacuum for a huge logging table that > would be too slow to vacuum usually. We'll still vacuum it once a week > for good measure by cron. > > 4. For good measure, I think I'll still keep the nightly cron entry that > does a complete vacuum analyze (except for that large table...). > > Seem like a reasonable plan? You likely don't need the nightly full vacuum run... we also do a nightly vacuum here beside autovacuum, but not a full one - only for tables which are big enough that we don't want autovacuum to touch them during peak business hours, but which have enough churn that we want them vacuumed frequently enough. 
I discover them by checking the stats, for example:

SELECT c.relname,
       c.reltuples::bigint AS rowcnt,
       pg_stat_get_tuples_inserted(c.oid) AS inserted,
       pg_stat_get_tuples_updated(c.oid) AS updated,
       pg_stat_get_tuples_deleted(c.oid) AS deleted
FROM pg_class c
WHERE c.relkind = 'r'::"char"
GROUP BY c.oid, c.relname, c.reltuples
HAVING pg_stat_get_tuples_updated(c.oid) + pg_stat_get_tuples_deleted(c.oid) > 1000
ORDER BY pg_stat_get_tuples_updated(c.oid) + pg_stat_get_tuples_deleted(c.oid) DESC;

The top tables in this list for which (deleted + updated) / rowcnt is relatively small but still significant need your attention for the nightly vacuum... the rest is handled just fine by autovacuum. On the other end of the scale, if you have tables for which the deletion/update rate is way higher than the row count, that's likely a hot-spot table which probably needs extra vacuuming during the day.

Cheers,
Csaba.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
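For step 3 of the plan quoted earlier in the thread, the per-table override in 8.1's pg_autovacuum catalog would look something like the following. This is a sketch - 'big_log_table' is a hypothetical table name, and you should verify the column list against your 8.1 catalogs; a value of -1 in the threshold/cost columns means "use the global default":

```sql
-- Tell 8.1's autovacuum to skip a huge logging table entirely
-- (it gets vacuumed once a week by cron instead).
INSERT INTO pg_autovacuum
       (vacrelid, enabled,
        vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor,
        vac_cost_delay, vac_cost_limit)
VALUES ('big_log_table'::regclass, false,
        -1, -1, -1, -1, -1, -1);
```

Unlike 8.2's defaults, this catalog has no ALTER TABLE syntax in 8.1, so a plain INSERT (or UPDATE, if a row already exists) is the usual approach.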