Re: [PERFORM] LIKE search and performance
[EMAIL PROTECTED] wrote:
>> And since it's basically impossible to know the selectivity of this kind of where condition, I doubt the planner would ever realistically want to choose that plan anyway because of its poor worst-case behavior.
> What is a real life example where an intelligent and researched database application would issue a like or ilike query as their primary condition in a situation where they expected very high selectivity?
> Avoiding a poor worst-case behaviour for a worst-case behaviour that won't happen doesn't seem practical.

But if you are also filtering on e.g. date, and that has an index with good selectivity, you're never going to use the text index anyway are you? If you've only got a dozen rows to check against, might as well just read them in.

The only time it's worth considering the behaviour at all is *if* the worst-case is possible.

--
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Hi *,
for caching large autogenerated XML files, I have created a bytea table in my database so that the cached files can be used by multiple servers. There are about 500 rows and 10-20 updates per minute on the table. The files stored in the bytea are anything from 10kB to 10MB. My PostgreSQL version is 8.0.13 on Gentoo Linux (x86) with PostGIS 1.2.0.

For vacuum I use the pg_autovacuum daemon. It decided to vacuum my cache table about every 3 hours, and the vacuum process takes 20-30 minutes (oops!) every time.

Now my big big problem is that the database gets really really slow during these 20 minutes, and after the vacuum process has been running for a short time, many transactions show state "UPDATE waiting" in the process list. In my Java application server I sometimes get tons of deadlock exceptions (waiting on ShareLock blahblah). The web frontend gets nearly unusable, logging in takes more than 60 seconds, etc. etc.

Under normal circumstances my application is really fast, and vacuuming other tables is no problem; only the bytea table is really awkward. I hope some of you performance cracks can help me...

This is my table definition:

 Table »public.binary_cache«
  Column  |            Type             | Attributes
----------+-----------------------------+------------
 cache_id | bigint                      | not null
 date     | timestamp without time zone |
 data     | bytea                       |
Indexes:
    »binary_cache_pkey« PRIMARY KEY, btree (cache_id)

Thanks in advance for any hints!

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40 - 67957171
mobil +49 - 179 - 4826359

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] general PG network slowness (possible cure) (repost)
I set up pg to replace a plain gdbm database for my application. But even running to the same machine, via a unix socket * the pg database ran 100 times slower Across the net it was * about 500 to 1000 times slower than local gdbm with no cpu use to speak of. I'd heard that networked databases are slow. I might have left it at that if curiosity hadn't led me to write a network server for gdbm databases, and talk to _that_ just to get a comparison. Lo and behold and smack me with a corncob if it wasn't _slower_ than pg. On a whim I mapped the network bandwidth per packet size with the NPtcp suite, and got surprising answers .. at 1500B, naturally, the bandwidth was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little local net. At 100B the bandwidth available was only 25Kb/s. At 10B, you might as well use tin cans and taut string instead. I also mapped the network flows using ntop, and yes, the average packet size for both gdbm and pg in one direction was only about 100B or so. That's it! Clearly there are a lot of short queries going out and the answers were none too big either ( I had a LIMIT 1 in all my PG queries). About 75% of traffic was in the 64-128B range while my application was running, with the peak bandwidth in that range being about 75-125Kb/s (and I do mean bits, not bytes). S ... I took a look at my implementation of remote gdbm, and did a very little work to aggregate outgoing transmissions together into lumps. Three lines added in two places. At the level of the protocol where I could tell how long the immediate conversation segment would be, I "corked" the tcp socket before starting the segment and "uncorked" it after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in linux). Surprise, ... I got a speed up of hundreds of times. The same application that crawled under my original rgdbm implementation and under PG now maxed out the network bandwidth at close to a full 10Mb/s and 1200 pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz server. So * Is that what is holding up postgres over the net too? Lots of tiny packets? And if so * can one fix it the way I fixed it for remote gdbm? The speedup was hundreds of times. Can someone point me at the relevant bits of pg code? A quick look seems to say that fe-*.c is interesting. I need to find where the actual read and write on the conn->sock is done. Very illuminating gnuplot outputs available on request. Peter ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] How PostgreSQL handles multiple DDBB instances?
Hi all, I have a doubt/problem about how PostgreSQL handles multiple DDBB instances running on a same server and how I should design the architecture of an application. I have an application that works with multiple customers. Thinking in scalability we are thinking in applying the following approaches: - Create a separate database instance for each customer. - We think that customer's DB will be quite small, about 200MB as average. - The number of clients, then DDBB, can be significant(thousands). - Have as many customers as possible on the same server, so a single server could have more than 300 DDBB instances. Do you think this makes sense? or taking into account that the expected DDBB size, would be better to join several customers DDBB in just one instance. What I'm worried about is, if having so many DDBB instances PostgreSQL's performance would be worse. I have been following the list and one of the advises that appears more often is keep your DB in memory, so if I have just one instance instead of "hundreds" the performance will be better? Thank you very much -- Arnau ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Big problem with sql update operation
Recently I've written a few PL/pgSQL procedures that generate invoices and store them in a Postgres table. A small test has shown that there is a performance problem. I thought that string operations in PL/pgSQL might not be perfect, but it turned out that 90% of the time is wasted on a very simple update. Below is my simplified procedure:

CREATE TABLE group_fin_account_tst (
    group_fin_account_tst_id BIGSERIAL PRIMARY KEY,
    credit NUMERIC(8,2) DEFAULT 0.00 NOT NULL
);
ALTER TABLE group_fin_account_tst OWNER TO freeconetadm;

INSERT INTO group_fin_account_tst (credit) VALUES (4);

CREATE OR REPLACE FUNCTION test()
RETURNS void AS
$BODY$
DECLARE
BEGIN
    FOR v_i IN 1..4000 LOOP
        UPDATE group_fin_account_tst SET
            credit = v_i
        WHERE group_fin_account_tst_id = 1;  -- for the real procedure I update different rows
    END LOOP;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION test() OWNER TO freeconetadm;

select test();

The strange thing is how the program behaves when I increase the number of iterations. Below are my results (where u/s is the number of updates per second).

On Windows:
500   - 0.3s (1666 u/s)
1000  - 0.7s (1428 u/s)
2000  - 2.3s (869 u/s)
4000  - 9s   (444 u/s)
8000  - 29s  (275 u/s)
16000 - 114s (140 u/s)

On Linux:
500   - 0.5s (1000 u/s)
1000  - 1.8s (555 u/s)
2000  - 7.0s (285 u/s)
4000  - 26s  (153 u/s)
8000  - 101s (79 u/s)
16000 - 400s (40 u/s)

On both systems the relation between the number of iterations and the time is strongly nonlinear! Do you know what the problem is? Is it possible to commit a transaction inside a PL/pgSQL procedure, because I think that maybe the transaction is too long?

Regards
Michal Szymanski
http://blog.szymanskich.net

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Bastian Voigt wrote: OK, I'll give that a try. What about pg_autovacuum then? Is it a problem when two processes try to vacuum the same table in parallel? Or do I need to deactivate autovacuum altogether? I was about to say that you can tune pg_autovacuum, but I just checked your original post and you're running 8.0.x - not sure about that one. You'll have to check the documentation for that version to see if you can either: 1. exclude that table from pg_autovacuum 2. increase pg_autovacuum's sensitivity If not, and this table is the most active, it might be simpler just to run your own vacuum-ing from a cron job. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] general PG network slowness (possible cure) (repost)
Peter T. Breuer wrote: I set up pg to replace a plain gdbm database for my application. But even running to the same machine, via a unix socket * the pg database ran 100 times slower For what operations? Bulk reads? 19-way joins? Across the net it was * about 500 to 1000 times slower than local gdbm with no cpu use to speak of. Disk-intensive or memory intensive? I'd heard that networked databases are slow. I might have left it at that if curiosity hadn't led me to write a network server for gdbm databases, and talk to _that_ just to get a comparison. Lo and behold and smack me with a corncob if it wasn't _slower_ than pg. On a whim I mapped the network bandwidth per packet size with the NPtcp suite, and got surprising answers .. at 1500B, naturally, the bandwidth was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little local net. At 100B the bandwidth available was only 25Kb/s. At 10B, you might as well use tin cans and taut string instead. This sounds like you're testing a single connection. You would expect "dead time" to dominate in that scenario. What happens when you have 50 simultaneous connections? Or do you think it's just packet overhead? I also mapped the network flows using ntop, and yes, the average packet size for both gdbm and pg in one direction was only about 100B or so. That's it! Clearly there are a lot of short queries going out and the answers were none too big either ( I had a LIMIT 1 in all my PG queries). I'm not sure that 100B query-results are usually the bottleneck. Why would you have LIMIT 1 on all your queries? About 75% of traffic was in the 64-128B range while my application was running, with the peak bandwidth in that range being about 75-125Kb/s (and I do mean bits, not bytes). None of this sounds like typical database traffic to me. Yes, there are lots of small result-sets, but there are also typically larger (several kilobytes) to much larger (10s-100s KB). S ... I took a look at my implementation of remote gdbm, and did a very little work to aggregate outgoing transmissions together into lumps. Three lines added in two places. At the level of the protocol where I could tell how long the immediate conversation segment would be, I "corked" the tcp socket before starting the segment and "uncorked" it after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in linux). I'm a bit puzzled, because I'd have thought the standard Nagle algorithm would manage this gracefully enough for short-query cases. There's no way (that I know of) for a backend to handle more than one query at a time. Surprise, ... I got a speed up of hundreds of times. The same application that crawled under my original rgdbm implementation and under PG now maxed out the network bandwidth at close to a full 10Mb/s and 1200 pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz server. So * Is that what is holding up postgres over the net too? Lots of tiny packets? I'm not sure your setup is typical, interesting though the figures are. Google a bit for pg_bench perhaps and see if you can reproduce the effect with a more typical load. I'd be interested in being proved wrong. And if so * can one fix it the way I fixed it for remote gdbm? The speedup was hundreds of times. Can someone point me at the relevant bits of pg code? A quick look seems to say that fe-*.c is interesting. I need to find where the actual read and write on the conn->sock is done. You'll want to look in backend/libpq and interfaces/libpq I think (although I'm not a developer). 
-- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Richard Huxton wrote: I was about to say that you can tune pg_autovacuum, but I just checked your original post and you're running 8.0.x - not sure about that one. The system catalog pg_autovacuum which allows finetuning autovacuum at table level was introduced in 8.1 :-( You'll have to check the documentation for that version to see if you can either: 1. exclude that table from pg_autovacuum 2. increase pg_autovacuum's sensitivity (1) seems to be impossible (correct me if I'm wrong..), so maybe I'll go for (2) ... If not, and this table is the most active, it might be simpler just to run your own vacuum-ing from a cron job. Well, it is one of the most active, but there are others. pg_autovacuum seems to do a very good job, apart from this one table... -- Bastian Voigt Neumünstersche Straße 4 20251 Hamburg telefon +49 - 40 - 67957171 mobil +49 - 179 - 4826359 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Bastian Voigt wrote: Hi *, for caching large autogenerated XML files, I have created a bytea table in my database so that the cached files can be used by multiple servers. There are about 500 rows and 10-20 Updates per minute on the table. The files stored in the bytea are anything from 10kB to 10MB. My PostgreSQL version is 8.0.13 on Gentoo Linux (x86) with PostGIS 1.2.0. For vacuum I use the pg_autovacuum daemon. It decided to vacuum my cache table about every 3 hours, the vacuum process takes 20-30 minutes (oops!) every time. Try vacuuming every 3 minutes and see what happens. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Bastian Voigt wrote: Richard Huxton wrote: I was about to say that you can tune pg_autovacuum, but I just checked your original post and you're running 8.0.x - not sure about that one. The system catalog pg_autovacuum which allows finetuning autovacuum at table level was introduced in 8.1 :-( Hmm - thought it might have been :-( You'll have to check the documentation for that version to see if you can either: 1. exclude that table from pg_autovacuum 2. increase pg_autovacuum's sensitivity (1) seems to be impossible (correct me if I'm wrong..), so maybe I'll go for (2) ... No, the per-table stuff was via the system table. If not, and this table is the most active, it might be simpler just to run your own vacuum-ing from a cron job. Well, it is one of the most active, but there are others. pg_autovacuum seems to do a very good job, apart from this one table... Do you have any settings in your postgresql.conf? Failing that, you would have to poke around the source. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
No, this did not help. The vacuum process is still running far too long and makes everything slow. It is even worse than before, because now the system is slow almost all the time, while when vacuuming only every 3 hours it was only slow once every three hours.

I now did the following. Well, no comment.

Shellscript A:

while true
do
  psql -U $user -d $database -c "vacuum analyze verbose binary_cache"
  echo "Going to sleep"
  sleep 60
done

Shellscript B:

while true
do
  ps aux > $tempfile
  numwaiting=`grep UPDATE.waiting $tempfile | grep -c -v grep`
  echo "Number of waiting updates: $numwaiting"
  vacuumpid=`grep VACUUM $tempfile | grep -v grep | awk '{print $2}'`
  echo "PID of vacuum process: $vacuumpid"
  if [ $numwaiting -gt 5 ]
  then
    echo "Too many waiting transactions, killing vacuum process $vacuumpid..."
    kill $vacuumpid
  fi
  echo "Sleeping 30 Seconds"
  sleep 30
done

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40 - 67957171
mobil +49 - 179 - 4826359

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] general PG network slowness (possible cure) (repost)
On Fri, May 25, 2007 at 10:50:58AM +0200, Peter T. Breuer wrote: > I set up pg to replace a plain gdbm database for my application. Postgres and gdbm are completely different. You want to rethink your queries so each does more work, instead of running a zillion of them over the network. /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
You should first cluster the table on its primary key. The table is probably already bloated from the 3 hr delay it had before.

First:

CLUSTER "primary key index name" ON binary_cache;

Then vacuum it every 3 minutes.

NB! Clustering takes an ACCESS EXCLUSIVE lock on the table.

Kristo

On 25.05.2007, at 15:30, Bastian Voigt wrote:
> No, this did not help. The vacuum process is still running far too long and makes everything slow. [...]

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Kenneth Marshall:" > > Surprise, ... I got a speed up of hundreds of times. The same application > > that crawled under my original rgdbm implementation and under PG now > > maxed out the network bandwidth at close to a full 10Mb/s and 1200 > > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz > > server. > > > > So > > > > * Is that what is holding up postgres over the net too? Lots of tiny > > packets? > > > This effect is very common, but you are in effect altering the query/ I imagined so, but no, I am not changing the behaviour - I believe you are imagining something different here. Let me explain. It is usually the case that drivers and the network layer conspire to emit packets when they are otherwise idle, since they have nothing better to do. That is, if the transmission unit is the normal 1500B and there is 200B in the transmission buffer and nothing else is frisking them about the chops, something along the line will shrug and say, OK, I'll just send out a 200B fragment now, apologize, and send out another fragment later if anything else comes along for me to chunter out. It is also the case that drivers do the opposite .. that is, they do NOT send out packets when the transmission buffer is full, even if they have 1500B worth. Why? Well, on Ge for sure, and on 100BT most of the time, it doesn't pay to send out individual packets because the space required between packets is relatively too great to permit the network to work at that speed given the speed of light as it is, and the spacing it implies between packets (I remember when I advised the networking protocol people that Ge was a coming thing about 6 years ago, they all protested and said it was _physically_ impossible. It is. If you send packets one by one!). An ethernet line is fundamentally only electrical and only signals up or down (relative) and needs time to quiesce. And then there's the busmastering .. a PCI bus is only about 33MHz, and 32 bits wide (well, or 16 on portables, or even 64, but you're getting into heavy server equipment then). That's 128MB/s in one direction, and any time one releases the bus there's a re-setup time that costs the earth and will easily lower bandwidth by 75%. So drivers like to take the bus for a good few packets at a time. Even a single packet (1500B) will take 400 multi-step bus cycles to get to the card, and then it's a question of how much onboard memory it has or whether one has to drive it synchronously. Most cards have something like a 32-unit ring buffer, and I think each unit is considerable. Now, if a driver KNOWS what's coming then it can alter its behavior in order to mesh properly with the higher level layers. What I did was _tell_ the driver and the protocol not to send any data until I well and truly tell it to, and then told it to, when I was ready. The result is that a full communication unit (start, header, following data, and stop codon) was sent in one blast. That meant that there were NO tiny fragments blocking up the net, being sent wily-nily. And it also meant that the driver was NOT waiting for more info to come in before getting bored and sending out what it had. It did as I told it to. The evidence from monitoring the PG network thruput is that 75% of its packets are in the 64-128B range, including tcp header. That's hitting the 100Kb/s (10KB/s) bandwidth regime on my network at the lower end. It will be even _worse_ on a faster net, I think (feel free to send me a faster net to compare with :). 
I also graphed latency, but I haven't taken into account the results as the bandwidth measurements were so striking. > response behavior of the database. Most applications expect an answer > from the database after every query. Well of course. Nothing else would work! (I imagine you have some kind of async scheme, but I haven't investigated). I ask, the db replies. I ask, the db replies. What I did was 1) made the ASK go out as one lump. 2) made the REPLY go out as one lump 3) STOPPED the card waiting for several replies or asks to accumulate before sending out anything at all. > If it could manage retrying failed > queries later, you could use the typical sliding window/delayed ack > that is so useful in improving the bandwidth utilization of many network That is not what is going on (though that's not a bad idea). See above for the explanation. One has to take into account the physical hardware involved and its limitations, and arrange the communications accordingly. All I did was send EACH query and EACH response as a single unit, at the hardware level. One could do better still by managing _several_ threads communications at once. > programs. Maybe an option in libpq to tell it to use delayed "acks". I > do not know what would be involved. Nothing spectacular is required to see a considerable improvement, I think,. apart from a little direction from the high level protocol down to the driver about where
Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
Bastian Voigt wrote: No, this did not help. The vacuum process is still running far too long and makes everything slow. It is even worse than before, cause now the system is slow almost all the time while when vacuuming only every 3 hours it is only slow once every three hours. Could you check the output of vacuum verbose on that table and see how much work it's doing? I'd have thought the actual bytea data would be TOASTed away to a separate table for storage, leaving the vacuum with very little work to do. It might well be your actual problem is your disk I/O is constantly saturated and the vacuum just pushes it over the edge. In which case you'll either need more/better disks or to find a quiet time once a day to vacuum and just do so then. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Richard Huxton:" [Charset ISO-8859-1 unsupported, filtering to ASCII...] > Peter T. Breuer wrote: > > I set up pg to replace a plain gdbm database for my application. But > > even running to the same machine, via a unix socket > > > >* the pg database ran 100 times slower > > For what operations? Bulk reads? 19-way joins? The only operations being done are simple "find the row with this key", or "update the row with this key". That's all. The queries are not an issue (though why the PG thread choose to max out cpu when it gets the chance to do so through a unix socket, I don't know). > > Across the net it was > > > > * about 500 to 1000 times slower than local gdbm > > > > with no cpu use to speak of. > > Disk-intensive or memory intensive? There is no disk as such... it's running on a ramdisk at the server end. But assuming you mean i/o, i/o was completely stalled. Everything was idle, all waiting on the net. > > On a whim I mapped the network bandwidth per packet size with the NPtcp > > suite, and got surprising answers .. at 1500B, naturally, the bandwidth > > was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little > > local net. At 100B the bandwidth available was only 25Kb/s. At 10B, > > you might as well use tin cans and taut string instead. > > This sounds like you're testing a single connection. You would expect > "dead time" to dominate in that scenario. What happens when you have 50 Indeed, it is single, because that's my application. I don't have 50 simultaneous connections. The use of the database is as a permanent storage area for the results of previous analyses (static analysis of the linux kernel codes) from a single client. Multiple threads accessing at the same time might help keep the network drivers busier, which would help. They would always see their buffers filling at an even rate and be able to send out groups of packets at once. > simultaneous connections? Or do you think it's just packet overhead? It's not quite overhead in the sense of the logical layer. It's a physical layer thing. I replied in another mail on this thread, but in summary, tcp behaves badly with small packets on ethernet, even on a dedicated line (as this was). One needs to keep it on a tight rein. > > I also mapped the network flows using ntop, and yes, the average packet > > size for both gdbm and pg in one direction was only about 100B or > > so. That's it! Clearly there are a lot of short queries going out and > > the answers were none too big either ( I had a LIMIT 1 in all my PG > > queries). > > I'm not sure that 100B query-results are usually the bottleneck. > Why would you have LIMIT 1 on all your queries? Because there is always only one answer to the query, according to the logic. So I can always tell the database manager to stop looking after one, which will always help it. > > About 75% of traffic was in the 64-128B range while my application was > > running, with the peak bandwidth in that range being about 75-125Kb/s > > (and I do mean bits, not bytes). > > None of this sounds like typical database traffic to me. Yes, there are > lots of small result-sets, but there are also typically larger (several > kilobytes) to much larger (10s-100s KB). There's none here. > > S ... I took a look at my implementation of remote gdbm, and did > > a very little work to aggregate outgoing transmissions together into > > lumps. Three lines added in two places. 
At the level of the protocol > > where I could tell how long the immediate conversation segment would be, > > I "corked" the tcp socket before starting the segment and "uncorked" it > > after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in > > linux). > > I'm a bit puzzled, because I'd have thought the standard Nagle algorithm > would manage this gracefully enough for short-query cases. There's no On the contrary, Nagle is also often wrong here because it will delay sending in order to accumulate more data into buffers when only a little has arrived, then give up when no more data arrives to be sent out, then send out the (short) packet anyway, late. There's no other traffic apart from my (single thread) application. What we want is to direct the sending exactly,n this situation saying when to not send, and when to send. Disable Nagle for a start, use async read (noblock), and sync write, with sends from the socket blocked from initiation of a message until the whole message is ready to be sent out. Sending the message piecemeal just hurts too. > way (that I know of) for a backend to handle more than one query at a time. That's not the scenario. > > Surprise, ... I got a speed up of hundreds of times. The same application > > that crawled under my original rgdbm implementation and under PG now > > maxed out the network bandwidth at close to a full 10Mb/s and 1200 > > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz > > server. > >
Re: [PERFORM] general PG network slowness (possible cure) (repost)
Peter T. Breuer wrote: The only operations being done are simple "find the row with this key", or "update the row with this key". That's all. The queries are not an issue (though why the PG thread choose to max out cpu when it gets the chance to do so through a unix socket, I don't know). There is no disk as such... it's running on a ramdisk at the server end. But assuming you mean i/o, i/o was completely stalled. Everything was idle, all waiting on the net. Indeed, it is single, because that's my application. I don't have 50 simultaneous connections. The use of the database is as a permanent storage area for the results of previous analyses (static analysis of the linux kernel codes) from a single client. I'm not sure your setup is typical, interesting though the figures are. Google a bit for pg_bench perhaps and see if you can reproduce the effect with a more typical load. I'd be interested in being proved wrong. But the load is typical HERE. The application works well against gdbm and I was hoping to see speedup from using a _real_ full-fledged DB instead. I'm not sure you really want a full RDBMS. If you only have a single connection and are making basic key-lookup queries then 90% of PostgreSQL's code is just getting in your way. Sounds to me like gdbm (or one of its alternatives) is a good match for you. Failing that, sqlite is probably the next lowest-overhead solution. Of course, if you want to have multiple clients interacting and performing complex 19-way joins on gigabyte-sized tables with full-text indexing and full transaction control then you *do* want a RDBMS. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
Bastian Voigt wrote:
> No, this did not help. The vacuum process is still running far too long
> and makes everything slow. It is even worse than before, cause now the
> system is slow almost all the time while when vacuuming only every 3
> hours it is only slow once every three hours.
>
> I now did the following. Well, no comment.

Killing the vacuum mid-process doesn't help you, because the table will be in a sorrier state than it was when it started. I think it would be better if you:

1. Revert the pg_autovacuum changes so that it processes every 3 hours or whatever, like you had at the start of this thread. Or maybe less. That one will take care of the _other_ tables.

2. Vacuum the bytea table manually more often, say every 10 minutes or so (vacuum, sleep 10m, goto start). Make sure this is done with an appropriate vacuum_cost_delay setting (and related settings).

3. Raise max_fsm_pages so that a lot of pages with free space can be recorded for that table.

The point here is that vacuuming the bytea table can take a long time due to vacuum_cost_delay, but it won't affect the rest of the system; regular operation will continue to run at (almost) normal speed. Having a big number of free pages ensures that the free space in the table is not "lost".

Also, you may want to reindex that table once, because with so many killed vacuums you have probably screwed up the indexes big time (maybe cluster it once instead of reindexing, because that will compact the heap as well as the indexes).

Another recommendation is to upgrade to 8.2.4, which is faster and has a better autovacuum.

--
Alvaro Herrera                      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
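To make that advice concrete, here is a minimal sketch of the suggested maintenance with example values only (vacuum_cost_delay can be set per session on 8.0; max_fsm_pages has to be raised in postgresql.conf and needs a restart; CLUSTER takes an ACCESS EXCLUSIVE lock, so do it once in a quiet moment):

    -- one-off: compact the heap and rebuild the indexes (exclusive lock!)
    CLUSTER binary_cache_pkey ON binary_cache;

    -- repeated every ~10 minutes, e.g. from cron or a loop like Shellscript A above
    SET vacuum_cost_delay = 20;    -- milliseconds; example value, tune for the hardware
    VACUUM ANALYZE binary_cache;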
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Richard Huxton:" > I'm not sure you really want a full RDBMS. If you only have a single > connection and are making basic key-lookup queries then 90% of > PostgreSQL's code is just getting in your way. Sounds to me like gdbm Yep - I could happily tell it not to try and compile a special lookup scheme each time, for example! (how that?). I could presumably also help it by preloading the commands I will run and sending over the params only with a "do a no. 17 now!". > (or one of its alternatives) is a good match for you. Failing that, > sqlite is probably the next lowest-overhead solution. Not a bad idea. but PG _will_ be useful when folk come to analyse the result of the analyses being done. What is slow is getting the data into the database now via simple store, fetch and update. > Of course, if you want to have multiple clients interacting and > performing complex 19-way joins on gigabyte-sized tables with full-text Well, the dbs are in the tens of MB from a single run over a single file (i.e analysis of a single 30KLOC source). The complete analysis space is something like 4000 times that, for 4300 C files in the linux kernel source. And then there is all the linux kernel versions. Then there is godzilla and apache source .. > indexing and full transaction control then you *do* want a RDBMS. We want one anyway. The problem is filling the data and the simple fetch and update queries on it. I really think it would be worthwhile getting some developer to tell me where the network send is done in PG. Peter ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
Bastian Voigt <[EMAIL PROTECTED]> writes: > Now my big big problem is that the database gets really really slow > during these 20 minutes and after the vacuum process is running for a > short time, many transactions show state "UPDATE waiting" in the process > list. In my Java application server I sometimes get tons of deadlock > Exceptions (waiting on ShareLock blahblah). The web frontend gets nearly > unusable, logging in takes more than 60 seconds, etc. etc. Hmm. That's a bit weird --- what are they waiting on exactly? Look in pg_locks to see what the situation is. A vacuum per se ought not be blocking any updates. Aside from the recommendation to make the vacuums happen more frequently instead of less so, you should experiment with vacuum_cost_delay and related parameters. The idea is to reduce vacuum's I/O load so that it doesn't hurt foreground response time. This means any individual vacuum will take longer, but you won't need to care. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
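For anyone wanting to follow that up, a rough sketch of both suggestions (the columns shown are from the 8.0-era pg_locks view; the delay value is only an example):

    -- who is waiting on what?
    SELECT relation::regclass, mode, granted, pid
    FROM pg_locks
    ORDER BY granted, relation;

    -- throttle vacuum's I/O for this session, then watch what it actually does
    SET vacuum_cost_delay = 20;   -- milliseconds, example value
    VACUUM VERBOSE binary_cache;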
Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
Richard Huxton wrote: Could you check the output of vacuum verbose on that table and see how much work it's doing? I'd have thought the actual bytea data would be TOASTed away to a separate table for storage, leaving the vacuum with very little work to do. I'm quite new to postgres (actually I just ported our running application from MySQL...), so I don't know what toast means. But I noticed that vacuum also tried to cleanup some "toast" relations or so. This was what took so long. It might well be your actual problem is your disk I/O is constantly saturated and the vacuum just pushes it over the edge. In which case you'll either need more/better disks or to find a quiet time once a day to vacuum and just do so then. Yes, that was definitely the case. But now everything runs smoothly again, so I don't think I need to buy new disks. Regards Bastian -- Bastian Voigt Neumünstersche Straße 4 20251 Hamburg telefon +49 - 40 - 67957171 mobil +49 - 179 - 4826359 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] general PG network slowness (possible cure) (repost)
Peter T. Breuer wrote:
> I really think it would be worthwhile getting some developer to tell me
> where the network send is done in PG.

See src/backend/libpq/pqcomm.c (particularly internal_flush()).

--
Alvaro Herrera                      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
Kristo Kaiv wrote: you should first cluster the table on primary key. The table is probably already bloated from the 3 hr delay it had before. First CLUSTER "primary key index name" ON group_fin_account_tst; Then vacuum it every 3 minutes. NB! clustering takes an access exclusive lock on table Kristo, thanks a bunch!! This was the solution... The cluster operation took about 60sec, and after it was done the vacuum finished in only 10sec. or so, with no noticeable performance bottleneck. Now vacuum is running every 2-3 minutes and makes no problems. Hhhh, now I can look forward to a laid-back weekend.. Richard, Kristo, Alvaro, thanks 1000 times for responding so quickly :-) -- Bastian Voigt Neumünstersche Straße 4 20251 Hamburg telefon +49 - 40 - 67957171 mobil +49 - 179 - 4826359 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Peter T. Breuer" <[EMAIL PROTECTED]> writes: > S ... I took a look at my implementation of remote gdbm, and did > a very little work to aggregate outgoing transmissions together into > lumps. We do that already --- for a simple query/response such as you are describing, each query cycle will involve one physical client->server message followed by one physical server->client message. The only way to aggregate more is for the application code to merge queries together. Migrating a dbm-style application to a SQL database is often a real pain, precisely because the application is designed to a mindset of "fetch one record, manipulate it, update it", where "fetch" and "update" are assumed to be too stupid to do any of the work for you. The way to get high performance with a SQL engine is to push as much of the work as you can to the database side, and let the engine process multiple records per query; and that can easily mean rewriting the app from the ground up :-( regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] general PG network slowness (possible cure) (repost)
Peter T. Breuer wrote: "Also sprach Richard Huxton:" I'm not sure you really want a full RDBMS. If you only have a single connection and are making basic key-lookup queries then 90% of PostgreSQL's code is just getting in your way. Sounds to me like gdbm Yep - I could happily tell it not to try and compile a special lookup scheme each time, for example! (how that?). I could presumably also help it by preloading the commands I will run and sending over the params only with a "do a no. 17 now!". PREPARE/EXECUTE (or the equivalent libpq functions). Also - if you can have multiple connections to the DB you should be able to have several queries running at once. (or one of its alternatives) is a good match for you. Failing that, sqlite is probably the next lowest-overhead solution. Not a bad idea. but PG _will_ be useful when folk come to analyse the result of the analyses being done. What is slow is getting the data into the database now via simple store, fetch and update. I'd have an hourly/daily bulk-load running from the simple system into PG. If you have to search all the data from your app that's not practical of course. Of course, if you want to have multiple clients interacting and performing complex 19-way joins on gigabyte-sized tables with full-text Well, the dbs are in the tens of MB from a single run over a single file (i.e analysis of a single 30KLOC source). The complete analysis space is something like 4000 times that, for 4300 C files in the linux kernel source. And then there is all the linux kernel versions. Then there is godzilla and apache source .. If you're doing some sort of token analysis on source-code you probably want to look into how tsearch2 / trigram / Gist+GIN indexes work. It might be that you're doing work in your app that the DB can handle for you. indexing and full transaction control then you *do* want a RDBMS. We want one anyway. The problem is filling the data and the simple fetch and update queries on it. OK I really think it would be worthwhile getting some developer to tell me where the network send is done in PG. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend
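For completeness, the PREPARE/EXECUTE idea at the SQL level looks like the sketch below (the same thing is available from libpq via PQprepare/PQexecPrepared); the table and column names are placeholders:

    -- plan the two statements once per connection
    PREPARE get_row (text) AS
        SELECT data FROM results WHERE key = $1 LIMIT 1;
    PREPARE set_row (text, text) AS
        UPDATE results SET data = $2 WHERE key = $1;

    -- after that, each round trip only ships the parameters
    EXECUTE get_row('some_key');
    EXECUTE set_row('some_key', 'new_value');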
Re: [PERFORM] Big problem with sql update operation
Michal Szymanski <[EMAIL PROTECTED]> writes:
> CREATE OR REPLACE FUNCTION test()
> RETURNS void AS
> $BODY$
> DECLARE
> BEGIN
>     FOR v_i IN 1..4000 LOOP
>         UPDATE group_fin_account_tst SET
>             credit = v_i
>         WHERE group_fin_account_tst_id = 1;  -- for real procedure I update different rows

Does updating the *same* record 4000 times per transaction reflect the real behavior of your application? If not, this is not a good benchmark. If so, consider redesigning your app to avoid so many redundant updates.

(For the record, the reason you see nonlinear degradation is the accumulation of tentatively-dead versions of the row, each of which has to be rechecked by each later update.)

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
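To illustrate the point about redundant updates against the test table from the original post: if the real workload really does touch different rows, folding the loop into one statement leaves a single new row version per row instead of thousands of dead ones (the expressions are examples only):

    -- 4000 updates of the same row leave ~4000 dead row versions behind;
    -- one statement with the final value leaves exactly one
    UPDATE group_fin_account_tst
    SET credit = 4000
    WHERE group_fin_account_tst_id = 1;

    -- when different rows need different values, one set-oriented UPDATE
    -- still beats 4000 single-row UPDATEs issued from a loop
    UPDATE group_fin_account_tst
    SET credit = group_fin_account_tst_id   -- example expression
    WHERE group_fin_account_tst_id BETWEEN 1 AND 4000;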
Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?
Arnau <[EMAIL PROTECTED]> writes:
> I have an application that works with multiple customers. Thinking in
> scalability we are thinking in applying the following approaches:
> - Create a separate database instance for each customer.
> - We think that customer's DB will be quite small, about 200MB as average.
> - The number of clients, then DDBB, can be significant (thousands).
> - Have as many customers as possible on the same server, so a single server could have more than 300 DDBB instances.

This is probably a bad idea, unless each customer's performance demands are so low that you can afford to use very small shared-memory settings for each instance. But even small settings will probably eat ~10MB per instance --- can you afford to build these machines with multiple GB of RAM?

Can you instead run things with one postmaster per machine and one database per customer within that instance? From a performance perspective this is likely to work much better.

If you desire to give the customers database-superuser capability then this probably won't do, but if they are restricted users it might be OK.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend
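A minimal sketch of the one-cluster, database-per-customer layout Tom describes (the role, database name and password are invented; CREATE ROLE appeared in 8.1 and the per-database CONNECT privilege in 8.2, older releases would use CREATE USER and pg_hba.conf instead):

    -- one restricted login role and one database per customer, all in one cluster
    CREATE ROLE customer_0001 LOGIN PASSWORD 'changeme' NOSUPERUSER NOCREATEDB NOCREATEROLE;
    CREATE DATABASE customer_0001 OWNER customer_0001;

    -- keep customers out of each other's databases
    REVOKE CONNECT ON DATABASE customer_0001 FROM PUBLIC;
    GRANT CONNECT ON DATABASE customer_0001 TO customer_0001;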
Re: [PERFORM] LIKE search and performance
On Fri, May 25, 2007 at 09:13:25AM +0100, Richard Huxton wrote:
> [EMAIL PROTECTED] wrote:
> >> And since it's basically impossible to know the selectivity of this kind
> >> of where condition, I doubt the planner would ever realistically want to
> >> choose that plan anyway because of its poor worst-case behavior.
> > What is a real life example where an intelligent and researched
> > database application would issue a like or ilike query as their
> > primary condition in a situation where they expected very high
> > selectivity?
> > Avoiding a poor worst-case behaviour for a worst-case behaviour that
> > won't happen doesn't seem practical.
> But if you are also filtering on e.g. date, and that has an index with
> good selectivity, you're never going to use the text index anyway are
> you? If you've only got a dozen rows to check against, might as well
> just read them in.
> The only time it's worth considering the behaviour at all is *if* the
> worst-case is possible.

I notice you did not provide a real life example as requested. :-)

This seems like an ivory tower restriction. Not allowing best performance in a common situation vs not allowing worst performance in a not-so-common situation.

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]     Neighbourhood Coder
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them...

http://mark.mielke.cc/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Alvaro Herrera:" > > I really think it would be worthwhile getting some developer to tell me > > where the network send is done in PG. > > See src/backend/libpq/pqcomm.c (particularly internal_flush()). Yes. Thanks. That looks like it. It calls secure_write continually until the buffer is empty. Secure_write is located ibe-secure.c, but I'm not using ssl, so the call reduces to just n = send(port->sock, ptr, len, 0); And definitely all those could be grouped if there are several to do. But under normal circumstances the send will be pushing against a lttle resistance (the copy to the driver/protocol stack buffer is faster than the physical network send, by a ratio of GB/s to MB/s, or 1000 to 1), and thus all these sends will probably complete as a single unit once they have been started. It's worth a try. I thought first this may be too low level, but it looks as though internal_flush is only triggered when some other buffer is full, or deliberately, so it may be useful to block until it fires. I'll try it. Peter ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Peter T. Breuer" <[EMAIL PROTECTED]> writes: > And definitely all those could be grouped if there are several to do. Except that in the situation you're describing, there's only a hundred or two bytes of response to each query, which means that only one send() will occur anyway. (The flush call comes only when we are done responding to the current client query.) It's possible that for bulk data transmission situations we could optimize things a bit better --- in particular I've wondered whether we can reliably find out the MTU of the connection and use that as the output buffer size, instead of trusting the kernel to choose the best message boundaries --- but for the situation you're worried about there will be only one send. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: My quick and dirty "solution" (Re: [PERFORM] Performance P roblem with Vacuum of bytea table (PG 8.0.13))
TOASTed means storage outside of the main table. But AFAIK, only rows bigger than about 2K are considered for toasting.

Andreas

-- Original message --
Subject: Re: My quick and dirty "solution" (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
From: Bastian Voigt <[EMAIL PROTECTED]>
Date: 25.05.2007 14:13
> [...]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
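For anyone curious, the TOAST table behind a bytea column can be located like this on 8.0 (the more convenient pg_total_relation_size() function only appeared in later releases):

    -- find the hidden TOAST table that holds the large bytea values
    SELECT c.relname  AS main_table,
           t.relname  AS toast_table,
           t.relpages AS toast_pages   -- 8 kB pages, as of the last VACUUM/ANALYZE
    FROM pg_class c
    JOIN pg_class t ON t.oid = c.reltoastrelid
    WHERE c.relname = 'binary_cache';

    -- VACUUM VERBOSE on the main table also reports what it did to its TOAST table
    VACUUM VERBOSE binary_cache;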
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Tom Lane:" > "Peter T. Breuer" <[EMAIL PROTECTED]> writes: > > And definitely all those could be grouped if there are several to do. > > Except that in the situation you're describing, there's only a hundred > or two bytes of response to each query, which means that only one send() > will occur anyway. (The flush call comes only when we are done > responding to the current client query.) It may still be useful. The kernel won't necessarily send data as you push it down to the network protocols and driver. The driver may decide to wait for more data to accumulate, particularly if it only has a couple of hundred bytes to send so far and the medium is high speed and medium latency (fast ethernet). It'll get fed up with waiting for more data eventually, and send it out, but it is essentially waiting on _itself_ in that case, since the outgoing data is required at the other side of the net as a response to be processed before another query can be sent out, only then prompting the postmaster to start stuffing the output buffer with more bytes. Waiting on oneself is bad for us procrastinators. We need some whips. I'll try and really force a send, and try some more tricks. Unfortunately this isn't really quite the right level, so I have to use some heuristics. Can you guarantee that internal_flush is not called until (a) the internal buffer is full, OR (b) we have finished composing a reply, AND (c) there is no other way to send out data? I also need to find where we begin to compose a reply. That's somewhere well before internal flush ever gets called. I want to block output at that point. As it is, I can either unblock just before internal_flush and block after, or block just before internal_flush and unblock after (:-) that's not quite as daft as it sounds, but needs care). Really I want to do query received *block output process query create response *unblock output send Instead, I have here to do query received process query create response *unblock output send *block output Which is not quite the same. It may work though, because the driver will know nothing is going to go out while it is listening for the next query, and it will not have sent anything prematurely or kept it back inopportunely. > It's possible that for bulk data transmission situations we could > optimize things a bit better --- in particular I've wondered whether we > can reliably find out the MTU of the connection and use that as the > output buffer size, instead of trusting the kernel to choose the best > message boundaries --- but for the situation you're worried about Don't bother, I think. MTU is often effectively only notional these days at the hardware level in many media. OTOH, on my little net, MTU really does mean something because it's 10BT. > there will be only one send. True. Peter ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] LIKE search and performance
[EMAIL PROTECTED] wrote:
> On Fri, May 25, 2007 at 09:13:25AM +0100, Richard Huxton wrote:
> > [EMAIL PROTECTED] wrote:
> > > > And since it's basically impossible to know the selectivity of this kind of where condition, I doubt the planner would ever realistically want to choose that plan anyway because of its poor worst-case behavior.
> > > What is a real life example where an intelligent and researched database application would issue a like or ilike query as their primary condition in a situation where they expected very high selectivity?
> > > Avoiding a poor worst-case behaviour for a worst-case behaviour that won't happen doesn't seem practical.
> > But if you are also filtering on e.g. date, and that has an index with good selectivity, you're never going to use the text index anyway are you? If you've only got a dozen rows to check against, might as well just read them in.
> > The only time it's worth considering the behaviour at all is *if* the worst-case is possible.
> I notice you did not provide a real life example as requested. :-)

OK - any application that allows user-built queries: foo> Want another? Any application that has a "search by name" box - users can (and do) put one letter in and hit enter. Unfortunately you don't always have control over the selectivity of queries issued.

> This seems like an ivory tower restriction. Not allowing best performance in a common situation vs not allowing worst performance in a not-so-common situation.

What best performance plan are you thinking of? I'm assuming we're talking about trailing-wildcard matches here, rather than "contains" style matches.

--
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
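To make the trailing-wildcard case concrete: a b-tree index can serve LIKE 'abc%' provided the right operator class (or the C locale) is used. A small sketch with invented names:

    -- in a non-C locale a plain b-tree index cannot be used for LIKE prefixes;
    -- text_pattern_ops makes the prefix range scan possible
    CREATE INDEX items_name_prefix_idx ON items (name text_pattern_ops);

    -- can use the index (trailing wildcard only)
    SELECT name FROM items WHERE name LIKE 'spi%' LIMIT 20;

    -- cannot use a b-tree at all: leading wildcard
    -- SELECT name FROM items WHERE name LIKE '%man%';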
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Peter T. Breuer" <[EMAIL PROTECTED]> writes: > "Also sprach Tom Lane:" >> Except that in the situation you're describing, there's only a hundred >> or two bytes of response to each query, which means that only one send() >> will occur anyway. (The flush call comes only when we are done >> responding to the current client query.) > It may still be useful. The kernel won't necessarily send data as you > push it down to the network protocols and driver. The driver may decide > to wait for more data to accumulate, No, because we set TCP_NODELAY. Once we've flushed a message to the kernel, we don't want the kernel sitting on it --- any delay there adds directly to the elapsed query time. At least this is the case for the final response to a query. I'm not too clear on whether this means we need to be careful about intermediate message boundaries when there's a lot of data being sent. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] LIKE search and performance
OK - any application that allows user-built queries: foo> Want another? Any application that has a "search by name" box - users can (and do) put one letter in and hit enter. Unfortunately you don't always have control over the selectivity of queries issued.

-*- HOW TO MAKE A SEARCH FORM -*-

Imagine you have to code the search on IMDB. This is what a smart developer would do.

First, he uses AJAX autocompletion, so the thing is reactive. Then, he does not bother the user with a many-fields form. Instead of forcing the user to think (users HATE that), he writes smart code. Does Google Maps have separate fields for country, city, street, zipcode? No. Because Google is about as smart as it gets.

So, you parse the user query. If the user types, for instance, less than 3 letters (say, spi), he probably wants stuff that *begins* with those letters. There is no point in searching for the letter "a" in a database of a million movie titles. So, if the user types "spi", you issue "name LIKE 'spi%'", which is indexed and very fast. And since you're smart, you use AJAX. And you display only the most popular results (i.e. most clicked on). http://imdb.com/find?s=all&q=spi Since 99% of the time the user wanted "spiderman" or "spielberg", you're done and he's happy. Users like being happy.

If the user just types "a", you display the first 10 things that start with "a"; this is useless, but the user will marvel at your AJAX skillz. Then he will probably type in a few other letters. Then, if the user uses his space bar and types "spi 1980" you'll recognize a year and display spielberg's movies in 1980. Converting your strings to phonetics is also a good idea, since only about 0.7% of the l33t teenagers can spell stuff, especially spiElberg. Only the guy who wants to know who had sex with marilyn monroe on the 17th day of the shooting of Basic Instinct will need to use the Advanced search.

If you detect several words, then switch to a prefix-based fulltext search like Xapian, which utterly rocks. Example: the user types "savin priv", you search for "savin*" NEAR "priv*" and you display "saving private ryan" before he has even finished typing the second word of his query. Users love that, they feel understood, they will click on your ads and buy your products.

In all cases, search results should be limited to less than 100 to be easy on the database. The user doesn't care about a search returning more than 10-20 results, he will just rephrase the query, and the time taken to fetch those thousands of records with name LIKE '%a%' will have been utterly lost. Who goes to page 2 in Google results?

BOTTOM LINE: databases don't think, you do.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
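As a concrete sketch of the prefix-search approach above — the table, column and index names are invented for illustration, and the text_pattern_ops opclass is only needed if the database locale is not C:

    -- index that can serve LIKE 'prefix%' lookups even under a non-C locale
    CREATE INDEX movies_title_prefix_idx ON movies (title text_pattern_ops);

    -- roughly what the "spi" autocompletion request would run: indexed, cheap, capped
    SELECT title
      FROM movies
     WHERE title LIKE 'spi%'
     ORDER BY title
     LIMIT 10;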
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Tom Lane:" > > It may still be useful. The kernel won't necessarily send data as you > > push it down to the network protocols and driver. The driver may decide > > to wait for more data to accumulate, > > No, because we set TCP_NODELAY. Once we've flushed a message to the That just means "disable Nagle", which is indeed more or less the correct thing to do .. you don't want to sit around waiting for more data when we're sure there will be none, as you say. Yet you also don't want to send short data out prematurely, which disabling Nagle can cause. And disabling Nagle doesn't actually force data out immediately you want it to be sent ... it just disables extra waits imposed by the Nagle algorithm/protocol. It doesn't stop the driver from waiting around because it feels taking the bus might be a bit premature right now, for example. > kernel, we don't want the kernel sitting on it --- any delay there adds > directly to the elapsed query time. At least this is the case for the > final response to a query. I'm not too clear on whether this means we > need to be careful about intermediate message boundaries when there's a > lot of data being sent. It's unclear. But not my situation. If I clear TCP_CORK all data is sent at that point. If I set TCP_CORK data is held until I clear TCP_CORK, or 200ms have passed with no send. Peter ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] LIKE search and performance
On Fri, May 25, 2007 at 04:35:22PM +0100, Richard Huxton wrote: > >I notice you did not provide a real life example as requested. :-) > OK - any application that allows user-built queries: foo> > Want another? Any application that has a "search by name" box - users > can (and do) put one letter in and hit enter. > Unfortunately you don't always have control over the selectivity of > queries issued. The database has 10 million records. The user enters "bar" and it translates to "%bar%". You are suggesting that we expect bar to match 1 million+ records? :-) I hope not. I would define this as bad process. I would also use "LIMIT" to something like "100". > >This seems like an ivory tower restriction. Not allowing best performance > >in a common situation vs not allowing worst performance in a not-so-common > >situation. > What best performance plan are you thinking of? I'm assuming we're > talking about trailing-wildcard matches here, rather than "contains" > style matches. "Trailing-wildcard" already uses B-Tree index, does it not? I am speaking of contains, as contains is the one that was said to require a seqscan. I am questioning why it requires a seqscan. The claim was made that with MVCC, the index is insufficient to check for visibility and that the table would need to be accessed anyways, therefore a seqscan is required. I question whether a like '%bar%' should be considered a high selectivity query in the general case. I question whether a worst case should be assumed. Perhaps I question too much? :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13)
OK, I'll give that a try. What about pg_autovacuum then? Is it a problem when two processes try to vacuum the same table in parallel? Or do I need to deactivate autovacuum altogether? Try vacuuming every 3 minutes and see what happens. (Sorry Richard, forgot to reply to the list!) -- Bastian Voigt Neumünstersche Straße 4 20251 Hamburg telefon +49 - 40 - 67957171 mobil +49 - 179 - 4826359 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
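As a minimal way to try that suggestion — assuming the bytea cache table is the binary_cache table from earlier in the thread — run something like this from cron every few minutes, so each pass only has a little garbage to clean up:

    -- frequent, small vacuums keep dead row versions (and TOAST bloat) in check
    VACUUM ANALYZE binary_cache;

    -- optional sanity check: watch whether the table keeps growing despite the vacuums
    SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'binary_cache';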
Re: [PERFORM] LIKE search and performance
PFC wrote: OK - any application that allows user-built queries: foo> Want another? Any application that has a "search by name" box - users can (and do) put one letter in and hit enter. Unfortunately you don't always have control over the selectivity of queries issued. -*- HOW TO MAKE A SEARCH FORM -*- Imagine you have to code the search on IMDB. This is what a smart developer would do All good domain-specific tips to provide users with a satisfying search-experience. None of which address the question of what plan PG should produce for: SELECT * FROM bigtable WHERE foo LIKE 's%' -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?
Hi Tom,

Arnau <[EMAIL PROTECTED]> writes:
I have an application that works with multiple customers. Thinking about scalability, we are thinking of applying the following approaches:
- Create a separate database instance for each customer.
- We think that each customer's DB will be quite small, about 200MB on average.
- The number of clients, and therefore DDBB, can be significant (thousands).
- Have as many customers as possible on the same server, so a single server could have more than 300 DDBB instances.

This is probably a bad idea, unless each customer's performance demands are so low that you can afford to use very small shared-memory settings for each instance. But even small settings will probably eat ~10MB per instance --- can you afford to build these machines with multiple GB of RAM? Can you instead run things with one postmaster per machine and one database per customer within that instance? From a performance perspective this is likely to work much better.

What I meant is to have only one postmaster per server and a lot of databases running in it. Something like that:

template1=# \l
         List of databases
    Name    |  Owner   | Encoding
------------+----------+----------
 alertwdv2  | gguridi  | LATIN1
 postgres   | postgres | LATIN1
 template0  | postgres | LATIN1
 template1  | postgres | LATIN1
 voicexml   | root     | LATIN1
 wikidb     | root     | LATIN1
(6 rows)

Here I just have 6 databases; my doubt is how performance would be affected if, instead of having 6 databases, I had 300/600 databases running on the same postmaster, e.g.

template1=# \l
            List of databases
       Name        |  Owner   | Encoding
-------------------+----------+----------
 template0         | postgres | LATIN1
 template1         | postgres | LATIN1
 customers_group_1 | root     | LATIN1
(3 rows)

Instead of:

template1=# \l
          List of databases
     Name      |  Owner   | Encoding
---------------+----------+----------
 template0     | postgres | LATIN1
 template1     | postgres | LATIN1
 customers_1   | root     | LATIN1
 customers_2   | root     | LATIN1
 customers_3   | root     | LATIN1
 ...
 customers_500 | root     | LATIN1
(502 rows)

If you desire to give the customers database-superuser capability then this probably won't do, but if they are restricted users it might be OK.

The users won't have superuser access, they just execute plain queries.

Thank you very much
--
Arnau
---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] LIKE search and performance
[EMAIL PROTECTED] wrote: I am speaking of contains, as contains is the one that was said to require a seqscan. I am questioning why it requires a seqscan. The claim was made that with MVCC, the index is insufficient to check for visibility and that the table would need to be accessed anyways, therefore a seqscan is required. I question whether a like '%bar%' should be considered a high selectivity query in the general case. I question whether a worst case should be assumed. If you are doing %bar% you should be using pg_trgm or tsearch2. Perhaps I question too much? :-) Cheers, mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
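A rough sketch of that pg_trgm route for "contains"-style searches. It assumes the contrib/pg_trgm module has been installed in the database and uses an invented table; note that in the 8.x era the trigram index accelerates the similarity operator %, not LIKE itself:

    -- trigram index (requires contrib/pg_trgm)
    CREATE INDEX items_name_trgm_idx ON items USING gist (name gist_trgm_ops);

    -- fuzzy "contains-ish" search instead of name LIKE '%bar%'
    SELECT name, similarity(name, 'bar') AS sml
      FROM items
     WHERE name % 'bar'
     ORDER BY sml DESC
     LIMIT 20;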
Re: [PERFORM] LIKE search and performance
[EMAIL PROTECTED] wrote: On Fri, May 25, 2007 at 04:35:22PM +0100, Richard Huxton wrote: I notice you did not provide a real life example as requested. :-) OK - any application that allows user-built queries: foo> Want another? Any application that has a "search by name" box - users can (and do) put one letter in and hit enter. Unfortunately you don't always have control over the selectivity of queries issued. The database has 10 million records. The user enters "bar" and it translates to "%bar%". You are suggesting that we expect bar to match 1 million+ records? :-) I was saying that you don't know. At least, I don't know of any cheap way of gathering full substring stats or doing a full substring indexing. Even tsearch2 can't do that. I hope not. I would define this as bad process. I would also use "LIMIT" to something like "100". Yes, but that's not the query we're talking about is it? If possible you don't do '%bar%' searches at all. If you do, you try to restrict it further or LIMIT the results. There's nothing to discuss in these cases. This seems like an ivory tower restriction. Not allowing best performance in a common situation vs not allowing worst performance in a not-so-common situation. What best performance plan are you thinking of? I'm assuming we're talking about trailing-wildcard matches here, rather than "contains" style matches. "Trailing-wildcard" already uses B-Tree index, does it not? Yes, it searches the btree and then checks the data for visibility. I thought that was what you felt could be worked around. It appears I was wrong. I am speaking of contains, as contains is the one that was said to require a seqscan. I am questioning why it requires a seqscan. Well, you seemed to be suggesting you had something better in mind. At least, that was my reading of your original post. > The claim was made that with MVCC, the index is insufficient to check for visibility True, for PG's implementation of MVCC. You *could* have visibility in each index, but that obviously would take more space. For a table with many indexes, that could be a *lot* more space. You also have to update all that visibilty information too. > and that the table would need to be accessed anyways, therefore a seqscan is required. I question whether a like '%bar%' should be considered a high selectivity query in the general case. I question whether a worst case should be assumed. Well, the general rule-of-thumb is only about 10% for the changeover between index & seq-scan. That is, once you are reading 10% of the rows on disk (to check visibility) you might as well read them all (since you'll be reading most of the blocks anyway if the rows are randomly distributed). If you are doing SELECT * from that table then you'll want all that data you read. If you are doing SELECT count(*) then you only wanted the visibility :-( Now you and I can look at a substring and probably make a good guess how common it is (assuming we know the targets are British surnames or Japanese towns). PG needs one number - or rather, it picks one number for each length of search-string (afaik). Perhaps I question too much? :-) Not sure it's possible to question too much :-) However, you need to provide answers occasionally too - what numbers would you pick? :-) -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] LIKE search and performance
PFC wrote: None of which address the question of what plan PG should produce for: SELECT * FROM bigtable WHERE foo LIKE 's%' Ah, this one already uses the btree since the '%' is at the end. My point is that a search like this will yield too many results to be useful to the user anyway, so optimizing its performance is a kind of red herring. At the *application level* yes. At the *query planner* level no. At the query planner level I just want it to come up with the best plan it can. The original argument was that PG's estimate of the number of matching rows was too optimistic (or pessimistic) in the case where we are doing a contains substring-search. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] LIKE search and performance
None of which address the question of what plan PG should produce for: SELECT * FROM bigtable WHERE foo LIKE 's%' Ah, this one already uses the btree since the '%' is at the end. My point is that a search like this will yield too many results to be useful to the user anyway, so optimizing its performance is a kind of red herring. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] LIKE search and performance
"Richard Huxton" <[EMAIL PROTECTED]> writes: > Now you and I can look at a substring and probably make a good guess how > common > it is (assuming we know the targets are British surnames or Japanese towns). > PG > needs one number - or rather, it picks one number for each length of > search-string (afaik). I don't think that's true. Postgres calculates the lower and upper bound implied by the search pattern and then uses the histogram to estimate how selective that range is. It's sometimes surprisingly good but obviously it's not perfect. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?
Arnau <[EMAIL PROTECTED]> writes: >> Can you instead run things with one postmaster per machine and one >> database per customer within that instance? From a performance >> perspective this is likely to work much better. >What I meant is just have only one postmaster per server and a lot of > databases running in it. OK, we are on the same page then. Should work fine. I think I've heard of people running installations with thousands of DBs in them. You'll want to test it a bit of course ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] LIKE search and performance
Gregory Stark wrote: "Richard Huxton" <[EMAIL PROTECTED]> writes: Now you and I can look at a substring and probably make a good guess how common it is (assuming we know the targets are British surnames or Japanese towns). PG needs one number - or rather, it picks one number for each length of search-string (afaik). I don't think that's true. Postgres calculates the lower and upper bound implied by the search pattern and then uses the histogram to estimate how selective that range is. It's sometimes surprisingly good but obviously it's not perfect. Sorry - I'm obviously picking my words badly today. I meant for the "contains" substring match. It gives different (goes away and checks...yes) predictions based on string length. So it guesses that a shorter pattern like LIKE '%aaa%' will match more rows than a longer one such as LIKE '%aaaa%'. Of course, if we were matching surnames you and I could say that this is very unlikely, but without some big statistics table I guess there's not much more PG can do. For a trailing wildcard LIKE 'aaa%' it can and does as you say convert this into something along the lines of (>= 'aaa' AND < 'aab'). Although IIRC that depends on whether your locale allows such (not sure, I don't really use non-C/non-English locales enough). -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
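For reference, the rewrite Richard describes turns the trailing-wildcard pattern into an ordinary range condition, roughly like this (using the bigtable/foo example from earlier in the thread; the pattern-ops index is what makes the range usable under a non-C locale):

    -- the trailing-wildcard query
    SELECT * FROM bigtable WHERE foo LIKE 'aaa%';

    -- is treated, for index purposes, much like
    SELECT * FROM bigtable WHERE foo >= 'aaa' AND foo < 'aab';

    -- under a non-C locale, this index lets the planner do that rewrite
    CREATE INDEX bigtable_foo_pattern_idx ON bigtable (foo text_pattern_ops);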
Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?
Tom Lane wrote: Arnau <[EMAIL PROTECTED]> writes: Can you instead run things with one postmaster per machine and one database per customer within that instance? From a performance perspective this is likely to work much better. What I meant is just have only one postmaster per server and a lot of databases running in it. OK, we are on the same page then. Should work fine. I think I've heard of people running installations with thousands of DBs in them. You'll want to test it a bit of course ... I'm worried about performance. I have done some tests and I have more than 400 DBs on one server, so it is possible to run that many DBs in a single postmaster. The point I'm worried about is performance. Do you think the performance would be better executing exactly the same queries, only adding an extra column (e.g. customer_id) to all the tables, than opening a connection to that one customer's DB and executing the query there? I don't know whether PostgreSQL's caching mechanism works as well when queries are spread over 400 possible DBs as when they all go to one DB. Thank you very much for your help :) -- Arnau ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
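A minimal sketch of the single-database alternative being asked about — the table is hypothetical; every customer-owned table carries the customer key, and every application query filters on it:

    CREATE TABLE invoices (
        customer_id integer   NOT NULL,
        invoice_id  bigint    NOT NULL,
        amount      numeric,
        created_at  timestamp NOT NULL DEFAULT now(),
        PRIMARY KEY (customer_id, invoice_id)
    );

    -- every query carries the customer key
    SELECT * FROM invoices WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 50;

Leading the primary key with customer_id keeps one customer's rows together in the index, but, as noted in the reply below, it also means a bug in one WHERE clause can expose another customer's data.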
Re: [PERFORM] How PostgreSQL handles multiple DDBB instances?
Arnau <[EMAIL PROTECTED]> writes: >The point I'm worried about is performance. Do you think the performance > would be better executing exactly the same queries, only adding an extra > column (e.g. customer_id) to all the tables, than opening a connection to > that one customer's DB and executing the query there? [ shrug... ] That's going to depend on enough factors that I don't think anyone could give you a generic answer. You'd have to test it for yourself under your own application conditions. However: doing it that way seems to me to create severe risks that the customers might be able to look at each others' data. You probably want to go with separate databases just as a security matter. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Performance problem on 8.2.4, but not 8.2.3
Greetings, We have two servers running pgsql -- an older server running 8.2.3, and a newer (far superior) one running 8.2.4. One of our reporting queries is running painfully slowly on 8.2.4, but it executes in a reasonable timeframe on 8.2.3. Below, I've included a contrived, stripped down query which still exhibits the same unintuitively poor performance, as well as its explain analyze output from both servers. In particular, 8.2.4 opts for filters in a couple places where we would expect index conds. Also, we've noticed that the 8.2.4 box (in other similar queries) consistently underestimates costs, whereas the 8.2.3 box consistently overestimates. All columns involved in this query are indexed (btrees), and there is a functional index on mm_date_trunc('day', created_at)...where mm_date_trunc is simply an immutable version of date_trunc (fine for our purposes). The only configuration differences between the servers are various memory settings... work_mem and temp_buffers are 8mb / 16mb, shared buffers 128mb / 512mb on the 8.2.3 and 8.2.4 servers, respectively. Stats targets are 10 on both, for consistency... but it is worth mentioning that performance was still abysmal under 8.2.4 with 250 as the target. Any insight would be most appreciated, as we're a bit stumped. Thanks! Cheers, Dave Pirotte Director of Technology Media Matters for America === select h.day, h.c as total, (select count(*) as c from hits h2 join uri_qstrings uq on (h2.uri_qstring_id = uq.id) join referrer_paths rp on (h2.referrer_path_id = rp.id) join referrer_domains rd on (rp.referrer_domain_id = rd.id) where mm_date_trunc('day', created_at) = h.day and site_id = 3 and uq.qstring = '?f=h_top' and rd.domain = 'mediamatters.org' ) as h_top from ( select mm_date_trunc('day', h.created_at) as day, count(*) as c from hits h where created_at > date_trunc('day', now() - interval '2 days') group by mm_date_trunc('day', h.created_at) ) h order by h.day asc; QUERY PLAN (8.2.4) Sort (cost=204012.65..204012.66 rows=3 width=16) (actual time=83012.885..83012.885 rows=3 loops=1) Sort Key: "day" -> Subquery Scan h (cost=149811.02..204012.62 rows=3 width=16) (actual time=28875.251..83012.868 rows=3 loops=1) -> HashAggregate (cost=149811.02..149811.06 rows=3 width=8) (actual time=1602.787..1602.794 rows=3 loops=1) -> Bitmap Heap Scan on hits h (cost=6485.90..148079.18 rows=346368 width=8) (actual time=48.222..1358.196 rows=391026 loops=1) Recheck Cond: (created_at > date_trunc ('day'::text, (now() - '2 days'::interval))) -> Bitmap Index Scan on hits_created_idx (cost=0.00..6399.31 rows=346368 width=0) (actual time=47.293..47.293 rows=391027 loops=1) Index Cond: (created_at > date_trunc ('day'::text, (now() - '2 days'::interval))) SubPlan -> Aggregate (cost=18067.17..18067.18 rows=1 width=0) (actual time=27136.681..27136.681 rows=1 loops=3) -> Nested Loop (cost=40.66..18067.16 rows=1 width=0) (actual time=1105.396..27135.496 rows=3394 loops=3) -> Nested Loop (cost=40.66..18063.56 rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3) -> Nested Loop (cost=40.66..5869.35 rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3) -> Index Scan using referrer_domains_domains_idx on referrer_domains rd (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1 loops=3) Index Cond: (("domain")::text = 'mediamatters.org'::text) -> Bitmap Heap Scan on referrer_paths rp (cost=40.66..5834.77 rows=2105 width=16) (actual time=20.402..210.440 rows=121399 loops=3) Recheck Cond: (rp.referrer_domain_id = rd.id) -> Bitmap 
Index Scan on referrer_paths_domains_idx (cost=0.00..40.13 rows=2105 width=0) (actual time=17.077..17.077 rows=121399 loops=3) Index Cond: (rp.referrer_domain_id = rd.id) -> Index Scan using hits_refer_idx on hits h2 (cost=0.00..257.59 rows=149 width=16) (actual time=0.167..0.218 rows=0 loops=364197) Index Cond: (h2.r
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
Lo! referrer_paths seems to have totally wrong stats. try full analyze on it. how many records in total do you have in referrer_paths on 8.2.4 server? might be just a problem of usage pattern change from old system to new (1 row vs. 121399 rows) ? does not seem to be just a plan problem as the data itself seems to be quite different. Kristo On 25.05.2007, at 21:08, Dave Pirotte wrote: Greetings, We have two servers running pgsql -- an older server running 8.2.3, and a newer (far superior) one running 8.2.4. One of our reporting queries is running painfully slowly on 8.2.4, but it executes in a reasonable timeframe on 8.2.3. Below, I've included a contrived, stripped down query which still exhibits the same unintuitively poor performance, as well as its explain analyze output from both servers. In particular, 8.2.4 opts for filters in a couple places where we would expect index conds. Also, we've noticed that the 8.2.4 box (in other similar queries) consistently underestimates costs, whereas the 8.2.3 box consistently overestimates. All columns involved in this query are indexed (btrees), and there is a functional index on mm_date_trunc('day', created_at)...where mm_date_trunc is simply an immutable version of date_trunc (fine for our purposes). The only configuration differences between the servers are various memory settings... work_mem and temp_buffers are 8mb / 16mb, shared buffers 128mb / 512mb on the 8.2.3 and 8.2.4 servers, respectively. Stats targets are 10 on both, for consistency... but it is worth mentioning that performance was still abysmal under 8.2.4 with 250 as the target. Any insight would be most appreciated, as we're a bit stumped. Thanks! Cheers, Dave Pirotte Director of Technology Media Matters for America === select h.day, h.c as total, (select count(*) as c from hits h2 join uri_qstrings uq on (h2.uri_qstring_id = uq.id) join referrer_paths rp on (h2.referrer_path_id = rp.id) join referrer_domains rd on (rp.referrer_domain_id = rd.id) where mm_date_trunc('day', created_at) = h.day and site_id = 3 and uq.qstring = '?f=h_top' and rd.domain = 'mediamatters.org' ) as h_top from ( select mm_date_trunc('day', h.created_at) as day, count(*) as c from hits h where created_at > date_trunc('day', now() - interval '2 days') group by mm_date_trunc('day', h.created_at) ) h order by h.day asc; QUERY PLAN (8.2.4) -- -- Sort (cost=204012.65..204012.66 rows=3 width=16) (actual time=83012.885..83012.885 rows=3 loops=1) Sort Key: "day" -> Subquery Scan h (cost=149811.02..204012.62 rows=3 width=16) (actual time=28875.251..83012.868 rows=3 loops=1) -> HashAggregate (cost=149811.02..149811.06 rows=3 width=8) (actual time=1602.787..1602.794 rows=3 loops=1) -> Bitmap Heap Scan on hits h (cost=6485.90..148079.18 rows=346368 width=8) (actual time=48.222..1358.196 rows=391026 loops=1) Recheck Cond: (created_at > date_trunc ('day'::text, (now() - '2 days'::interval))) -> Bitmap Index Scan on hits_created_idx (cost=0.00..6399.31 rows=346368 width=0) (actual time=47.293..47.293 rows=391027 loops=1) Index Cond: (created_at > date_trunc ('day'::text, (now() - '2 days'::interval))) SubPlan -> Aggregate (cost=18067.17..18067.18 rows=1 width=0) (actual time=27136.681..27136.681 rows=1 loops=3) -> Nested Loop (cost=40.66..18067.16 rows=1 width=0) (actual time=1105.396..27135.496 rows=3394 loops=3) -> Nested Loop (cost=40.66..18063.56 rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3) -> Nested Loop (cost=40.66..5869.35 rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3) -> 
Index Scan using referrer_domains_domains_idx on referrer_domains rd (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1 loops=3) Index Cond: (("domain")::text = 'mediamatters.org'::text) -> Bitmap Heap Scan on referrer_paths rp (cost=40.66..5834.77 rows=2105 width=16) (actual time=20.402..210.440 rows=121399 loops=3) Recheck Cond: (rp.referrer_domain_id = rd.id) -> Bitmap Index Scan on referrer_paths_doma
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Kenneth Marshall:" > improvement from coalescing the packets. Good luck in your investigations. While I am recompiling stuff, just some stats. Typical network traffic analysis during the PG runs: Total Packets Processed 493,499 Unicast 100.0% 493,417 Broadcast 0.0%82 Multicast 0.0%0 pktCast distribution chart Shortest 42 bytes Average Size192 bytes Longest 1,514 bytes <= 64 bytes 0.0% 158 64 to 128 bytes77.3% 381,532 129 to 256 bytes6.8%33,362 257 to 512 bytes8.6%42,535 513 to 1024 bytes 4.0%19,577 1025 to 1518 bytes 3.3%16,335 Typical application rusage stats: time ./c -timeout 12000 -database postgresql://pebbles/d /tmp/tty_io..c user system elapsed cpu 7.866u 6.038s 5:49.13 3.9% 0+0k 0+0io 0pf+0w Those stats show the system lost in i/o. It's neither in kernel nor in userspace. Presumably the other side plus networking was the holdup. For comparison, against localhost via loopback ("fake" networking): time ./c -timeout 12000 -database postgresql://localhost/d /tmp/tty_io..c user system elapsed cpu 9.483u 5.321s 2:41.78 9.1% 0+0k 0+0io 0pf+0w but in that case postmaster was doing about 54% cpu, so the overall cpu for server + client is 63%. I moved to a unix domain socket and postmaster alone went to 68%. time ./c -timeout 12000 -database postgresql://unix/var/run/postgresql/d /tmp/tty_io..c user system elapsed cpu 9.569u 3.698s 2:52.41 7.6% 0+0k 0+0io 0pf+0w The elapsed time is not much different between unix and localhost. One can see that there is some i/o holdup because the two threads ought to do 100% between them if handover of info were costless. The difference (the system was queiscent o/w apart from the monitoring software, which shows only a fraction of a percent loading). There were no memory shortages and swap was disabled for the test (both sides) For comparison, running against gdbm straignt to disk time ./c -timeout 12000 /tmp/tty_io..c user system elapsed cpu 2.637u 0.735s 0:05.34 62.9% 0+0k 0+0io 0pf+0w Through localhost: time ./c -timeout 12000 -database gdbm://localhost/ptb/c /tmp/tty_io..c user system elapsed cpu 2.746u 3.699s 0:16.00 40.1% 0+0k 0+0io 0pf+0w (the server process was at 35% cpu, for 75% total). Across the net: time ./c -timeout 12000 -database gdbm://pebbles/ptb/c /tmp/tty_io..c user system elapsed cpu 2.982u 4.430s 1:03.44 7.9% 0+0k 0+0io 0pf+0w (the server was at 7% cpu) Have to go shopping Peter ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
Dave Pirotte <[EMAIL PROTECTED]> writes: > We have two servers running pgsql -- an older server running 8.2.3, > and a newer (far superior) one running 8.2.4. One of our reporting > queries is running painfully slowly on 8.2.4, but it executes in a > reasonable timeframe on 8.2.3. Are you sure you've analyzed all these tables in the 8.2.4 database? Some of the rowcount estimates seem a bit far off. I looked through the CVS logs and didn't find any planner changes between 8.2.3 and 8.2.4 that seem likely to affect your query, so I'm thinking it must be a statistical discrepancy. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
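A quick way to follow up on that, using the referrer_paths table from the plan; the statistics target value here is only an example:

    -- bump the sample size for the column whose estimate looked far off, then re-analyze
    ALTER TABLE referrer_paths ALTER COLUMN referrer_domain_id SET STATISTICS 100;
    ANALYZE referrer_paths;

    -- sanity-check what the planner now believes about the table
    SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'referrer_paths';
    SELECT attname, n_distinct, null_frac FROM pg_stats WHERE tablename = 'referrer_paths';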
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
On Fri, May 25, 2007 at 03:56:35PM -0400, Tom Lane wrote: > I looked through the CVS logs and didn't find any planner changes > between 8.2.3 and 8.2.4 that seem likely to affect your query, so > I'm thinking it must be a statistical discrepancy. It looks like the estimated cost is lower for 8.2.4 -- could it be that the fact that he's giving it more memory led to the planner picking a plan that happens to be worse? /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes: > It looks like the estimated cost is lower for 8.2.4 -- could it be that the > fact that he's giving it more memory lead to the planner picking a plan that > happens to be worse? Offhand I don't think so. More work_mem might make a hash join look cheaper (or a sort for a mergejoin), but the problem here seems to be that it's switching away from a hash and to a nestloop. Which is a loser because there are many more outer-relation rows than it's expecting. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
Thanks for the quick responses. :-) The data is almost identical, between the two servers: 8.2.3 has 882198 records, 8.2.4 has 893121. For background, I pg_dump'ed the data into the 8.2.4 server yesterday, and analyzed with the stats target of 250, then reanalyzed with target 10. So, the statistics should theoretically be ok. Running a vacuum full analyze on referrer_paths, per Kristo's suggestion, didn't affect the query plan. We downgraded to 8.2.3 just to rule that out, upped stats target to 100, analyzed, and are still experiencing the same behavior -- it's still coming up with the same bogus rowcount estimates. Over the weekend I'll lower the memory and see if that does anything, just to rule that out... Any other thoughts? Thanks so much for your time and suggestions thus far. Cheers, Dave On May 25, 2007, at 4:33 PM, Tom Lane wrote: "Steinar H. Gunderson" <[EMAIL PROTECTED]> writes: It looks like the estimated cost is lower for 8.2.4 -- could it be that the fact that he's giving it more memory lead to the planner picking a plan that happens to be worse? Offhand I don't think so. More work_mem might make a hash join look cheaper (or a sort for a mergejoin), but the problem here seems to be that it's switching away from a hash and to a nestloop. Which is a loser because there are many more outer-relation rows than it's expecting. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate Dave Pirotte Director of Technology Media Matters for America [EMAIL PROTECTED] phone: 202-756-4122
[PERFORM] Adding disks/xlog & index
I have a busy postgresql server running on a raid1 of 2 15k rpm scsi drives. I have been running into the problem of maxed out IO bandwidth. I would like to convert my raid1 into a raid10 but that would require a full rebuild which is more downtime than I want, so I am looking into other alternatives. The best one I have come up with is moving the xlog/wal (can someone confirm whether these are the same thing?) to another physical drive. I also think it may be beneficial to move some indexes to another drive as well (same one as xlog). Some questions on this:

1. Can the database survive loss/corruption of the xlog and indexes in a recoverable way? To save money (and because I won't need the throughput as much), I am thinking of making this index/wal/xlog drive a single cheap sata drive (or maybe a non-raided 15k scsi for 60% more money). However without the redundancy of a mirror I am concerned about drive failure. Loss of several mins of recent transactions in a serious crash is acceptable to me, but full/serious database corruption (the likes of fsync off) is not.

2. Is there any point using a high performance (ie scsi) disk for this, or would the mirror containing the majority of the data still be the major bottleneck, causing the disk usage to not exceed sata performance anyway?

3. Is there any easy way to move ALL indexes to another drive? Is this a good performance idea or would they just bottleneck each other seriously?

Other info for reference: running postgresql 8.2 on FreeBSD 6.1; the server is a core2 with 4gb of ram. CPU usage is moderate.

Also, can anyone recommend a good shared_buffers size? The server is dedicated to postgres except for half a gig used by memcached. Right now I have it set at 51200 which may be too high (I've read varying suggestions with this and I'm not sure how aggressive FreeBSD6's IO cache is). And any suggestions on what effective_cache_size I should use on this hardware and OS? I've been using 384MB but I don't know if this is optimal or not.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Also sprach Richard Huxton:" > > scheme each time, for example! (how that?). I could presumably also > > help it by preloading the commands I will run and sending over the > > params only with a "do a no. 17 now!". > > PREPARE/EXECUTE (or the equivalent libpq functions). Yes, thank you. It seems to speed things up by a factor of 2. But can I prepare a DECLARE x BINARY CURSOR FOR SELECT ... statement? The manual seems to say no. Peter ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Big problem with sql update operation
Tom Lane wrote: Michal Szymanski <[EMAIL PROTECTED]> writes:

CREATE OR REPLACE FUNCTION test() RETURNS void AS $BODY$
DECLARE
BEGIN
  FOR v_i IN 1..4000 LOOP
    UPDATE group_fin_account_tst SET credit = v_i
     WHERE group_fin_account_tst_id = 1;   -- for real procedure I update different rows

Does updating the *same* record 4000 times per transaction reflect the real behavior of your application? If not, this is not a good benchmark. If so, consider redesigning your app to avoid so many redundant updates.

The real application modifies a different row every time.

(For the record, the reason you see nonlinear degradation is the accumulation of tentatively-dead versions of the row, each of which has to be rechecked by each later update.)

There is another strange thing. We have two versions of our test environment, one with a copy of the production DB and a second generated with a minimal data set, and it is odd that the update presented above executes in 170ms on the production copy but takes 6s on the small DB.

Michal Szymanski http://blog.szymanskich.net
---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Big problem with sql update operation
Michal Szymanski wrote: > Tom Lane wrote: > >(For the record, the reason you see nonlinear degradation is the > >accumulation of tentatively-dead versions of the row, each of which has > >to be rechecked by each later update.) > > > There is another strange thing. We have two versions of our test > environment one with production DB copy and second genereated with > minimal data set and it is odd that update presented above on copy of > production is executing 170ms but on small DB it executing 6s How are you vacuuming the tables? -- Alvaro Herrera http://www.advogato.org/person/alvherre "El conflicto es el camino real hacia la unión" ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
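In the spirit of that question, a quick way to see whether dead row versions are piling up on the test table (table name taken from the function quoted earlier in the thread):

    -- VERBOSE reports how many dead row versions were found and removed
    VACUUM VERBOSE ANALYZE group_fin_account_tst;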
Re: [PERFORM] Adding disks/xlog & index
[EMAIL PROTECTED] writes: > The best one I have come up with is moving the xlog/wal (can someone > confirm whether these are the same thing?) to another physical drive. Yeah, two names for same thing. > I also think it may be beneficial to move some indexes to another drive as > well (same one as xlog). Depends on how the I/O workload works out. On systems that have fairly heavy write traffic, the standard advice is that you want WAL on its own dedicated spindle, because the less that head needs to move the faster you can write WAL, and WAL output speed is going to determine how fast you can perform updates. If it's a read-mostly database then maybe you can ignore that advice and worry more about separating indexes from tables. > 1. Can the database survive loss/corruption of the xlog and indexes in a > recoverable way? To save money (and because I won't need the throughput as > much), I am thinking on making this index/wal/xlog drive a single cheap > sata drive (or maybe a non-raided 15k scsi for 60% more money). Do not go cheap on the WAL drive --- you lose WAL, you're in serious trouble. Indexes can always be rebuilt with REINDEX, so they're maybe a bit more expendable. > 3. Is there any easy way to move ALL indexes to another drive? No, I think you have to move 'em one at a time :-(. The standard advice for this is to set up a plpgsql function that scans the catalogs and issues the commands you want (ALTER INDEX SET TABLESPACE in this case). > Is this a > good performance idea or would they just bottleneck each other seriously? Impossible to tell without a lot more details than you provided. I'd suggest you try it and see. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
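A sketch of the catalog-scanning function mentioned above, assuming a tablespace has already been created and that the indexes to move live in the public schema; note that each ALTER INDEX ... SET TABLESPACE locks its table and rewrites the index, so run it in a quiet period:

    -- hypothetical helper: move every index in the public schema to the given tablespace
    CREATE OR REPLACE FUNCTION move_all_indexes(target_ts name) RETURNS void AS $$
    DECLARE
        r record;
    BEGIN
        FOR r IN
            SELECT c.relname
              FROM pg_class c
              JOIN pg_namespace n ON n.oid = c.relnamespace
             WHERE c.relkind = 'i' AND n.nspname = 'public'
        LOOP
            EXECUTE 'ALTER INDEX public.' || quote_ident(r.relname)
                 || ' SET TABLESPACE ' || quote_ident(target_ts);
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;

    SELECT move_all_indexes('fast_disk_ts');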
Re: [PERFORM] general PG network slowness (possible cure) (repost)
"Peter T. Breuer" <[EMAIL PROTECTED]> writes: > But can I prepare a DECLARE x BINARY CURSOR FOR SELECT ... statement? > The manual seems to say no. No, you just prepare the SELECT. At the protocol level, DECLARE CURSOR is a tad useless. You can still fetch the data in binary if you want... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Adding disks/xlog & index
<[EMAIL PROTECTED]> writes: > Some questions on this: > 1. Can the database survive loss/corruption of the xlog and indexes in a > recoverable way? To save money (and because I won't need the throughput as > much), I am thinking of making this index/wal/xlog drive a single cheap > sata drive (or maybe a non-raided 15k scsi for 60% more money). However > without the redundancy of a mirror I am concerned about drive failure. > Loss of several mins of recent transactions in a serious crash is > acceptable to me, but full/serious database corruption (the likes of fsync > off) is not. Losing any WAL that the database has fsynced is exactly like having fsync off. > 2. Is there any point using a high performance (ie scsi) disk for this, or > would the mirror containing the majority of the data still be the major > bottleneck causing the disk usage to not exceed sata performance anyway? Well that depends on your database traffic. In most databases the volume of WAL traffic is substantially less than the i/o traffic to the data drives. So you usually don't need to be able to sustain high i/o bandwidth to the WAL drive. However in some database loads the latency to the WAL drive does matter. This is especially true if you're executing a lot of short transactions and response time is critical. Especially if you aren't executing many such transactions in parallel. So for example if you're processing a serial batch of short transactions and committing each one as a separate transaction. In that case you would want a drive that can fsync fast, which means either a battery-backed cache or a 15kRPM drive. It doesn't necessarily mean you need a big RAID array though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] ECC RAM really needed?
We're thinking of building some new servers. We bought some a while back that have ECC (error correcting) RAM, which is absurdly expensive compared to the same amount of non-ECC RAM. Does anyone have any real-life data about the error rate of non-ECC RAM, and whether it matters or not? In my long career, I've never once had a computer that corrupted memory, or at least I never knew if it did. ECC sounds like a good idea, but is it solving a non-problem? Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] ECC RAM really needed?
On Fri, May 25, 2007 at 18:45:15 -0700, Craig James <[EMAIL PROTECTED]> wrote: > We're thinking of building some new servers. We bought some a while back > that have ECC (error correcting) RAM, which is absurdly expensive compared > to the same amount of non-ECC RAM. Does anyone have any real-life data > about the error rate of non-ECC RAM, and whether it matters or not? In my > long career, I've never once had a computer that corrupted memory, or at > least I never knew if it did. ECC sounds like a good idea, but is it > solving a non-problem? In the past when I purchased ECC RAM it wasn't that much more expensive than non-ECC RAM. Wikipedia suggests a rule of thumb of one error per month per gigabyte, though it suggests error rates vary widely. They reference a paper that should provide you with more background. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] ECC RAM really needed?
On Fri, 25 May 2007, Bruno Wolff III wrote: Wikipedia suggests a rule of thumb of one error per month per gigabyte, though it suggests error rates vary widely. They reference a paper that should provide you with more background. The paper I would recommend is http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf which is a summary of many other people's papers, and quite informative. I know I had no idea before reading it how much error rates go up with increasing altitude. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] ECC RAM really needed?
Greg Smith <[EMAIL PROTECTED]> writes: > The paper I would recommend is > http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf > which is a summary of many other people's papers, and quite informative. > I know I had no idea before reading it how much error rates go up with > increasing altitude. Not real surprising if you figure the problem is mostly cosmic rays. Anyway, this paper says > Even using a relatively conservative error rate (500 FIT/Mbit), a > system with 1 GByte of RAM can expect an error every two weeks; which should pretty much cure any idea that you want to run a server with non-ECC memory. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
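As a back-of-the-envelope check of that figure (plain arithmetic, written as SQL only for convenience): 500 FIT/Mbit means 500 failures per billion device-hours per megabit, and 1 GByte is 8192 Mbit, so:

    SELECT 500 * 8 * 1024                AS fit_per_gbyte,        -- ~4.1 million failures per 10^9 hours
           1e9 / (500 * 8 * 1024)        AS hours_between_errors, -- ~244 hours
           1e9 / (500 * 8 * 1024) / 24   AS days_between_errors;  -- ~10 days, i.e. roughly the paper's "every two weeks"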