Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
lzo is much, much, (much) faster than zlib... and decompression speed is even more awesome. Note, I've tried several times to contact the author to get clarification on licensing terms and have been unable to get a response. lzop and the LZO library are distributed under the terms of the GNU General Public License (GPL). Source : http://www.lzop.org/lzop_man.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
I get very different (contradictory) behavior. Server with fast RAID, 32GB RAM, 2 x 4 core 3.16GHz Xeon 54xx CPUs. CentOS 5.2, 8.3.6.

That's a very different setup from my (much less powerful) box, so that would explain it...

No disk wait time during any test. One test beforehand was used to prime the disk cache. 100% CPU in the below means one core fully used. 800% means the system is fully loaded. pg_dump > file (on a subset of the DB with lots of tables with small tuples) 6m 27s, 4.9GB; 12.9MB/sec 50% CPU in postgres, 50% CPU in pg_dump

If there is no disk wait time, then why do you get 50/50 and not 100/100, or at least one core maxed out ? That's interesting...

COPY annonces TO '/dev/null'; COPY 413526 Temps : 13871,093 ms

\copy annonces to '/dev/null' Temps : 14037,946 ms

time pg_dump -Fc -t annonces -U annonces --compress=0 annonces >/dev/null
real 0m14.596s user 0m0.700s sys 0m0.372s

In all 3 cases postgres maxes out one core (I've repeated the test until all data was cached, so there is no disk access at all in vmstat). Size of dump is 312MB.
Re: [PERFORM] Why is PostgreSQL so slow on Windows ( Postgres 8.3.7) version
Some of our queries to the database are taking a long time to return results. fsync: off (even when we tested with this parameter on, we observed the same slowness)

If your queries take a long time to return results, I suppose you are talking about SELECTs. fsync = off will not make SELECTs faster (only inserts, updates, deletes), and it is not worth it as you risk data loss. synchronous_commit = off has about the same advantages (faster commits...) as fsync = off, but with no risk of corrupting the database (on a crash you can only lose the last few committed transactions), so it is much better !

We have 300k rows in the PolledData table. In each STATSDATA table, we have almost 12 to 13 million rows.

OK. So you insert 13 million rows per day ? That is about 150 rows per second.

Every one minute interval, we insert data into the STATSDATA table.

I assume you are making an INSERT INTO statsdata VALUES (.. 150 values .) and not 150 separate inserts, yes ?

First Query : SELECT COUNT(*) FROM ( SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL FROM PolledData, STATSDATA8_21_2009 WHERE ( ( PolledData.ID=STATSDATA8_21_2009.POLLID) AND ( ( TTIME >= 1250838027454) AND ( TTIME <=1250838079654) ) ) ) t1;

* You could rewrite as :

SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL
FROM PolledData
JOIN STATSDATA8_21_2009 ON ( PolledData.ID = STATSDATA8_21_2009.POLLID )
WHERE TTIME BETWEEN ... AND ...

- It is exactly the same query, but much easier to read.

* Some ANALYZE-ing of your tables would be useful, since the estimates from the planner look suspiciously different from reality.
- ANALYZE is fast, you can run it often if you INSERT rows all the time.

* You are joining on POLLID, which is a NUMERIC in one table and a BIGINT in the other table.
- Is there any reason for this type difference ?
- Could you use BIGINT in both tables ?
- BIGINT is faster than NUMERIC and uses less space.
- Type conversions use CPU cycles too.

* Should StatsData.POLLID have a foreign key REFERENCES PolledData.ID ?
- This won't make the query faster, but if you know all rows in StatsData reference rows in PolledData (because of the FK constraint) and you want a count(*) like above, you don't need to JOIN.

* TTIME >= 1250838027454 AND TTIME <= 1250838079654
- TTIME should be TIMESTAMP (with or without TIME ZONE) or BIGINT, but certainly not NUMERIC.
- An index on StatsData.TTIME would be useful; it would avoid the Seq Scan, replacing it with a Bitmap Scan, much faster.

* work_mem
- since you have few connections you could increase work_mem

Second Query : Same as first query.

Third Query : SELECT COUNT(*) FROM ( SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL FROM PolledData, STATSDATA8_21_2009 WHERE ( ( PolledData.ID=STATSDATA8_21_2009.POLLID) AND ( ( TTIME >= 1250838027454) AND ( TTIME <=1250838027454) ) ) union all SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL FROM PolledData, STATSDATA8_20_2009 WHERE ( ( PolledData.ID=STATSDATA8_20_2009.POLLID) AND ( ( TTIME >= 1250767134601) AND ( TTIME <= 1250767134601) ) ) ) t1;

Basically this is, again, exactly the same query as above, but two times, UNION ALL'ed.

* You could rewrite it like this :

SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL
FROM PolledData
JOIN (
    SELECT ... FROM STATSDATA8_21_2009 WHERE TTIME BETWEEN ... AND ...
    UNION ALL
    SELECT ... FROM STATSDATA8_20_2009 WHERE TTIME BETWEEN ... AND ...
) s ON ( PolledData.ID = s.POLLID )

* If TTIME is the current time, and you insert data as it comes, data in StatsData tables is probably already ordered on TTIME.
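To make the TTIME advice concrete, here is a minimal sketch using the table and bound values from the queries above (the index name is made up, and dropping the JOIN assumes the FK plus a NOT NULL POLLID guarantee every row matches):

```sql
-- Avoids the Seq Scan on the stats table for TTIME range queries:
CREATE INDEX statsdata8_21_2009_ttime ON STATSDATA8_21_2009 (TTIME);

-- With the FK constraint in place, the count does not need the JOIN at all:
SELECT COUNT(*)
FROM STATSDATA8_21_2009
WHERE TTIME BETWEEN 1250838027454 AND 1250838079654;
```

Run ANALYZE afterwards so the planner picks up the new index and fresh statistics.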
- If that is not the case, once a table is filled and becomes read-only, consider CLUSTER on the index you created on TTIME.
- It will make range queries on TTIME much faster.

* Query plan

Seq Scan on statsdata8_21_2009 (cost=0.00..70574.88 rows=1 width=32) (actual time=0.047..29066.227 rows=227 loops=1)
Seq Scan on statsdata8_20_2009 (cost=0.00..382519.60 rows=1 width=32) (actual time=3136.008..93985.540 rows=1 loops=1)

Postgres thinks there is 1 row in those tables... that's probably not the case ! The first one returns 227 rows, so the plan chosen is a catastrophe.

I was a bit intrigued by your query, so I made a little test...

BEGIN;
CREATE TABLE test( x INT, y INT );
INSERT INTO test (SELECT n,n FROM generate_series( 1,1000000 ) AS n );
CREATE INDEX test_x ON test( x );
CREATE INDEX test_y ON test( y );
COMMIT;
ANALYZE test;

test=> EXPLAIN ANALYZE SELECT * FROM test a JOIN test b ON (b.x=a.x) WHERE a.x BETWEEN 0 AND 1;
QUERY PLAN
Hash Join (cost=480.53..23759.14 rows=10406 width=16) (actual time=15.614..1085.085 rows=1 loops=1)
  Hash Cond: (b.x = a.x)
  -> Seq Scan on test b (cost=0.00..14424.76 ro
Re: [PERFORM] PostgreSQL 8.4 performance tuning questions
On Fri, 31 Jul 2009 19:04:52 +0200, Tom Lane wrote: Greg Stark writes: On Thu, Jul 30, 2009 at 11:30 PM, Tom Lane wrote: I did some tracing and verified that pg_dump passes data to deflate() one table row at a time. I'm not sure about the performance implications of that, but it does seem like it might be something to look into. I suspect if this was a problem the zlib people would have added internal buffering ages ago. I find it hard to believe we're not the first application to use it this way. I dug into this a bit more. zlib *does* have internal buffering --- it has to, because it needs a minimum lookahead of several hundred bytes to ensure that compression works properly. The per-call overhead of deflate() looks a bit higher than one could wish when submitting short chunks, but oprofile shows that "pg_dump -Fc" breaks down about like this:

During dump (size of dump is 2.6 GB) :

No Compression :
- postgres at 70-100% CPU and pg_dump at something like 10-20%
- dual core is useful (a bit...)
- dump size 2.6GB
- dump time 2m25.288s

Compression Level 1 :
- postgres at 70-100% CPU and pg_dump at 20%-100%
- dual core is definitely useful
- dump size 544MB
- dump time 2m33.337s

Since this box is mostly idle right now, eating CPU for compression is no problem... Adding an option to use LZO instead of gzip could be useful...

Compressing the uncompressed 2.6GB dump :
- gzip -1 :
  - compressed size : 565 MB
  - compression throughput : 28.5 MB/s
  - decompression throughput : 74 MB/s
- LZO -1 :
  - compressed size : 696M
  - compression throughput : 86 MB/s
  - decompression throughput : 247 MB/s

Conclusion : LZO could help for fast disks (RAID) or slow disks on a CPU-starved server...
Re: [PERFORM] SQL select query becomes slow when using limit (with no offset)
The query : SELECT events_events.id FROM events_events LEFT JOIN events_event_types ON events_events.eventType_id = events_event_types.id WHERE events_event_types.severity = 70 AND events_events.cleared = 'f' ORDER BY events_events.dateTime DESC

The main problem seems to be lack of a suitable index...
- Try creating an index on events_events( eventType_id, cleared )
- Or the other way around : events_events( cleared, eventType_id )
(depends on your other queries)

Please try both and report EXPLAIN ANALYZE.
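For reference, the two candidate indexes above spelled out (the index names are made up; table and column names come from the query):

```sql
-- Option 1 : event type first, then the cleared flag:
CREATE INDEX events_events_type_cleared ON events_events (eventType_id, cleared);

-- Option 2 : the other column order; often better if cleared = 'f' is rare:
CREATE INDEX events_events_cleared_type ON events_events (cleared, eventType_id);
```

Build both, compare EXPLAIN ANALYZE, and drop the loser so writes don't pay for two indexes.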
Re: [PERFORM] Full text search with ORDER BY performance issue
If love is an uncommon word, there's no help for queries of this type being slow unless the GIN index can return the results in order. But if love is a common word, then it would be faster to do an index scan by timestamp on the baserel and then treat comment_tsv @@ plainto_tsquery('love') as a filter condition. Is this a selectivity estimation bug?

If you have a really large number of documents to index (this seems to be the case), perhaps you should consider Xapian. It is very easy to use (although, of course, tsearch integrated in Postgres is much easier since you have nothing to install), and it is *incredibly* fast. In my tests (2 years ago) with many gigabytes of stuff to search into, differences became obvious when the data set is much bigger than RAM.
- Postgres' fulltext was 10-100x faster than MySQL fulltext on searches (lol) (and even a lot "more faster" on INSERTs...)
- and Xapian was 10-100 times faster than Postgres' fulltext.
(On a small table which fits in RAM, differences are small.)

Of course Xapian is not Postgres when you talk about update concurrency... (single writer => a fulltext-index-updating background job is needed; a simple Python script does the job)
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
A smartphone... you're right, I didn't think of that, but the hardware I described is very much like the one of a modern smartphone!!! Are you saying that PostgreSQL+PostGIS can actually run on a smartphone??? Intriguing... Did anyone ever actually try that???

While the performance of ARM CPUs used in smartphones, PDAs, etc, is pretty good, this hardware is optimized for small size and low power use, thus you generally get quite low memory bandwidth, the problem of Flash endurance, and lack of standard interfaces to hook up to the rest of your system. Embedded PC-compatible hardware in the 600 MHz range you mention would probably get a DIMM memory module (maybe for the only reason that mass-production makes them so cheap), so you'd get a much higher memory bandwidth, and much larger RAM. Even if the CPU is only 2x faster than a smartphone, if the memory bandwidth is 10x higher, you'll see the difference. It would also have standard interfaces, very useful for you, and you can hook it up to a real SSD (not a micro-SD card) with real flash wear-leveling algorithms. But yeah, since today's smartphones are more powerful than the desktops of 10 years ago (which ran PG just fine), it would probably work, if you can run Linux on it...
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
They didn't give me complete information, but it should be some kind of industrial PC with a 600MHz CPU. Memory should be not huge nor small, maybe a couple of GBytes, hard disk should be some type of industrial Compact Flash of maybe 16 GBytes. It should work perfectly OK. Remember that you need a fast CPU if you have a database server that processes many queries from many users simultaneously. Since your "server" will process very few queries (maybe one per second, something like that), even a slow (by modern standards) 600 MHz CPU will be more than enough... I'd say for such an application, your hardware is way overkill (it would work on a smartphone...) but since hardware is so cheap...
Re: [PERFORM] [HACKERS] high shared buffer and swap
An octocore server with 32GB of RAM, running postgresql 8.3.6. Running only postgresql, slony-I and pgbouncer. Just for testing purposes, I tried a setting with 26GB of shared_buffers. I quickly noticed that the performance wasn't very good and the server started to swap slowly but surely (but still up to 2000 queries/second as reported by pgfouine). It used all the 2GB of swap. I removed the server from production, added 10GB of swap, and left it for the weekend with only slony and postgresql up to keep it in sync with the master database. This morning I found that the whole 12GB of swap were used :

Hm, do you really need swap with 32GB of RAM ? One could argue "yes, but swap is useful to avoid out of memory errors". But if a loaded server starts to swap a lot, it is as good as dead anyway...
Re: [PERFORM] performance for high-volume log insertion
Blocking round trips to another process on the same server should be fairly cheap--that is, writing to a socket (or pipe, or localhost TCP connection) where the other side is listening for it, and then blocking in return for the response. The act of writing to an FD that another process is waiting for will make the kernel mark the process as "ready to wake up" immediately, and the act of blocking for the response will kick the scheduler to some waiting process, so as long as there isn't something else competing for the CPU, each write/read will wake up the other process instantly. There's a task switching cost, but that's too small to be relevant here. Doing 1,000,000 local round trips, over a pipe: 5.25s (5 *microseconds* each), code attached. The cost *should* be essentially identical for any local transport (pipes, named pipes, local TCP connections), since the underlying scheduler mechanisms are the same.

Round trips can be quite fast, but they have a hidden problem, which is that everything gets serialized. This means that if you have a process that generates data to insert, and a postgres process, and 2 cores on your CPU, you will never use more than 1 core, because both are waiting on each other. Pipelining is a way to solve this... In the ideal case, if postgres is as fast as the data-generating process, each would use 1 core, yielding a 2x speedup. Of course if one of the processes is like 10x faster than the other, it doesn't matter.
Re: [PERFORM] difficulties with time based queries
What can I do to prevent the index from getting bloated, or in whatever state it was in? What else can I do to further improve queries on this table? Someone suggested posting details of my conf file. Which settings are most likely to be useful for this?

If you often do range queries on date, consider partitioning your table by date (something like 1 partition per month). Of course, if you also often do range queries on something other than date, and uncorrelated, forget it.

If you make a lot of big aggregate queries, consider materialized views. Like "how many games player X won this week", etc :
- create "helper" tables which contain the query results
- every night, recompute the results taking into account the most recent data
- don't recompute results based on old data that never changes

This is only interesting if the aggregation reduces the data volume by an appreciable amount. For instance, if you run a supermarket with 1000 distinct products in stock and you sell 100.000 items a day, keeping a cache of "count of product X sold each day" will reduce your data load by about 100 on the query "count of product X sold this month".

The two suggestions above are not mutually exclusive.

You could try bizgres also. Or even MySQL !... MySQL's query engine is slower than pg, but the tables take much less space than Postgres, and it can do index-only queries. So you can fit more in the cache. This is only valid for MyISAM (InnoDB is a bloated hog). Of course, no one would want to use MyISAM for the "safe" storage, but it's pretty good as a read-only storage. You can even use the Archive format for even more compactness and use of cache. Of course you'd have to devise a way to dump from pg and load into MySQL, but that's not hard. MySQL can be good if you target a table with lots of small rows with a few ints, all of them in a multicolumn index, so it doesn't need to hit the table itself.
Note that no one in his right mind would run aggregate queries on a live R/W MyISAM table, since the long queries will block all writes and blow up the reaction time. But for a read-only cache updated at night, or a replication slave, it's okay.
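To make the "helper tables" idea concrete, here is a minimal sketch using the supermarket example above (all table and column names are hypothetical):

```sql
-- Cache of "count of product X sold each day"; old days never change.
CREATE TABLE daily_product_sales (
    day        date    NOT NULL,
    product_id integer NOT NULL,
    items_sold bigint  NOT NULL,
    PRIMARY KEY (day, product_id)
);

-- Nightly job : recompute yesterday only, append to the cache.
INSERT INTO daily_product_sales (day, product_id, items_sold)
SELECT sale_date, product_id, count(*)
FROM sales
WHERE sale_date = current_date - 1
GROUP BY sale_date, product_id;
```

A monthly total per product then scans about 30 cached rows instead of the raw sales rows, which is where the ~100x data reduction comes from.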
Re: [PERFORM] difficulties with time based queries
When I try to do queries on this table I always find them slower than what I need and what I believe should be possible.

-> Bitmap Index Scan on ad_log_date_all (cost=0.00..72750.51 rows=2488252 width=0) (actual time=49776.332..49776.332 rows=2268490 loops=1)
   Index Cond: ((date(start_time) < '2009-03-31'::date) AND (date(start_time) >= '2009-03-30'::date))
Total runtime: 65279.352 ms

Well, it is grabbing 2.268.490 rows; that's a lot of rows, so it is not going to take just a few milliseconds. Your columns are small (ints, dates, not large text strings which would increase the total amount of data). Still, your timing looks pretty slow; it should be faster than this, maybe a few seconds.

With this quantity of rows, you want to make the disk accesses as linear as possible. This means your table should be organized on disk by date, at least roughly. If your data comes from an import that was sorted on some other column, this may not be the case.

What kind of bytes/s do you get from the drives ? => Can you post the result of "vmstat 1" during the entire execution of the query ? 2 phases should be visible in the vmstat output : the index scan, and the bitmap heap scan.

You could use CLUSTER on the table (it will take a long time), or simply create another table and INSERT INTO ... SELECT ... ORDER BY date. This will also take a long time, but faster than CLUSTER. Then you could recreate the indexes.

Do you UPDATE or DELETE a lot from this table ? Is it vacuum'd enough ?
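A sketch of the CLUSTER option, using the index name from the plan above (the table name ad_log is an assumption based on that index name):

```sql
-- Rewrites the whole table in index order; takes an exclusive lock
-- for the duration, so run it during a maintenance window.
CLUSTER ad_log USING ad_log_date_all;
ANALYZE ad_log;
```

CLUSTER ... USING is the 8.3+ syntax; on older versions it is written `CLUSTER ad_log_date_all ON ad_log`. Note CLUSTER is a one-shot reorganization: new rows are not kept in order, so repeat it periodically if the table keeps growing.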
Re: [PERFORM] PostgreSQL block size for SSD RAID setup?
Hi, I was reading a benchmark that sets out block sizes against raw IO performance for a number of different RAID configurations involving high end SSDs (the Mtron 7535) on a powerful RAID controller (the Areca 1680IX with 4GB RAM). See http://jdevelopment.nl/hardware/one-dvd-per-second/ Lucky guys ;) Something that bothers me about SSDs is the interface... The latest flash chips from Micron (32Gb = 4GB per chip) have something like 25 us "access time" (lol) and push data at 166 MB/s (yes megabytes per second) per chip. So two of these chips are enough to bottleneck a SATA 3Gbps link... there would be 8 of those chips in a 32GB SSD. Parallelizing would depend on the block size : putting all chips in parallel would increase the block size, so in practice I don't know how it's implemented, probably depends on the make and model of SSD. And then RAIDing those (to get back the lost throughput from using SATA) will again increase the block size which is bad for random writes. So it's a bit of a chicken and egg problem. Also since harddisks have high throughput but slow seeks, all the OS'es and RAID cards, drivers, etc are probably optimized for throughput, not IOPS. You need a very different strategy for 100K/s 8kbyte IOs versus 1K/s 1MByte IOs. Like huge queues, smarter hardware, etc. FusionIO got an interesting product by using the PCI-e interface which brings lots of benefits like much higher throughput and the possibility of using custom drivers optimized for handling much more IO requests per second than what the OS and RAID cards, and even SATA protocol, were designed for. Intrigued by this I looked at the FusionIO benchmarks : more than 100.000 IOPS, really mindboggling, but in random access over a 10MB file. A little bit of google image search reveals the board contains a lot of Flash chips (expected) and a fat FPGA (expected) probably a high-end chip from X or A, and two DDR RAM chips from Samsung, probably acting as cache. 
So I wonder if the 10 MB file used as benchmark to reach those humongous IOPS was actually in the Flash ?... or did they actually benchmark the device's onboard cache ?... It probably has writeback cache so on a random writes benchmark this is an interesting question. A good RAID card with BBU cache would have the same benchmarking gotcha (ie if you go crazy on random writes on a 10 MB file which is very small, and the device is smart, possibly at the end of the benchmark nothing at all was written to the disks !) Anyway in a database use case if random writes are going to be a pain they are probably not going to be distributed in a tiny 10MB zone which the controller cache would handle... (just rambling XDD)
Re: [PERFORM] TCP network cost
python w/ psycopg (or psycopg2), which wraps libpq. Same results w/ either version.

I've seen psycopg2 saturate a 100 Mbps ethernet connection (direct connection with crossover cable) between postgres server and client during a benchmark... I had to change the benchmark to not retrieve a large TEXT column to remove this bottleneck... This was last year so versions are probably different, but I don't think this matters a lot...

Note the 'bare' transfer times added above. Nothing to write home about (~3Mb/sec) but another order of magnitude faster than the postgresql transfer.

You should test by sending a large (>100 MB) amount of data through netcat. This should give you your maximum wire speed. Use /dev/zero as the data source, /dev/null as the sink, and use "pv" (pipe viewer) to measure throughput :

box 1 : pv < /dev/zero | nc -lp 12345
box 2 : nc (ip) 12345 >/dev/null

On gigabit LAN you should get about 100 MB/s, on 100BaseT about 10 MB/s. If you don't get that, there is a problem somewhere (bad cable, bad NIC, slow switch/router, etc). Monitor CPU during this test (vmstat); usage should be low.
Re: [PERFORM] analyzing intermediate query
My list can contain 1 .. 10 records and table contains 300 records and is growing.

Ah. No IN(), then ;) Temp table + ANALYZE seems your only option... In 8.3 or 8.4 I think that IN() or temp table produce exactly the same result.

Andrus.

Oh, I just thought about something. I don't remember in which version it was added, but :

EXPLAIN ANALYZE SELECT sum(column1) FROM (VALUES ...a million integers... ) AS v

Postgres is perfectly happy with that; it's either a bit slow (about 1 second) or very fast depending on how you view things...

Aggregate (cost=15000.00..15000.01 rows=1 width=4) (actual time=1060.253..1060.253 rows=1 loops=1)
  -> Values Scan on "*VALUES*" (cost=0.00..12500.00 rows=1000000 width=4) (actual time=0.009..634.728 rows=1000000 loops=1)
Total runtime: 1091.420 ms

The most interesting thing, of course, is that the statistics are exact. You can use VALUES like a table (join, whatever). Of course it's always slightly annoying to juggle around with result sets and stuff them in comma-separated strings, but it works.

Here it knows there are few rows ===> nested loop

EXPLAIN SELECT a.* FROM annonces a JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7)) AS v ON (a.id=v.column1);
QUERY PLAN
Nested Loop (cost=0.00..66.73 rows=8 width=943)
  -> Values Scan on "*VALUES*" (cost=0.00..0.10 rows=8 width=4)
  -> Index Scan using annonces_pkey on annonces a (cost=0.00..8.32 rows=1 width=943)
     Index Cond: (a.id = "*VALUES*".column1)

With a million values it goes hash of course, etc.
Re: [PERFORM] analyzing intermediate query
I noticed that query

SELECT dok.* FROM dok JOIN (SELECT DISTINCT dokumnr FROM temptbl ) x USING(dokumnr);

is slow in 8.1.4. I cannot use explain analyze since this query uses results from temporary table temptbl which is not available.

Generally, if you know your temptbl will always contain a few rows (say, generally a few and never more than a few thousands), it is better to use something like this :
- get the list of items
- SELECT * FROM table WHERE id IN (...)

Of course you must be pretty damn sure that the list isn't gonna contain 10 million items. Or else you'll have a little problem. But it generally works pretty well. The overhead of generating and parsing the IN() is lower than the overhead of temp tables...

By the way, some time ago there was talk about adding estimation of the number of rows returned to set-returning functions. What's the status of this ? It doesn't seem to have survived...

8.3> EXPLAIN SELECT * FROM generate_series( 1,10 );
QUERY PLAN
Function Scan on generate_series (cost=0.00..12.50 rows=1000 width=4)

Sometimes the inner table returns only 1 row, so maybe a seq scan is selected instead of single-row index access because the expected count is 1000.

As I understand, PostgreSQL requires manually running ANALYZE on temporary tables if their row count is different from 1000. How can I force PostgreSQL to analyze the inner table in this query, or use another way to get an index-using query plan when the inner query returns a single row ?
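A minimal sketch of the temp table + ANALYZE variant, using the names from the query above (the dokumnr type and the literal values are assumptions):

```sql
-- Load the id list into a temp table and ANALYZE it, so the planner
-- sees the real row count instead of the default 1000-row guess.
CREATE TEMP TABLE ids (dokumnr int PRIMARY KEY);
INSERT INTO ids VALUES (1), (2), (3);   -- your 1..10 ids
ANALYZE ids;

SELECT dok.* FROM dok JOIN ids USING (dokumnr);
```

With accurate statistics on the inner table, a handful of ids should produce a nested loop over the index on dok rather than a seq scan.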
Re: [PERFORM] Sort causes system to freeze
Maybe this is an obviously dumb thing to do, but it looked reasonable to me. Looks reasonable here too - except I'm not sure what I'd do with 2 million rows of sorted table in my console. I'm guessing you're piping the output into something. Probably it's psql that is choking from buffering the rows. If you want to fetch that huge amount of data into a user application, a CURSOR is the best way to do so.
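For reference, a minimal sketch of the CURSOR approach (cursor, table, and column names are placeholders):

```sql
BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM bigtable ORDER BY some_col;
FETCH 1000 FROM big_cur;   -- process this batch in the application
FETCH 1000 FROM big_cur;   -- repeat until FETCH returns no rows
CLOSE big_cur;
COMMIT;
```

The server streams rows in batches, so neither psql nor the client application ever has to buffer all 2 million rows at once.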
Re: [PERFORM] Query optimization
I am struggling with the following query, which fetches a random subset of 200 questions that match certain tags within certain languages. However, the query takes forever to evaluate, even though I have a "limit 200" appended. Any ideas on how to optimize it?

QUERY : SELECT distinct q.question_id FROM question_tags qt, questions q WHERE q.question_id = qt.question_id AND q.STATUS = 1 AND not q.author_id = 105 AND ((qt.language_id = 5 and qt.tag_id in (1101,917,734,550,367,183)) or (qt.language_id = 4 and qt.tag_id in (856,428)) or (qt.language_id = 3 and qt.tag_id in (1156,1028,899,771,642,514,385,257,128)) or (qt.language_id = 2 and qt.tag_id in (1193,1101,1009,917,826,734,642,550,458,367,275,183,91))) and q.question_id not in (413) LIMIT 200

EXPLAIN ANALYZE :
Limit (cost=1.50..1267.27 rows=200 width=4) (actual time=278.169..880.934 rows=200 loops=1)
  -> Unique (cost=1.50..317614.50 rows=50185 width=4) (actual time=278.165..880.843 rows=200 loops=1)
    -> Merge Join (cost=1.50..317489.04 rows=50185 width=4) (actual time=278.162..880.579 rows=441 loops=1)
      Merge Cond: (qt.question_id = q.question_id)
      -> Index Scan using question_tags_question_id on question_tags qt (cost=0.00..301256.96 rows=82051 width=4) (actual time=24.171..146.811 rows=6067 loops=1)
        Filter: (((language_id = 5) AND (tag_id = ANY ('{1101,917,734,550,367,183}'::integer[]))) OR ((language_id = 4) AND (tag_id = ANY ('{856,428}'::integer[]))) OR ((language_id = 3) AND (tag_id = ANY ('{1156,1028,899,771,642,514,385,257,128}'::integer[]))) OR ((language_id = 2) AND (tag_id = ANY ('{1193,1101,1009,917,826,734,642,550,458,367,275,183,91}'::integer[]))))
      -> Index Scan using questions_pkey on questions q (cost=0.00..15464.12 rows=83488 width=4) (actual time=222.956..731.737 rows=1000 loops=1)
        Filter: ((q.author_id <> 105) AND (q.question_id <> 413) AND (q.status = 1))
Total runtime: 881.152 ms
(9 rows)

An index on (language_id, tag_id) should be the first thing to try.
Or perhaps even (status, language_id, tag_id) or (language_id, tag_id, status) (but that depends on the stats of the "status" column). An index on author_id will probably not be useful for this particular query, because your condition is "author_id != constant". Also consider CLUSTERing question_tags on (language_id, tag_id). What is the database size versus RAM ? You must have a hell of a lot of questions to make this slow... (or bloat ?)
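The first suggestion spelled out, using the table and column names from the plan above (the index name is made up):

```sql
-- Lets the planner probe one (language_id, tag_id) pair at a time
-- instead of filtering the whole question_tags index:
CREATE INDEX question_tags_lang_tag ON question_tags (language_id, tag_id);

-- Optionally reorder the table on the same key for locality (8.3+ syntax):
CLUSTER question_tags USING question_tags_lang_tag;
ANALYZE question_tags;
```

With this index, each `(language_id = N AND tag_id IN (...))` branch of the OR can become a cheap bitmap index scan, and the bitmaps are OR'ed together.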
Re: [PERFORM] performance tuning queries
First off, any thoughts on tuning inserts into large tables? I have a large table with an insert like this:

insert into public.bigtab1 (text_col1, text_col2, id) values ...

QUERY PLAN
Result (cost=0.00..0.01 rows=1 width=0)
(1 row)

The query cost is low, but this is one of the slowest statements per pgfouine.

Possible causes of slow inserts :
- slow triggers ?
- slow foreign key checks ? (missing index on referenced table ?)
- functional index on a slow function ?
- crummy hardware (5 MB/s RAID cards, etc)
- too many indexes ?

Next we have a select count(*) that is also one of the top offenders:

select count(*) from public.tab3 where user_id=31 and state='A' and amount>0;

QUERY PLAN
Aggregate (cost=3836.53..3836.54 rows=1 width=0)
  -> Index Scan using order_user_indx on tab3 (cost=0.00..3834.29 rows=897 width=0)
     Index Cond: (idx_user_id = 31406948::numeric)
     Filter: ((state = 'A'::bpchar) AND (amount > 0::numeric))
(4 rows)

We have an index on the user_id but not on the state or amount; add an index on amount ?

Can we see EXPLAIN ANALYZE ? In this case the ideal index would be multicolumn, (user_id, state) or (user_id, amount) or (user_id, state, amount), but choosing between the 3 depends on your data...

You could do :

SELECT count(*), state, amount>0 FROM public.tab3 WHERE user_id=31 GROUP BY state, amount>0;

And post the results.
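Two candidate indexes for that count, sketched with made-up names (pick based on the GROUP BY results above):

```sql
-- Multicolumn index covering the two most selective filters:
CREATE INDEX tab3_user_state ON public.tab3 (user_id, state);

-- Or, if the (state = 'A', amount > 0) combination is rare, a partial
-- index is smaller and only pays maintenance cost for matching rows:
CREATE INDEX tab3_user_active ON public.tab3 (user_id)
    WHERE state = 'A' AND amount > 0;
```

The partial index turns the whole WHERE clause into an index condition, so the count touches only the rows it actually returns.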
Re: [PERFORM] limit clause produces wrong query plan
SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 100 LIMIT 100

I think pagination is overrated. If the query produces, for instance, something like 100 rows or less, more often than not, getting all the rows will take the exact same time as getting a portion of the rows... in all those cases, it is much better to cache the results somewhere (user session, table, whatever) and paginate based on that, rather than perform the same query lots of times. Especially when some non-indexed sorting takes place, in which case you are gonna fetch all the rows anyway. Something like row-id can be stored instead of the full rows, also. There are exceptions of course. And if the query produces 20.000 results... who is ever going to scroll to page 1257 ?

The example which I posted shows that

SELECT ... FROM bigtable ORDER BY intprimarykey LIMIT 100

this is extremely *slow*: seq scan is performed over whole bigtable.

This is wrong though. It should use an index, especially if you have a LIMIT...
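When pagination really is needed, the "store a row-id" idea above can be turned into keyset (seek) pagination. A hedged sketch, assuming intprimarykey is the indexed ordering column and :last_seen_id is remembered by the application from the previous page:

```sql
-- Instead of OFFSET, seek directly to where the last page ended:
SELECT *
  FROM bigtable
 WHERE intprimarykey > :last_seen_id
 ORDER BY intprimarykey
 LIMIT 100;
```

Unlike OFFSET, this stays fast on late pages, because the index scan starts at the right place rather than counting and discarding all earlier rows.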
Re: [PERFORM] Perc 3 DC
Not yet no, but that's a good suggestion and I do intend to give it a whirl. I get about 27MB/s from raid 1 (10 is about the same) so hopefully I can up the throughput to the speed of about one disk with sw raid.

FYI I get more than 200 MB/s out of a Linux Software RAID5 of 3 SATA drives (the new Samsung Spinpoints...) (Intel ICH8 chipset, Core 2 Duo).
Re: [PERFORM] Hash join on int takes 8..114 seconds
Thank you very much for the great sample. I tried to create a testcase from this to match the production db:

1.2 million orders
3.5 million order details
13400 products with char(20) as primary keys containing ean-13 codes mostly
3 last year data
every order has usually 1..3 detail lines
same product can appear multiple times in order
products are queried by start of code

This sample does not distribute products randomly between orders. How to change this so that every order contains 3 (or 1..6) random products? I tried to use the random row sample from http://www.pgsql.cz/index.php/PostgreSQL_SQL_Tricks-i but in this case a constant product is returned always. It seems that a query containing random() is executed only once.

You could try writing a plpgsql function which would generate the data set. Or you could use your existing data set.

By the way, a simple way to de-bloat your big table without blocking would be this :
- stop all inserts and updates
- begin
- create table new like old table
- insert into new select * from old (order by perhaps)
- create indexes
- rename new into old
- commit

If this is just a reporting database where you insert a batch of new data every day, for instance, that's very easy to do. If it's OLTP, then, no.
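The de-bloat recipe above can be sketched as plain SQL; the table name big_table and the ordering/index column some_col are hypothetical placeholders:

```sql
-- All inserts and updates must be stopped while this runs.
BEGIN;
CREATE TABLE big_table_new (LIKE big_table INCLUDING DEFAULTS);
-- Optionally ORDER BY to get a nicely clustered rebuild:
INSERT INTO big_table_new SELECT * FROM big_table ORDER BY some_col;
CREATE INDEX big_table_new_some_col_idx ON big_table_new (some_col);
DROP TABLE big_table;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;
```

The rebuilt table contains no dead rows and its indexes are freshly packed, which is what a blocking VACUUM FULL + REINDEX would achieve, minus the long exclusive lock on the live table.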
Re: [PERFORM] Hash join on int takes 8..114 seconds
On Fri, 21 Nov 2008 21:07:02 +0100, Tom Lane <[EMAIL PROTECTED]> wrote: PFC <[EMAIL PROTECTED]> writes: Index on orders_products( product_id ) and orders_products( order_id ): => Same plan Note that in this case, a smarter planner would use the new index to perform a BitmapAnd before hitting the heap to get the rows. Considering that the query has no constraint on orders_products.order_id, I'm not sure what you think the extra index is supposed to be used *for*. (Well, we could put orders as the outside of a nestloop and then we'd have such a constraint, but with 3 orders rows to process that plan would lose big.) (And yes, the planner did consider such a plan along the way. See choose_bitmap_and.) regards, tom lane I think I didn't express myself correctly... Here the indexes are small (therefore well cached) but the orders_products table is large (and not cached). To reproduce this, I put this table on a crummy slow external USB drive. Between each of the following queries, pg was stopped, the USB drive unmounted, remounted, and pg restarted, to purge orders_products table out of all caches. I also modified the statistical distribution (see init script at bottom of message). 
EXPLAIN ANALYZE SELECT count(*) FROM orders JOIN orders_products USING (order_id) WHERE orders.order_date BETWEEN '2000-01-01' AND '2000-02-01' AND orders_products.product_id = 2345;

QUERY PLAN
-
 Aggregate (cost=5431.93..5431.94 rows=1 width=0) (actual time=5176.382..5176.382 rows=1 loops=1)
   -> Hash Join (cost=1575.13..5431.84 rows=35 width=0) (actual time=62.634..5176.332 rows=36 loops=1)
         Hash Cond: (orders_products.order_id = orders.order_id)
         -> Bitmap Heap Scan on orders_products (cost=21.27..3864.85 rows=1023 width=4) (actual time=7.041..5118.512 rows=1004 loops=1)
               Recheck Cond: (product_id = 2345)
               -> Bitmap Index Scan on orders_products_product_order (cost=0.00..21.02 rows=1023 width=0) (actual time=0.531..0.531 rows=1004 loops=1)
                     Index Cond: (product_id = 2345)
         -> Hash (cost=1130.58..1130.58 rows=33862 width=4) (actual time=55.526..55.526 rows=31999 loops=1)
               -> Index Scan using orders_date on orders (cost=0.00..1130.58 rows=33862 width=4) (actual time=0.139..33.466 rows=31999 loops=1)
                     Index Cond: ((order_date >= '2000-01-01'::date) AND (order_date <= '2000-02-01'::date))
 Total runtime: 5176.659 ms

This is the original query ; what I don't like about it is that it bitmapscans orders_products way too much, because it reads all orders for the specified product, not just orders in the date period we want. However, since Postgres scanned all order_id's corresponding to the date range already, to build the hash, the list of order_ids of interest is known at no extra cost.
In this case, additionally, correlation is 100% between order_id and date, so I can do :

test=> SELECT max(order_id), min(order_id) FROM orders WHERE order_date BETWEEN '2000-01-01' AND '2000-02-01';
  max  | min
-------+-----
 31999 |   1

And I can add an extra condition to the query, like this :

EXPLAIN ANALYZE SELECT count(*) FROM orders JOIN orders_products USING (order_id) WHERE orders.order_date BETWEEN '2000-01-01' AND '2000-02-01' AND orders_products.order_id BETWEEN 1 AND 31999 AND orders_products.product_id = 2345;

QUERY PLAN
-
 Aggregate (cost=426.80..426.81 rows=1 width=0) (actual time=179.233..179.233 rows=1 loops=1)
   -> Nested Loop (cost=0.00..426.79 rows=1 width=0) (actual time=6.667..179.190 rows=36 loops=1)
         -> Index Scan using orders_products_product_order on orders_products (cost=0.00..142.11 rows=34 width=4) (actual time=6.559..177.597 rows=36 loops=1)
               Index Cond: ((product_id = 2345) AND (order_id >= 1) AND (order_id <= 31999))
         -> Index Scan using orders_pkey on orders (cost=0.00..8.36 rows=1 width=4) (actual time=0.039..0.041 rows=1 loops=36)
               Index Cond: (orders.order_id = orders_products.order_id)
               Filter: ((orders.order_date >= '2000-01-01'::date) AND (orders.order_date <= '2000-02-01'::date))
 Total runtime: 179.392 ms

This is with no cache.
Re: [PERFORM] Hash join on int takes 8..114 seconds
log file seems that mostly only those queries are slow:

SELECT ... FROM dok JOIN rid USING (dokumnr) JOIN ProductId USING (ProductId) WHERE rid.ProductId LIKE :p1 || '%' AND dok.SaleDate >= :p2

:p1 and :p2 are parameters, different for different queries. dok contains several years of data. :p2 is usually only a few previous months or up to a year ago. The SELECT column list contains a fixed list of known columns from all tables. How to create an index or materialized view to optimize this type of query ?

I would remove some granularity, for instance create a summary table (materialized view) by month :
- date (contains the first day of the month)
- product_id
- total quantity, total price sold in given month

You get the idea. If your products belong to categories, and you make queries on all the products in a category, it could be worth making a summary table for categories also.
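The monthly summary table described above might be built like this; the quantity and price columns on rid are hypothetical placeholders, and the dok/rid join key (dokumnr) is taken from elsewhere in the thread:

```sql
-- One row per (month, product), pre-aggregated once instead of
-- re-scanning detail rows on every report query:
CREATE TABLE sales_by_month AS
SELECT date_trunc('month', dok.SaleDate)::date AS month,
       rid.ProductId                           AS product_id,
       sum(rid.quantity)                       AS total_quantity,
       sum(rid.price)                          AS total_price
  FROM dok JOIN rid USING (dokumnr)
 GROUP BY 1, 2;

CREATE INDEX sales_by_month_idx ON sales_by_month (product_id, month);
```

Refresh it periodically (nightly, for instance), either by truncating and re-filling, or by inserting only the current month's rows.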
Re: [PERFORM] Hash join on int takes 8..114 seconds
Server has 2 GB RAM. It has SATA RAID 0,1 integrated controller (1.5Gbps) and SAMSUNG HD160JJ mirrored disks.

You could perhaps run a little check on the performance of the RAID: is it better than Linux software RAID ? Does it leverage NCQ appropriately when running queries in parallel ?

-- Receipt headers:
DOK (
    dokumnr INT SERIAL PRIMARY KEY,
    kuupaev DATE --- sales date
)

-- Receipt details
RID (
    dokumnr INT,
    toode CHAR(20), -- item code
    CONSTRAINT rid_dokumnr_fkey FOREIGN KEY (dokumnr) REFERENCES dok (dokumnr),
    CONSTRAINT rid_toode_fkey FOREIGN KEY (toode) REFERENCES firma2.toode (toode)
)

-- Products
TOODE (
    toode CHAR(20) PRIMARY KEY
)

OK so pretty straightforward : dok <-(dokumnr)-> rid <-(toode)-> toode

toode.toode should really be an INT though.

explain analyze SELECT sum(1) FROM dok JOIN rid USING (dokumnr) JOIN toode USING (toode) LEFT JOIN artliik USING (grupp,liik) WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'

By the way, note that the presence of the toode table in the query above is not required at all, unless you use columns of toode in your aggregates. Let's play with that, after all, it's friday night.
BEGIN;
CREATE TABLE orders (order_id INTEGER NOT NULL, order_date DATE NOT NULL);
CREATE TABLE products (product_id INTEGER NOT NULL, product_name TEXT NOT NULL);
CREATE TABLE orders_products (order_id INTEGER NOT NULL, product_id INTEGER NOT NULL, padding1 TEXT, padding2 TEXT);
INSERT INTO products SELECT n, 'product number ' || n::TEXT FROM generate_series(1,4) AS n;
INSERT INTO orders SELECT n, '2000-01-01'::date + (n/1000 * '1 DAY'::interval) FROM generate_series(1,100) AS n;
SET work_mem TO '1GB';
INSERT INTO orders_products SELECT a,b,'aibaifbaurgbyioubyfazierugybfoaybofauez', 'hfohbdsqbhjhqsvdfiuazvfgiurvgazrhbazboifhaoifh' FROM (SELECT DISTINCT (1+(n/10))::INTEGER AS a, (1+(random()*3))::INTEGER AS b FROM generate_series( 1,999 ) AS n) AS x;
DELETE FROM orders_products WHERE product_id NOT IN (SELECT product_id FROM products);
DELETE FROM orders_products WHERE order_id NOT IN (SELECT order_id FROM orders);
ALTER TABLE orders ADD PRIMARY KEY (order_id);
ALTER TABLE products ADD PRIMARY KEY (product_id);
ALTER TABLE orders_products ADD PRIMARY KEY (order_id,product_id);
ALTER TABLE orders_products ADD FOREIGN KEY (product_id) REFERENCES products( product_id ) ON DELETE CASCADE;
ALTER TABLE orders_products ADD FOREIGN KEY (order_id) REFERENCES orders( order_id ) ON DELETE CASCADE;
CREATE INDEX orders_date ON orders( order_date );
COMMIT;
SET work_mem TO DEFAULT;
ANALYZE;

With the following query :

EXPLAIN ANALYZE SELECT sum(1) FROM orders JOIN orders_products USING (order_id) JOIN products USING (product_id) WHERE orders.order_date BETWEEN '2000-01-01' AND '2000-02-01' AND products.product_id = 12345;

I get the following results :

orders_products has a PK index on (order_id, product_id). I dropped it.
No index on orders_products :
=> Big seq scan (16 seconds)

Index on orders_products( product_id ) :

 Aggregate (cost=2227.22..2227.23 rows=1 width=0) (actual time=108.204..108.205 rows=1 loops=1)
   -> Nested Loop (cost=1312.30..2227.20 rows=7 width=0) (actual time=105.929..108.191 rows=6 loops=1)
         -> Index Scan using products_pkey on products (cost=0.00..8.27 rows=1 width=4) (actual time=0.010..0.014 rows=1 loops=1)
               Index Cond: (product_id = 12345)
         -> Hash Join (cost=1312.30..2218.85 rows=7 width=4) (actual time=105.914..108.167 rows=6 loops=1)
               Hash Cond: (orders_products.order_id = orders.order_id)
               -> Bitmap Heap Scan on orders_products (cost=6.93..910.80 rows=232 width=8) (actual time=0.194..2.175 rows=246 loops=1)
                     Recheck Cond: (product_id = 12345)
                     -> Bitmap Index Scan on orders_products_product_id (cost=0.00..6.87 rows=232 width=0) (actual time=0.129..0.129 rows=246 loops=1)
                           Index Cond: (product_id = 12345)
               -> Hash (cost=949.98..949.98 rows=28432 width=4) (actual time=105.696..105.696 rows=31999 loops=1)
                     -> Index Scan using orders_date on orders (cost=0.00..949.98 rows=28432 width=4) (actual time=0.059..64.443 rows=31999 loops=1)
                           Index Cond: ((order_date >= '2000-01-01'::date) AND (order_date <= '2000-02-01'::date))
 Total runtime: 108.357 ms

(don't trust this timing, it's a bit cached, this is the same plan as you get)

Index on orders_products( product_id ) and orders_products( order_id ):
=> Same plan

Note that in this case, a smarter planner would use the new index to perform a BitmapAnd before hitting the heap to get the rows.

Index on ( order_id, product_id ), orders_products( product_id ):
Index on ( order_id, product_id ):
=> Different plan, slower (especially
Re: [PERFORM] Hash join on int takes 8..114 seconds
How to vacuum full pg_shdepend automatically so that other users can work at same time ?

Your table is horribly bloated. You must use VACUUM FULL + REINDEX (as superuser) on it; unfortunately, it is blocking, so you should wait for Sunday night to do this, when no one will notice. Meanwhile, you can always VACUUM it (as superuser) and REINDEX it. And while you're at it, VACUUM FULL + REINDEX the entire database.

To avoid such annoyances in the future, you should ensure that autovacuum runs properly; you should investigate this. If you use a cron'ed VACUUM that does not run as superuser, then it will not be able to VACUUM the system catalogs, and the problem will come back.
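The maintenance above boils down to a few commands; run them as superuser, and note that the first two take exclusive locks, so schedule them off-hours:

```sql
-- Blocking: rewrites the catalog table and rebuilds its indexes.
VACUUM FULL pg_catalog.pg_shdepend;
REINDEX TABLE pg_catalog.pg_shdepend;

-- Non-blocking: a plain VACUUM can be run any time to keep the
-- bloat from growing until the next maintenance window.
VACUUM pg_catalog.pg_shdepend;
```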
Re: [PERFORM] Hash join on int takes 8..114 seconds
OK so vmstat says you are IO-bound, this seems logical if the same plan has widely varying timings...

Let's look at the usual suspects :
- how many dead rows in your tables ? are your tables data, or bloat ? (check vacuum verbose, etc)
- what's the size of the dataset relative to the RAM ?

Now let's look more closely at the query :

explain analyze SELECT sum(1) FROM dok JOIN rid USING (dokumnr) JOIN toode USING (toode) LEFT JOIN artliik USING (grupp,liik) WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'

OK, so artliik is a very small table (84 rows) :

 Seq Scan on artliik (cost=0.00..6.84 rows=84 width=19) (actual time=20.104..29.845 rows=84 loops=1)

I presume doing the query without artliik changes nothing to the runtime, yes ?

Let's look at the main part of the query :

FROM dok JOIN rid USING (dokumnr) JOIN toode USING (toode) WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'

Postgres's plan is logical. It starts by joining rid and dok since your WHERE is on those :

 -> Hash Join (cost=52103.94..233488.08 rows=24126 width=24) (actual time=100386.921..114037.986 rows=20588 loops=1)
       Hash Cond: ("outer".dokumnr = "inner".dokumnr)
       -> Bitmap Heap Scan on rid (cost=4127.51..175020.84 rows=317003 width=28) (actual time=9.932..76225.918 rows=277294 loops=1)
             Recheck Cond: (toode = 'X05'::bpchar)
             -> Bitmap Index Scan on rid_toode_idx (cost=0.00..4127.51 rows=317003 width=0) (actual time=11105.807..11105.807 rows=280599 loops=1)
                   Index Cond: (toode = 'X05'::bpchar)
       -> Hash (cost=47376.82..47376.82 rows=93444 width=4) (actual time=35082.427..35082.427 rows=105202 loops=1)
             -> Index Scan using dok_kuupaev_idx on dok (cost=0.00..47376.82 rows=93444 width=4) (actual time=42.110..34586.331 rows=105202 loops=1)
                   Index Cond: (kuupaev >= '2008-09-01'::date)

Your problem here is that, no matter what, postgres will have to examine
- all rows where dok.kuupaev>='2008-09-01',
- and all rows where rid.toode = 'X05'.
If you use dok.kuupaev>='2007-09-01' (note : 2007) it will probably have to scan many, many more rows. If you perform this query often you could CLUSTER rid on (toode) and dok on (kuupaev), but this can screw other queries. What is the meaning of the columns ?

To make this type of query faster I would tend to think about :
- materialized views
- denormalization (ie adding a column in one of your tables and a multicolumn index)
- materialized summary tables (ie. summary of sales for last month, for instance)

 Aggregate (cost=234278.53..234278.54 rows=1 width=0) (actual time=114479.933..114479.936 rows=1 loops=1)
   -> Hash Left Join (cost=52111.20..234218.21 rows=24126 width=0) (actual time=100435.523..114403.293 rows=20588 loops=1)
         Hash Cond: (("outer".grupp = "inner".grupp) AND ("outer".liik = "inner".liik))
         -> Nested Loop (cost=52103.94..233735.35 rows=24126 width=19) (actual time=100405.258..114207.387 rows=20588 loops=1)
               -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1 width=43) (actual time=18.312..18.325 rows=1 loops=1)
                     Index Cond: ('X05'::bpchar = toode)
               -> Hash Join (cost=52103.94..233488.08 rows=24126 width=24) (actual time=100386.921..114037.986 rows=20588 loops=1)
                     Hash Cond: ("outer".dokumnr = "inner".dokumnr)
                     -> Bitmap Heap Scan on rid (cost=4127.51..175020.84 rows=317003 width=28) (actual time=9.932..76225.918 rows=277294 loops=1)
                           Recheck Cond: (toode = 'X05'::bpchar)
                           -> Bitmap Index Scan on rid_toode_idx (cost=0.00..4127.51 rows=317003 width=0) (actual time=11105.807..11105.807 rows=280599 loops=1)
                                 Index Cond: (toode = 'X05'::bpchar)
                     -> Hash (cost=47376.82..47376.82 rows=93444 width=4) (actual time=35082.427..35082.427 rows=105202 loops=1)
                           -> Index Scan using dok_kuupaev_idx on dok (cost=0.00..47376.82 rows=93444 width=4) (actual time=42.110..34586.331 rows=105202 loops=1)
                                 Index Cond: (kuupaev >= '2008-09-01'::date)
         -> Hash (cost=6.84..6.84 rows=84 width=19) (actual time=30.220..30.220 rows=84 loops=1)
               -> Seq Scan on artliik (cost=0.00..6.84 rows=84 width=19) (actual time=20.104..29.845 rows=84 loops=1)
 Total runtime: 114480.373 ms
Re: [PERFORM] Hash join on int takes 8..114 seconds
Query below seems to use indexes everywhere in most optimal way. dokumnr column is of type int. Speed of this query varies rapidly: in live db the fastest response I have got is 8 seconds. Re-running the same query after 10 seconds may take 60 seconds. Re-running it again after 10 seconds may take 114 seconds. Any idea how to speed it up ? Is it possible to optimize it, will upgrading to 8.3.5 help, or should I add more RAM, disk or CPU speed ? Real query contains a column list instead of the sum(1) used in the test below. Andrus.

Just a question, what are you doing with the 20.000 result rows ?
Re: [PERFORM] Improve Seq scan performance
OK, I see your problem. Try this :

read this : http://www.postgresql.org/docs/current/static/pgtrgm.html
locate and \i the pg_trgm.sql file

CREATE TABLE dict( s TEXT );

I loaded the english - german dictionary in a test table. I didn't parse it, so it's just a bunch of 418552 strings, english and german mixed.

test=> EXPLAIN ANALYZE SELECT * FROM dict WHERE s LIKE '%tation%';
QUERY PLAN
 Seq Scan on dict (cost=0.00..7445.90 rows=133 width=13) (actual time=0.102..217.155 rows=802 loops=1)
   Filter: (s ~~ '%tation%'::text)
 Total runtime: 217.451 ms
(3 lignes)
Temps : 217,846 ms

Since this data does not change very often, let's use a gin index.

CREATE INDEX trgm_idx ON dict USING gin (s gin_trgm_ops);

With trigrams we can search by similarity. So, we can issue this :

EXPLAIN ANALYZE SELECT s, similarity(s, 'tation') AS sml FROM dict WHERE s % 'tation' ORDER BY sml DESC, s;
QUERY PLAN
 Sort (cost=1114.44..1115.49 rows=419 width=13) (actual time=190.778..190.980 rows=500 loops=1)
   Sort Key: (similarity(s, 'tation'::text)), s
   Sort Method: quicksort Memory: 37kB
   -> Bitmap Heap Scan on dict (cost=35.80..1096.19 rows=419 width=13) (actual time=113.486..188.825 rows=500 loops=1)
         Filter: (s % 'tation'::text)
         -> Bitmap Index Scan on trgm_idx (cost=0.00..35.69 rows=419 width=0) (actual time=112.011..112.011 rows=15891 loops=1)
               Index Cond: (s % 'tation'::text)
 Total runtime: 191.189 ms

It is not much faster than the seq scan, but it can give you useful results, correct spelling errors, etc. Perhaps it's faster when data is not cached.
Sample of returns :

 taxation     | 0.6
 station      | 0.5
 tabulation   | 0.5
 taction      | 0.5
 Taxation {f} | 0.5
 Taxation {f} | 0.5

If you do not want to correct for spelling errors, you can do like this :

EXPLAIN ANALYZE SELECT s FROM dict WHERE s LIKE '%tation%' AND s % 'tation';
QUERY PLAN
 Bitmap Heap Scan on dict (cost=35.70..1096.09 rows=1 width=13) (actual time=66.583..80.980 rows=306 loops=1)
   Filter: ((s ~~ '%tation%'::text) AND (s % 'tation'::text))
   -> Bitmap Index Scan on trgm_idx (cost=0.00..35.69 rows=419 width=0) (actual time=65.799..65.799 rows=15891 loops=1)
         Index Cond: (s % 'tation'::text)
 Total runtime: 81.140 ms
(5 lignes)
Temps : 81,652 ms

In this case the trigram index is used to narrow the search, and the LIKE to get only exact matches. Careful though, it might not always match: for instance if you search "rat" you won't find "consideration", because the search string is too small.

Anyway, I would suggest to change your strategy. You could try preloading everything into an in-memory array of strings. This would be much faster. You could also try to build a list of unique words from your dictionary, which contains lots of expressions. Then, when the user enters a query, get the words that contain the entered text, and use a full-text index to search your dictionary.

I tested first only some words. And later with '%a%', '%b%' etc. When I re-query the table with the used term (e.g. 1. '%a%' - slow, 2. '%b%' - slow, '%a%' - fast), it is faster than the old method.

When the user enters a very short string like 'a' or 'is', I don't think it is relevant to display all entries that contain this, because that could be most of your dictionary. Instead, why not display all unique words which start with this string ? Much fewer results, faster, and probably more useful too. Then the user can select a longer word and use this. Also, pagination is overrated. If there are 50 pages of results, the user will never click on them anyway.
They are more likely to refine their query instead. So, just display the first 100 results and be done with it ;)
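The "unique words, prefix search" idea above can be sketched against the dict(s TEXT) table from the example; the helper table name and word-splitting rule are my assumptions:

```sql
-- Build a table of distinct words from the dictionary entries:
CREATE TABLE dict_words AS
SELECT DISTINCT lower(w) AS word
  FROM dict, regexp_split_to_table(s, E'\\s+') AS w;

-- text_pattern_ops lets the btree serve LIKE 'prefix%' searches:
CREATE INDEX dict_words_prefix_idx
    ON dict_words (word text_pattern_ops);

-- For a short input like 'is', show matching words, not entries:
SELECT word FROM dict_words
 WHERE word LIKE 'is%'
 ORDER BY word
 LIMIT 100;
```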
Re: [PERFORM] Performance Question
I've been searching for performance metrics and tweaks for a few weeks now. I'm trying to determine if the length of time to process my queries is accurate or not and I'm having a difficult time determining that. I know postgres performance is very dependent on hardware and settings and I understand how difficult it is to tackle. However, I was wondering if I could get some feedback based on my results please. Well, the simplest thing is to measure the time it takes to process a query, but : - EXPLAIN ANALYZE will always report a longer time than the reality, because instrumenting the query takes time. For instance, EXPLAIN ANALYZE on a count(*) on a query could take more time to count how many times the "count" aggregate is called and how much time is spent in it, than to actually compute the aggregate... This is because it takes much longer to measure the time it takes to call "count" on a row (syscalls...) than it takes to increment the count. This is not a problem as long as you are aware of it, and the information provided by EXPLAIN ANALYZE is very valuable. - Using \timing in psql is also a good way to examine queries, but if your query returns lots of results, the time it takes for the client to process those results will mess with your measurements. In this case a simple : SELECT sum(1) FROM (your query) can provide less polluted timings. Remember you are not that interested in client load : you can always add more webservers, but adding more database servers is a lot more difficult. - You can add some query logging in your application (always a good idea IMHO). For instance, the administrator (you) could see a list of queries at the bottom of the page with the time it takes to run them. In that case, keep in mind that any load will add randomness to this measurements. 
For instance, when you hit F5 in your browser, if the webserver and database run on the same machine as the browser, the browser's CPU usage can make one of your queries appear to take up to half a second... even if it takes, in reality, half a millisecond... So, average.

You could push the idea further. Sometimes I log the parameterized query (without args), the args separately, and the query time, so I can get average timings for things like "SELECT stuff FROM table WHERE column=$1", not a zillion separate queries depending on the parameters. Such logging can destroy your performance, though, use with care. OF COURSE YOU SHOULD MEASURE WHAT IS RELEVANT, that is, queries that your application uses.

The database is running on a dual-core 2GHz Opteron processor with 8GB of RAM.

8GB. 64 bits I presume ?

The drives are 10K RPM 146GB drives in RAID 5 (I've read RAID 5 is bad for Postgres, but moving the database to another server didn't change performance at all).

RAID5 = good for reads, and large writes. RAID5 = hell for small random writes. Depends on your load...

shared_buffers = 16MB

That's a bit small IMHO. (try 2 GB).

work_mem = 64MB
everything else is set to the default

One of my tables has 660,000 records and doing a SELECT * from that table (without any joins or sorts) takes 72 seconds.

Well, sure, but why would you do such a thing ? I mean, I don't know your row size, but say it is 2 KB, you just used 1.5 GB of RAM on the client and on the server. Plus of course transferring all this data over your network connection. If client and server are on the same machine, you just zapped 3 GB of RAM. I hope you don't do too many of those concurrently... This is never going to be fast and it is never going to be a good performance metric. If you need to pull 600.000 rows from a table, use a CURSOR, and pull them in batches of say, 1000. Then you will use 600 times less RAM. I hope you have gigabit ethernet though. Network and disk IO will be your main bottleneck.
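The CURSOR approach above, as a sketch (the table name bigtab and ordering column sort_col are placeholders):

```sql
-- Cursors must live inside a transaction:
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM bigtab ORDER BY sort_col;
FETCH 1000 FROM c;   -- process this batch client-side
FETCH 1000 FROM c;   -- ...repeat until FETCH returns no rows
CLOSE c;
COMMIT;
```

Memory use stays bounded on both client and server, instead of materializing all 600.000 rows at once.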
If you don't need to pull 600.000 rows from a table, well then, don't do it. If you're using a client app to display the results, well, how long does it take to display 600.000 rows in a GUI box ?... Ordering the table based on 3 columns almost doubles that time to an average of 123 seconds. Same as above, if your rows are small, say 100 bytes, you're sorting 66 megabytes, which would easily be done in RAM, but you specified work_mem too small, so it is done on disk, with several passes. If your rows are large, well you're facing a multi gigabyte disksort with only 64 MB of working memory, so it's really going to take lots of passes. If you often need to pull 600.000 rows from a table in a specific order, create an index on the column, use a CURSOR, and pull them in batches of say, 1000. If you seldom need to, don't create an index but do use a CURSOR, and pull them in batches of say, 1000. If you don't need to pull 600.000 rows from a table in a specific order, well
Re: [PERFORM] slow full table update
update songs set views = 0;
UPDATE 54909
Time: 101907.837 ms

time is actually less than 10 minutes, but it is still very long :(

Wow.

test=> CREATE TABLE test (id SERIAL PRIMARY KEY, value INTEGER);
test=> INSERT INTO test (value) SELECT n FROM generate_series( 1,10 ) AS n;
Temps : 1706,495 ms
test=> UPDATE test SET value=0;
Temps : 1972,420 ms

Note this is 8.3.3 on a desktop PC with the database and xlog on a Linux Software RAID1 of rather slow drives (about 50 MB/s). Anyway your 10 minutes are really wrong.

First thing to check is if there is a problem with your IO subsystem; try the example queries above, you should get timings in the same ballpark. If you get 10x slower than that, you have a problem.

Are the rows large ? I would believe so, because a "songs" table will probably contain things like artist, title, comments, and lots of other information in strings that are too small to be TOAST'ed. Perhaps your problem is in index updates, too.

So, make a copy of the songs table, without any indices, and no foreign keys :

CREATE TABLE songs2 AS SELECT * FROM songs;

Then try your UPDATE on this. How slow is it ?
Now drop this table, and recreate it with the foreign keys. Test the update again.
Now drop this table, and recreate it with the foreign keys and indexes. Test the update again.

This will give you some meaningful information. You will probably update the 'views' column quite often; it will probably even be the most often updated column in your application. In this case, you could try moving it to a separate table with just (song_id, views), that way you will update a very small table.
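The "separate counter table" suggestion above might look like this; the songs primary key being named id is an assumption:

```sql
-- Narrow table holding only the hot counter:
CREATE TABLE song_views (
    song_id INTEGER PRIMARY KEY REFERENCES songs(id),
    views   INTEGER NOT NULL DEFAULT 0
);

-- One-time migration of the existing counts:
INSERT INTO song_views (song_id, views)
SELECT id, views FROM songs;

-- The frequent bulk reset now rewrites tiny rows instead of the
-- full wide songs rows (and touches far fewer indexes):
UPDATE song_views SET views = 0;
```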
Re: [PERFORM] Improve Seq scan performance
Dear List,

I would like to improve seq scan performance. :-)

I have many cols in a table. I use only 1 col for search on it. It is indexed with btree with text_pattern_ops. The search method is: r like '%aaa%'. When I make another table with only this col's values, the search time is better when the data is cached, but worse when the data isn't in cache.

I think the following:
- When there is a big table with many cols, the seq scan reads all cols, not only the searched one.
- If we use an index with the full values of a col, a seq scan over the index might perform better (lower IO with smaller data).

Is it possible to make an index on the table, and make a seq index scan on these values?

You can fake this (as a test) by creating a separate table with just your column of interest and the row id (primary key), updated via triggers, and seq scan this table. Seq scanning this small table should be fast. Of course if you have several column conditions it will get more complex.

Note that btrees cannot optimize '%aaa%'. You could use trigrams.
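The trigger-maintained side table described above could be sketched like this; all names (bigtable, its pk column, the searched column r) are hypothetical placeholders:

```sql
-- Narrow table holding just the searched column and the row id:
CREATE TABLE search_r (pk INTEGER PRIMARY KEY, r TEXT);

CREATE OR REPLACE FUNCTION sync_search_r() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO search_r VALUES (NEW.pk, NEW.r);
    ELSIF TG_OP = 'UPDATE' THEN
        UPDATE search_r SET r = NEW.r WHERE pk = NEW.pk;
    ELSE
        DELETE FROM search_r WHERE pk = OLD.pk;
    END IF;
    RETURN NULL;  -- result ignored for AFTER triggers
END $$ LANGUAGE plpgsql;

CREATE TRIGGER bigtable_sync_search
AFTER INSERT OR UPDATE OR DELETE ON bigtable
FOR EACH ROW EXECUTE PROCEDURE sync_search_r();

-- Seq scans for r LIKE '%aaa%' now read a much smaller table,
-- then join back to bigtable on pk for the full rows.
```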
Re: [PERFORM] Fusion-io ioDrive
PFC, I have to say these kind of posts make me a fan of yours. I've read many of your storage-related replies and have found them all very educational. I just want to let you know I found your assessment of the impact of Flash storage perfectly-worded and unbelievably insightful. Thanks a million for sharing your knowledge with the list. -Dan

Hehe, thanks. There was a time when you had to be a big company full of cash to build a computer, and then suddenly people did it in garages, like Wozniak and Jobs, out of off-the-shelf parts. I feel the ioDrive guys are the same kind of hackers, except today's hackers have much more powerful tools. Perhaps, and I hope it's true, storage is about to undergo a revolution like the personal computer had 20-30 years ago, when the IBMs of the time were eaten from the roots up. IMHO the key is that you can build an ioDrive from off-the-shelf parts, but you can't do that with a disk drive. Flash manufacturers are smelling blood; they profit from USB keys and digicams, but imagine the market for solid state drives ! And in this case the hardware is simple : flash, RAM, an FPGA, some chips, nothing out of the ordinary; it is the brain juice in the software (which includes FPGAs) which will sort out the high performance and reliability winners from the rest. Lowering the barrier of entry is good for innovation. I believe Linux will benefit, too, since the target is (for now) high-performance servers, and as shown by the ioDrive, innovating hackers prefer to write Linux drivers rather than Vista (argh) drivers.
Re: [PERFORM] Fusion-io ioDrive
*) is the flash random write problem going to be solved in hardware or specialized solid state write caching techniques. At least currently, it seems like software is filling the role. Those flash chips are page-based, not unlike a harddisk, ie. you cannot erase and write a byte, you must erase and write a full page. Size of said page depends on the chip implementation. I don't know which chips they used so cannot comment there, but you can easily imagine that smaller pages yield faster random IO write throughput. For reads, you must first select a page and then access it. Thus, it is not like RAM at all. It is much more similar to a harddisk with an almost zero seek time (on reads) and a very small, but significant seek time (on writes) because a page must be erased before being written. Big flash chips include ECC inside to improve reliability. Basically the chips include a small static RAM buffer. When you want to read a page it is first copied to SRAM and ECC checked. When you want to write a page you first write it to SRAM and then order the chip to write it to flash. Usually you can't erase a page, you must erase a block which contains many pages (this is probably why most flash SSDs suck at random writes). NAND flash will never replace SDRAM because of these restrictions (NOR flash acts like RAM but it is slow and has less capacity). However NAND flash is well suited to replace harddisks. When writing a page you write it to the small static RAM buffer on the chip (fast) and tell the chip to write it to flash (slow). When the chip is busy erasing or writing you can not do anything with it, but you can still talk to the other chips. Since the ioDrive has many chips I'd bet they use this feature. I don't know about the ioDrive implementation but you can see that the paging and erasing requirements mean some tricks have to be applied and the thing will probably need some smart buffering in RAM in order to be fast. 
Since the data in a flash doesn't need to be sequential (read seek time being close to zero) it is possible they use a system which makes all writes sequential (for instance), which would suit the block erasing requirements very well, with the information about block mapping stored in RAM, or perhaps they use some form of copy-on-write. It would be interesting to dissect this algorithm, especially the part which allows the block mappings to be stored permanently; they cannot be kept in a fixed, known sector since it would wear out pretty quickly. Ergo, in order to benchmark this thing and get relevant results, I would tend to think that you'd need to fill it to say, 80% of capacity and bombard it with small random writes, the total amount of data being written being many times more than the total capacity of the drive, in order to test the remapping algorithms which are the weak point of such a device. *) do the software solutions really work (unproven) *) when are the major hardware vendors going to get involved. they make a lot of money selling disks and supporting hardware (san, etc). Looking at the pictures of the "drive" I see a bunch of Flash chips which probably make the bulk of the cost, a switching power supply, a small BGA chip which is probably a DDR memory for buffering, and the mystery ASIC which is probably an FPGA, I would tend to think Virtex4 from the shape of the package seen from the side in one of the pictures. A team of talented engineers can design and produce such a board, and assembly would only use standard PCB processes. This is unlike harddisks, which need a huge investment and a specialized factory because of the complex mechanical parts and very tight tolerances. In the case of the ioDrive, most of the value is in the intellectual property : software on the PC CPU (driver), embedded software, and programming the FPGA. All this points to a very different economic model for storage.
I could design and build a scaled down version of the ioDrive in my "garage", for instance (well, the PCI Express licensing fees are hefty, so I'd use PCI, but you get the idea). This means I think we are about to see a flood of these devices coming from many small companies. This is very good for the end user, because there will be competition, natural selection, and fast evolution. Interesting times ahead ! I'm not particularly enamored of having a storage device be stuck directly in a pci slot -- although I understand it's probably necessary in the short term as flash changes all the rules and you can't expect it to run well using mainstream hardware raid controllers. By using their own device they have complete control of the i/o stack up to the o/s driver level. Well, SATA is great for harddisks : small cables, less clutter, less failure prone than 80 conductor cables, faster, cheaper, etc... Basically serial LVDS (low voltage differential signaling)…
Re: [PERFORM] slow delete
by the way, there is a foreign key on another table that references the primary key col0 on table test. Is there an index on the referencing field in the other table ? Postgres must find the rows referencing the deleted rows, so if you forget to index the referencing column, this can take forever.
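As a sketch (the referencing table isn't named in the thread, so "child" here is hypothetical), the fix is simply to index the referencing column:

```sql
-- Hypothetical: "child" is the table whose foreign key references test(col0).
-- Without this index, every DELETE on test triggers a seq scan of child.
CREATE INDEX child_col0_idx ON child (col0);
```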
Re: [PERFORM] [QUESTION]Concurrent Access
I want to know if PostgreSQL has limitations on concurrent access, because a lot of people will access this database at the same time. PostgreSQL has excellent concurrency provided you use it correctly. But what do you mean by concurrent access ? * Number of open Postgres connections at the same time ? => each one of those uses a little bit of RAM (see manual), but if idle they don't use CPU. * Number of open transactions at the same time ? (between BEGIN and COMMIT) If your transactions are long and you have many transactions at the same time you can get lock problems; for instance transaction A updates row X and transaction B updates the same row X, so one will have to wait for the other to commit or rollback, of course. If your transactions last 1 ms there is no problem; if they last 5 minutes you will suffer. * Number of queries executing at the same time ? This is different from above: each query will eat some CPU and IO resources, and memory too. * Number of concurrent HTTP connections to your website ? If you have a website, you will probably use some form of connection pooling, or lighttpd/fastcgi, or a proxy, whatever, so the number of open database connections at the same time won't be that high. Unless you use mod_php without connection pooling, in that case it will suck of course, but that's normal. * Number of people using your client ? See number of idle connections above. Or use a connection pool. I want to know about the limitations, like how much memory do I have to use? That depends on what you want to do ;) How big can my database be ? That depends on what you do with it ;) Working set size is more relevant than total database size.
For instance if your database contains orders from the last 10 years, but only current orders (say orders from this month) are accessed all the time, with old orders being rarely accessed, you want the last 1-2 months' worth of orders to fit in RAM for fast access (caching) but you don't need RAM to fit your entire database. So, think about working sets not total sizes. And there is no limit on the table size (well, there is, but you'll never hit it). People have terabytes in postgres and it seems to work ;)
Re: [PERFORM] Postgresql is very slow
SELECT relpages, reltuples FROM pg_class WHERE relname = 'users';

 relpages | reltuples
----------+-----------
    54063 |      2307
(1 row)

This is a horribly bloated table. The output of the same query on the old server, which is fast:

 relpages | reltuples
----------+-----------
       42 |      1637

This is a healthy table. You need to clean up the users table. For this the easiest way is either to VACUUM FULL or CLUSTER it. CLUSTER will be faster in your case. Use whatever index makes sense, or even the PK. The slow server's load increases whenever I run a simple query; is it a good idea to run VACUUM FULL on the live server's database now, or should it be run when the traffic is very low, maybe on the weekend? Use CLUSTER. It is blocking, so your traffic will suffer during the operation, which should not take very long. Since you have very few rows, most of the needed time will be reading the table from disk. I would suggest to do it right now. CLUSTER users_pk ON users; Then, configure your autovacuum so it runs often enough. On a small table like this (once cleaned up) VACUUM will be very fast; 42 pages should take just a couple tens of ms to vacuum, so you can do it often.
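A sketch of the cleanup plus a per-table autovacuum tweak (the scale-factor value is just an illustration; the ALTER TABLE ... SET syntax for autovacuum parameters assumes 8.4 or later — on 8.3 the per-table settings live in the pg_autovacuum catalog instead):

```sql
-- Rewrite the bloated table in index order; blocks readers/writers while it runs:
CLUSTER users_pk ON users;

-- Assuming 8.4+: make autovacuum revisit this small, busy table more aggressively
-- (vacuum after ~5% of rows change instead of the default 20%):
ALTER TABLE users SET (autovacuum_vacuum_scale_factor = 0.05);
```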
Re: [PERFORM] insert/update tps slow with indices on table > 1M rows
I am using the c-library interface and for these particular transactions I preload PREPARE statements. Then as I get requests, I issue a BEGIN, followed by at most 300 EXECUTES and then a COMMIT. That is the general scenario. What value beyond 300 should I try? Thanks. Do you have PREPARE statements whose performance might change as the table grows ? I mean, some selects, etc... in that case if you start with an empty table, after inserting say 100K rows you might want to just disconnect, reconnect and analyze to trigger replanning of those statements. Also, how might COPY (which involves file I/O) improve the above scenario? It won't but if you see that COPY is very much faster than your INSERT based process it will give you a useful piece of information. I understand your problem is : - Create table with indexes - Insert batches of rows - After a while it gets slow Try : - Create table with indexes - COPY huge batch of rows - Compare time with above Since COPY also updates the indexes just like your inserts do it will tell you if it's the indexes which slow you down or something else. Also for insert heavy loads it's a good idea to put the xlog on a separate disk (to double your disk bandwidth) unless you have a monster disk setup. During your INSERTs, do you also make some SELECTs ? Do you have triggers on the table ? Foreign keys ? Anything ? How much RAM you have ? And can you measure the size of the table+indexes when it gets slow ? - Original Message From: James Mansion <[EMAIL PROTECTED]> To: andrew klassen <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org Sent: Wednesday, June 4, 2008 3:20:26 PM Subject: Re: [PERFORM] insert/update tps slow with indices on table > 1M rows andrew klassen wrote: I'll try adding more threads to update the table as you suggest. You could try materially increasing the update batch size too. As an exercise you could see what the performance of COPY is by backing out the data and reloading it from a suitable file. 
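To make the suggested comparison concrete, a sketch of the two load paths ("items" and the file path are made-up names; the point is only to time one against the other on the same rows):

```sql
-- Path 1: what the application does -- batches of prepared INSERTs.
BEGIN;
-- ... up to 300 EXECUTEs of a prepared INSERT statement ...
COMMIT;

-- Path 2: the same rows bulk-loaded from a tab-separated file.
-- COPY updates the indexes just like INSERT does, so if it is also slow,
-- the indexes (not the INSERT protocol overhead) are the bottleneck.
COPY items FROM '/tmp/items.tsv';
```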
Re: [PERFORM] insert/update tps slow with indices on table > 1M rows
On Wed, 04 Jun 2008 00:36:09 +0200, andrew klassen <[EMAIL PROTECTED]> wrote: Running postgres 8.2.5. I have a table that has 5 indices, no foreign keys or any dependency on any other table. If I delete the database and start entering entries, everything works very well until I get to some point (let's say 1M rows). Basically, I have a somewhat constant rate of inserts/updates that go into a work queue and then get passed to postgres. The work queue starts filling up as the responsiveness slows down. For example at 1.5M rows it takes >2 seconds for 300 inserts issued in one transaction. Prior to this point I had added regular VACUUM ANALYZE on the table and it did help. I increased maintenance work memory to 128M. I also set the fillfactor on the table indices to 50% (not sure if that made any difference; I have to study the results more closely). In an effort to figure out the bottleneck, I DROPped 4 of the indices on the table and the tps increased to over 1000. I don't really know which index removal gave the best performance improvement. I dropped 2 32-bit indices and 2 text indices, all using btree. The cpu load is not that high, i.e. plenty of idle cpu. I am running an older version of freebsd and the iostat output is not very detailed. During this time, the number is low, < 10 MB/s. The system has an LSI Logic MegaRAID controller with 2 disks. Any ideas on how to find the bottleneck/decrease overhead of index usage. Thanks. If you are filling an empty table it is generally faster to create the indexes after the data import. Of course if this is a live table or you need the indexes during the import, this is not an option. I find it generally faster to lightly preprocess the data and generate text files that I then import using COPY, then do the rest of the processing in SQL. How much RAM in the box ? Size of the data & indexes ?
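A sketch of the "index after import" pattern for an initial load into an empty (or rebuildable) table — all names here are hypothetical:

```sql
-- Drop the secondary indexes before the bulk load (keep the PK if needed):
DROP INDEX items_a_idx;
DROP INDEX items_b_idx;

-- Bulk load from a preprocessed text file:
COPY items FROM '/tmp/items.tsv';

-- Rebuild the indexes in one pass each, then refresh planner statistics:
CREATE INDEX items_a_idx ON items (a);
CREATE INDEX items_b_idx ON items (b);
ANALYZE items;
```

Building each index once over the full table is far cheaper than maintaining five indexes incrementally for every inserted row.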
Re: [PERFORM] query performance question
Thank you for your reply. I don't really need to count rows in transactions table, I just thought this was a good example to show how slow the query was. Usually you're more interested in the performance of the queries you need to make rather than the ones you don't need to make ;) But based on what you wrote it looks like count(*) is slow in general, so this seems to be OK since the table is rather large. Well any query that needs to scan 60 million rows will be slow... Now understand that this is not a problem with count(*) which can be very fast if you "select count(*) where..." and the condition in the where produces a reasonable number of rows to count, it is just a problem of having to scan the 60 million rows. But fortunately since it is perfectly useless to know the rowcount of this 60 million table with a perfect precision you never need to make this query ;) I just ran other queries (joining transactions table) and they returned quickly, which leads me to believe that there could be a problem not with the database, but with the box the db is running on. Sometimes those same queries take forever and now they complete in no time at all, so perhaps there is a process that is running periodically which is slowing the db down. Then if you have specific queries that you need to optimize you will need to run EXPLAIN ANALYZE on them and post the results, when they are fast and when they are slow to see if there is a difference in plans. Also the output from vmstat in times of big slowness can provide useful information. Crosschecking with your cron jobs, etc is a good idea. Also the usual suspects, like are your tables VACUUM'd and ANALYZE'd etc.
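For instance, the "fast count with a WHERE" point looks like this ("transactions" is from the thread; the date column is a hypothetical example):

```sql
-- Only the matching rows are visited, so with an index on trans_date
-- this is fast even on a 60-million-row table:
SELECT count(*) FROM transactions WHERE trans_date >= current_date - 7;

-- To compare fast and slow episodes, capture the plan both times:
EXPLAIN ANALYZE SELECT count(*) FROM transactions WHERE trans_date >= current_date - 7;
```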
Re: [PERFORM] "Big O" notation for postgres?
On Wed, 21 May 2008 16:10:53 +0200, H. Hall <[EMAIL PROTECTED]> wrote: Does anyone know if there is a source that provides "Big O" notation for postgres's aggregate functions and operations? For example is count(*) = O(1) or O(n)? Do the developers for postgres use Big O when selecting algorithms? If so, is the info easily available? You can't do any better than O( n rows examined by the aggregate ), except for max() and min() on an indexed expression, which in this case aren't really aggregates anymore since they are internally rewritten as an index lookup to get the value you want... but stuff like sum() or avg() or count() will always have to see all the rows selected (and some more) unless you use clever hacks like materialized views etc, in which case the thing in the O() will change, or at least the O() constant will change...
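A small illustration of the min()/max() rewrite (table and column are hypothetical; assumes a btree index on t.x):

```sql
-- With an index on t.x, these become index endpoint lookups,
-- effectively O(log n) instead of a full scan:
SELECT min(x), max(x) FROM t;

-- sum(), avg(), count() must still visit every selected row, O(n):
SELECT sum(x) FROM t;
```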
Re: [PERFORM] Posible planner improvement?
On Wed, 21 May 2008 15:09:49 +0200, Richard Huxton <[EMAIL PROTECTED]> wrote: Luke Lonergan wrote: The problem is that the implied join predicate is not being propagated. This is definitely a planner deficiency. IIRC only equality conditions are propagated and gt, lt, between aren't. I seem to remember that the argument given was that the cost of checking for the ability to propagate was too high for the frequency when it occurred. Of course, what was true for code and machines of 5 years ago might not be so today. Suggestion : when executing a one-off SQL statement, the optimizer should try to offer "best effort while being fast" ; when making a plan that will be reused many times (ie PREPARE, functions...) planning time could be much longer...
Re: [PERFORM] improving performance for a delete
On Tue, 20 May 2008 22:03:30 +0200, kevin kempter <[EMAIL PROTECTED]> wrote: Version 8.3.1 On May 20, 2008, at 1:51 PM, kevin kempter wrote: Hi all; I have 2 tables where I basically want to delete from the first table (seg_id_tmp7) any rows where the entire row already exists in the second table (sl_cd_segment_dim). I have a query that looks like this (and it's slow): delete from seg_id_tmp7 where customer_srcid::text || Besides being slow as hell and not able to use any indexes, the string concatenation can also yield incorrect results. For instance season_name::text || episode_srcid::text will have the same contents for season_name='season 1', episode_srcid=12 and for season_name='season 11', episode_srcid=2. I suggest doing it the right way, one possibility being :

test=> EXPLAIN DELETE FROM test WHERE (id, value) IN (SELECT id, value FROM test2);
                              QUERY PLAN
------------------------------------------------------------------------
 Hash IN Join  (cost=2943.00..6385.99 rows=2 width=6)
   Hash Cond: ((test.id = test2.id) AND (test.value = test2.value))
   ->  Seq Scan on test  (cost=0.00..1442.99 rows=9 width=14)
   ->  Hash  (cost=1443.00..1443.00 rows=10 width=8)
         ->  Seq Scan on test2  (cost=0.00..1443.00 rows=10 width=8)

Thanks to the hash it is very fast: one seq scan of each table, instead of one seq scan PER ROW in your query. Another solution would be :

test=> EXPLAIN DELETE FROM test USING test2 WHERE test.id = test2.id AND test.value = test2.value;
                              QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=2943.00..6385.99 rows=2 width=6)
   Hash Cond: ((test.id = test2.id) AND (test.value = test2.value))
   ->  Seq Scan on test  (cost=0.00..1442.99 rows=9 width=14)
   ->  Hash  (cost=1443.00..1443.00 rows=10 width=8)
         ->  Seq Scan on test2  (cost=0.00..1443.00 rows=10 width=8)

Which chooses the same plan here, quite logically, as it is the best one in this particular case.
Re: [PERFORM] I/O on select count(*)
The real question in my mind is why this turns into a bottleneck before the similar task of cleaning the 16MB XLOG segment does. I expected that one would need to be cracked before the CLOG switch time could possibly be an issue, but reports from the field seem to suggest otherwise. Hm, on current CPUs zeroing 8kB of RAM should take less than 2 us... now if it has to be written to disk, that's another story !
Re: [PERFORM] postgres overall performance seems to degrade when large SELECT are requested
The problem seems to arise when a SELECT that returns a lot of rows is Does the SELECT return a lot of rows, or does it scan a lot of rows ? (for instance, if you use aggregates, it might scan lots of data but only return few rows). The problem is that when the SELECTs are run the main application starts running out of available connections, which means that postgres is not returning the query results fast enough. What I find a little bit strange is that the report engine's SELECTs operate on a different set of tables than the ones the main application is using. Also the db box is hardly breaking a sweat: CPU and memory utilization are ridiculously low and IO waits are typically less than 10%. Is it swapping ? (vmstat -> si/so) Is it locking ? (probably not from what you say) Is the network connection between the client and DB server saturated ? (easy with 100 Mbps connections, a SELECT with a large result set will happily blast your LAN) Is the reporting tool running on the same machine as the DB client and killing it ? (swapping, etc) If it's a saturated network, solutions are : - install Gb ethernet - run the report on the database server (no bandwidth problems...) - rewrite the reporting tool to use SQL aggregates to transfer less data over the network - or use a cursor to fetch your results in chunks, and wait a little between chunks Has anyone experienced this? Yeah, on benchmarks sometimes the LAN gave up before Postgres broke a sweat... Gb ethernet solved that... Are there any settings I can change to improve throughput? Any help will be greatly appreciated. iptraf will tell you all about your network traffic; vmstat will tell you if your server or client is io-cpu-swap bound. You'd need to post output from those... Thanks, val
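The cursor option sketched above looks like this (query and cursor name are hypothetical; a cursor must live inside a transaction):

```sql
-- Fetch a large report result in chunks instead of one LAN-saturating burst:
BEGIN;
DECLARE report_cur CURSOR FOR SELECT * FROM big_report_table;
FETCH 1000 FROM report_cur;   -- repeat, pausing briefly between chunks
CLOSE report_cur;
COMMIT;
```

The client sleeps a little between FETCHes, so the report trickles over the network without starving the main application's connections.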
Re: [PERFORM] can I move sort to first outer join ?
On Wed, 14 May 2008 06:40:40 +0200, fernando castano <[EMAIL PROTECTED]> wrote: Hi all, this SQL is taking too long for the size of my tiny db. Any tips from this alias? I tried moving the sort to the first left outer join (between projects and features tables) using a nested subquery, but postgres tells me only one column can be returned from a subquery. Instead of : SELECT * FROM a LEFT JOIN b LEFT JOIN c WHERE c.column=... ORDER BY c.x LIMIT N you could write : SELECT * FROM a LEFT JOIN b LEFT JOIN (SELECT * FROM c WHERE c.column=... ORDER BY c.x LIMIT N) AS cc ORDER BY cc.x LIMIT N. This is only interesting if you use a LIMIT, since it allows you to reduce the number of rows sorted/joined. However in your case this is not the right thing to do, since you do not use LIMIT, and sorting your 846 rows will only take a very small time. Your problem is those seq scans; you need to optimize the query so it can use indexes.

 ->  Seq Scan on projects  (cost=0.00..10.90 rows=4 width=1884) (actual time=0.039..0.109 rows=10 loops=1)
       Filter: (visible AND (id = ANY ('{3,4,5,6,10,7,8,9,13,11}'::integer[])))
 ->  Hash  (cost=10.40..10.40 rows=40 width=1870) (actual time=1.048..1.048 rows=101 loops=1)
       ->  Seq Scan on features  (cost=0.00..10.40 rows=40 width=1870) (actual time=0.026..0.464 rows=101 loops=1)
 ->  Hash  (cost=10.70..10.70 rows=70 width=1065) (actual time=0.098..0.098 rows=29 loops=1)
       ->  Seq Scan on person_roles  (cost=0.00..10.70 rows=70 width=1065) (actual time=0.014..0.037 rows=29 loops=1)
 ->  Hash  (cost=15.80..15.80 rows=580 width=106) (actual time=0.105..0.105 rows=32 loops=1)
       ->  Seq Scan on project_tags  (cost=0.00..15.80 rows=580 width=106) (actual time=0.013..0.036 rows=32 loops=1)
 Total runtime: 149.622 ms

All those seq scans !!!
Please post, for each of those tables : - The total number of rows (SELECT count(*) is fine) - The table definitions with indexes (\d table) EXPLAIN ANALYZE tells you the number of rows it picked out of a seq scan (that's the "rows=") but not the number of rows scanned... this is important, because a seq scan on a small table isn't a problem, but on a big one, it is.
Re: [PERFORM] RAID controllers for Postgresql on large setups
You say that like you don't mind having PCI in a server whose job is to perform massive query over large data sets. I am in my 4th week at a new job. Trying to figure out what I am working with. LOOL, ok, hehe, not exactly the time to have a "let's change everything" fit ;) From what I see I will likely get as much improvement from new hardware as from re-doing some of the database design. Can't get everything done at once, not to mention I have to redo one machine sooner rather than later so I need to prioritize. In fact for bulk IO a box with 2 SATA drives would be just as fast as your monster RAID, lol. I am working on setting up a standard test based on the type of operations that the company does. This will give me a better idea. Especially I will work with the developers to make sure the queries I create for the benchmark are representative of the workload. Watching vmstat (or iostat) while running a very big seq scan query will give you information about the reading speed of your drives. Same for writes: during one of your big updates, watch vmstat, you'll know if you are CPU bound or IO bound...
- one core at 100% -> CPU bound
- lots of free CPU but lots of iowait -> disk bound
- disk throughput decent (in your setup, 100 MB/s) -> PCI bus saturation
- disk throughput miserable (< 10 MB/s) -> random IO bound (either random reads or fsync() or random writes depending on the case)
In your opinion, if we get a new machine with PCI-e, at how many spindles will the SCSI random access superiority start to be less notable? Especially given the low number of connections we usually have running against these machines. Sorting of random reads depends on multiple concurrent requests (which you don't have). Sorting of random writes does not depend on concurrent requests, so you'll benefit on your updates. About SCSI vs SATA vs number of spindles : can't answer this one. We are using one single SQL statement.
OK, so forget about fsync penalty, but do tune your checkpoints so they are not happening all the time... and bgwriter etc.
Re: [PERFORM] RAID controllers for Postgresql on large setups
Will it pay to go to a controller with higher memory for existing machines? The one machine I am about to redo has PCI which seems to somewhat limit our options. Urgh. You say that like you don't mind having PCI in a server whose job is to perform massive query over large data sets. Your 12 high-end expensive SCSI drives will have a bandwidth of ... say 800 MB/s total (on reads), perhaps more. PCI limits you to 133 MB/s (theoretical), actual speed being around 100-110 MB/s. Conclusion : 85% of the power of your expensive drives is wasted by hooking them up to the slow PCI bus ! (and hence your money is wasted too) For instance here I have a box with PCI, Giga Ethernet and a software RAID5 ; reading from the RAID5 goes to about 110 MB/s (actual disk bandwidth is closer to 250 but it's wasted) ; however when using the giga ethernet to copy a large file over a LAN, disk and ethernet have to share the PCI bus, so throughput falls to 50 MB/s. Crummy, eh ? => If you do big data imports over the network, you lose 50% speed again due to the bus sharing between ethernet nic and disk controller. In fact for bulk IO a box with 2 SATA drives would be just as fast as your monster RAID, lol. And for bulk imports from network a $500 box with a few SATA drives and a giga-ethernet, all via PCIexpress (any recent Core2 chipset) will be faster than your megabuck servers. Let me repeat this : at the current state of SATA drives, just TWO of them is enough to saturate a PCI bus. I'm speaking desktop SATA drives, not high-end SCSI ! (which is not necessarily faster for pure throughput anyway). Adding more drives will help random reads/writes but do nothing for throughput since the tiny PCI pipe is choking. So, use PCIe, PCIx, whatever, but get rid of the bottleneck. Your money is invested in disk drives... keep those, change your RAID controller which sucks anyway, and change your motherboard ... 
If you're limited by disk throughput (or disk <-> giga ethernet PCI bus contention), you'll get a huge boost by going PCIe or PCIx. You might even need fewer servers. For future machines I plan to look into controllers with at least 512MB, which likely will be PCI-X/PCI-e.. not seen anything with large caches for PCI. That's because high performance != PCI. Whether a SAN or just an external enclosure, are 12 disks enough to sustain 5K inserts/updates per second on rows in the 30 to 90 byte territory? At 5K/second, inserting/updating 100 million records would take 5.5 hours. That is fairly reasonable if we can achieve it. Faster would be better, but it depends on what it would cost to achieve. If you mean 5K transactions with begin / insert or update 1 row / commit, that's a lot, and you are going to need cache, BBU, and 8.3 so fsync isn't a problem anymore. On your current setup with 15K drives, if you need 1 fsync per INSERT you won't do more than 250 per second, which is very limiting... PG 8.3's "one fsync per second instead of one at each commit" feature is a really cheap alternative to a BBU (not as good as a real BBU, but much better than nothing !) If you mean doing large COPY or inserting/updating lots of rows using one SQL statement, you are going to need disk bandwidth. For instance if you have your 100M x 90 byte rows + overhead, that's about 11 GB. The amount of data to write is twice that because of the xlog, so 22 GB to write, and 11 GB to read, total 33 GB. With your setup's rather low 110 MB/s throughput it would take about 5 minutes. With 800 MB/s bandwidth it would take 45 seconds (but I don't know if Postgres can process data this fast, although I'd say probably). Of course if you have many indexes which need to be updated this will add random IO and more WAL traffic to the mix. Checkpoints and bgwriter also need to be tuned so they don't kill your performance when writing lots of data.
For your next servers as the other on the list will tell you, a good RAID card, and lots of SATA drives is a good choice. SATA is cheap, so you can get more drives for the same price, which means more bandwidth : http://tweakers.net/reviews/557/17/comparison-of-nine-serial-ata-raid-5-adapters-pagina-17.html Of course none of those uses PCI. RAID5 is good for read speed, and big sequential writes. So if the only thing that you do is load up a multi-gigabyte dump and process it, it's good. Now if you do bulk UPDATEs (like updating all the rows in one of the partitions of your huge table) RAID5 is good too. However RAID5 will choke and burn on small random writes, which will come from UPDATing random rows in a large table, updating indexes, etc. Since you are doing this apparently, RAID5 is therefore NOT advised ! Also consider the usual advice, like CLUSTER, or when you load a large amount of data in the database, COPY it to a temp t
Re: [PERFORM] Creating indexes
Hi, all. I want to ask what type of index is better to create for bigint types. I have a table with a bigint (bigserial) primary key. What type is better to use for it? I tried btree and hash, but didn't notice any differences in execution time. For GiST and GIN there is the trouble that I must create an operator class, so I limited myself to btree or hash. But if it's better to use gist or gin, comments are welcome. If you use BIGINT, I presume you will have lots of different values; in that case the best one is the btree. It is the most common and most optimized index type. GiST's strength is in using indexes for stuff that can't be done with a simple btree : geometry, full text, ltree, etc, but gist is slower in the case of indexing a simple value. GIN indexes are more compact and very fast for reads but updating is very slow (they are meant for mostly read-only tables). Hash is a bit of a fossil. Also it does not support range queries, so if you need that, btree is definitely better.
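A small sketch of why btree is the natural fit here ("events" is a hypothetical table; the PRIMARY KEY already creates a unique btree index, so no extra index is needed):

```sql
-- The PK constraint creates a unique btree index on id automatically:
CREATE TABLE events (id bigserial PRIMARY KEY, payload text);

-- btree serves both equality and range predicates; hash only equality:
SELECT * FROM events WHERE id = 42;
SELECT * FROM events WHERE id BETWEEN 1000 AND 2000;
```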
Re: [PERFORM] Creating a foreign key
When you create a foreign key to a table, is there an index created on the foreign key automatically? No, Postgres doesn't do it for you, because if you create (ref_id) references table.id, you will perhaps create an index on (ref_id, date) which would then fill that purpose (and other purposes); or perhaps your table will have 10 rows (but Postgres doesn't know that when you create it) and having an index would be useless; or your table could have many rows but only a few distinct referenced values, in which case again the index would only slow things down. PG does not presume to know better than yourself what you're gonna do with your data ;) UNIQUE and PRIMARY KEY do create UNIQUE INDEXes, of course. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
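To make the point above concrete (table and column names are hypothetical):

```sql
CREATE TABLE parent (id bigint PRIMARY KEY);   -- PRIMARY KEY creates a unique index
CREATE TABLE child (
    ref_id  bigint REFERENCES parent(id),      -- REFERENCES creates NO index here
    created date
);
-- You decide whether (and in what shape) to index the referencing column:
CREATE INDEX child_ref_created ON child (ref_id, created);
```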
Re: [PERFORM] need to speed up query
What is a "period" ? Is it a month, or something more "custom" ? Can periods overlap ? No, periods can never overlap. If they did you would be in violation of many tax laws around the world. Plus you would not know how much money you are making or losing. I was wondering if you'd be using the same query to compute how much was gained every month and every week, which would have complicated things. But now it's clear. To make this really funky you can have a Fiscal Calendar year start June 15 2008 and end on June 14 2009 Don't you just love those guys ? Always trying new tricks to make your life more interesting. Note that here you are scanning the entire table multiple times; the complexity of this is basically (rows in gltrans)^2, which is something you'd like to avoid. For accounting purposes you need to know the Beginning Balance, Debits, Credits, the Difference between Debits and Credits, and the Ending Balance for each account. We have 133 accounts with presently 12 periods defined, so we end up with 1596 rows returned for this query. Alright, I propose a solution which only works when periods don't overlap. It will scan the entire table, but only once, not many times as your current query does. So period 1 should for the most part have Zero for Beginning Balances for most types of Accounts. Period 2's Beginning Balance is Period 1's Ending Balance, Period 3's is Period 2's Ending Balance, and so on forever. Precisely. So it is not necessary to recompute everything for each period. Use the previous period's ending balance as the current period's starting balance... There are several ways to do this. First, you could use your current query, but only compute the sum of what happened during each period, and store that in a temporary table. Then, with a plpgsql function, or in your client, you take the rows in chronological order, you sum them as they come, and you get your balances.
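A sketch of the first step, assuming a hypothetical period_id column (you could also derive the period from gltrans_date with a function):

```sql
-- One scan of gltrans, collapsed to one row per (account, period):
CREATE TEMP TABLE period_sums AS
SELECT gltrans_accnt_id, period_id, sum(gltrans_amount) AS delta
FROM gltrans
WHERE gltrans_posted
GROUP BY gltrans_accnt_id, period_id;
-- Second step: walk period_sums in period order (plpgsql or client code),
-- carrying each period's ending balance into the next period's beginning.
```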
Use a NUMERIC type, not a FLOAT, to avoid rounding errors. The other solution does the same thing but optimizes the first step like this : INSERT INTO temp_table SELECT period, sum(...) GROUP BY period To do this you must be able to compute the period from the date and not the other way around. You could store a period_id in your table, or use a function. Another, much more efficient solution would be to have a summary table which keeps the summary data for each period, with beginning balance and ending balance. This table would only need to be updated when someone finds an old receipt in their pocket or something. This falls in the stupid-question category, but I'm just curious: what do other people think makes a query complex? I have some rather complex queries which postgres burns in a few milliseconds. You could define complexity as the amount of brain sweat that went into writing the query. You could also define complexity as O(n) or O(n^2) etc.; for instance your query (as written) is O(n^2), which is something you don't want. I've seen stuff that was O(2^n), or worse, O(n!), in software written by drunk students; in that case getting rid of it is an emergency... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] plan difference between set-returning function with ROWS within IN() and a plain join
On Tue, 06 May 2008 10:21:43 +0200, Frank van Vugt <[EMAIL PROTECTED]> wrote: L.S. I'm noticing a difference in planning between a join and an in() clause, before trying to create an independent test-case, I'd like to know if there's an obvious reason why this would be happening: Is the function STABLE ? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
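For context, volatility (and, on 8.3, an estimated row count) is declared on the function itself; without STABLE the planner must assume the function is VOLATILE, which can change the plan when it appears inside IN(). A sketch with hypothetical names:

```sql
-- A set-returning SQL function marked STABLE, with a ROWS estimate
-- for the planner (ROWS is available from 8.3 on):
CREATE FUNCTION lookup_ids(prefix text) RETURNS SETOF bigint AS $$
    SELECT id FROM items WHERE name LIKE prefix || '%'
$$ LANGUAGE sql STABLE ROWS 100;
```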
Re: [PERFORM] need to speed up query
I've had to write queries to get trial balance values out of the GL transaction table and I'm not happy with their performance. The table has 76K rows, growing about 1000 rows per working day, so the performance is not that great: it takes about 20 to 30 seconds to get all the records for the table, and when we limit it to a single accounting period it drops down to 2 seconds. What is a "period" ? Is it a month, or something more "custom" ? Can periods overlap ?

COALESCE(( SELECT sum(gltrans.gltrans_amount) AS sum
           FROM gltrans
           WHERE gltrans.gltrans_date < period.period_start
             AND gltrans.gltrans_accnt_id = accnt.accnt_id
             AND gltrans.gltrans_posted = true), 0.00)::text::money AS beginbalance,

Note that here you are scanning the entire table multiple times; the complexity of this is basically (rows in gltrans)^2, which is something you'd like to avoid. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] two memory-consuming postgres processes
I created several indices for the primary table, yes. Sure, I can do a table for a volatile column, but then I'll have to create a new such table for each derived column -- that's why I tried to add a column to the existing table. Yet seeing this is really slow, and I need to do many derived analyses like this -- which are later scanned in other computations, so they should persist -- I indeed see no other way but to create derived tables with the same key, one column per each... OK, so in that case, if you could do all of your derived column calculations in one query, like this :

CREATE TABLE derived AS SELECT ... FROM ... (perform all your derived calculations here)

or :

BEGIN;  <-- this is important to avoid writing xlog
CREATE TABLE derived AS ...
INSERT INTO derived SELECT ... FROM ... (perform all your derived calculations here)
COMMIT;

Basically, updating the entire table several times to add a few simple columns is a bad idea. If you can compute all the data you need in one query, like above, it will be much faster. Especially if you join one large table to several smaller ones; and as long as the huge data set doesn't need to be sorted (check the query plan using EXPLAIN). Try to do as much as possible in one query, to scan the large dataset only once. Note that the above will be faster than updating the entire table since it writes much less data : it doesn't need to delete the old rows, and it doesn't need to write the transaction log, since if the transaction rolls back, the table never existed anyway. Also, since your newly created table doesn't have any indexes, they won't need to be updated. If you really need to update an entire table multiple times, you will need to : - Use hardware that can handle disk writes at a decent speed (that isn't a characteristic of a laptop drive) - use MyIsam, yes (but if you need to make complex queries on the data afterwards, it could suck).
-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Postgres Benchmark looking for maintainer
Hello, Executive summary : Last year I wrote a database benchmark which simulates a forum. It works on Postgres and MySQL. It could be useful. I have no time to touch this, so it is rotting on my harddrive. Who wants to adopt it ? I will put it on pgfoundry. I can spend a few hours documenting the source and writing some documentation, and pass the package to someone who might be interested and more available. Details : The benchmark is a forum type load (actually it came from me arguing with the phpBB team, lol) but, unlike all forums I know, "correctly" optimized. A bunch of forums are created, and there is a website (in PHP), very basic, which allows you to browse the forums, view topics, and insert posts. It displays the usual forum info like last post, number of topics or posts in a forum, number of posts in a topic, etc. Then there is a benchmarking client, written in Python. It spawns a number of "users" who perform real-life actions, like viewing pages and adding posts, and there are a few simulated moderators who will, once in a while, destroy topics and even forums. This client can hit the PHP website via HTTP. However postgres is so fast that you would need several PHP servers to kill it. So, I added a multi-backend capability to the client : it can hit the database directly, performing the queries the PHP script would have performed. However, postgres is still so fast that you won't be able to benchmark anything more powerful than a Core 2; the client would need to be rewritten in a compiled language like Java. Also, retrieving the posts' text easily blasted the 100 Mbps connection between server and client, so you would need Gigabit ethernet. So, the load is very realistic (it would mimic a real forum pretty well) ; but in order to benchmark it you must simulate humongous traffic levels. The only difference is that my benchmark does a lot more writing (post insertions) than a normal forum ; I wanted the database to grow big in a few hours.
It also works on MySQL, so you can get a good laugh. Actually I was able to extract some good performance out of MySQL, after lots of headaches, except that I was never able to make it use more than 1 core. Contrary to the usual benchmarks, the code is optimized for MySQL and for Postgres, and the stored procedures also. Thus, what is compared is not a least-common-denominator implementation that happens to work on both databases, but two implementations specifically targeted and optimized at each database. The benchmark is also pretty simple (unlike the TPC) but it is useful: first it is CPU-bound, then IO-bound, and clustering the tables does a lot for performance (you can test auto-cluster), checkpoints are very visible, etc. So it can provide useful information that is easier to understand than a very complex benchmark. Originally the purpose of the benchmark was to test postgres' full text search ; the code is still there. Regards, Pierre -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Where do a novice do to make it run faster?
You got the order slightly wrong, I guess.

1) hardware

Would only come first if your RAM is really too small, or you use RAID5 on write-heavy tables, or what limits you is transaction fsync (hint : 8.3). Adding RAM is cheap.

2) rewriting my queries and table structures

This should really come first. Log expensive queries. Note that an expensive query can be a slow query, or a rather fast query that you execute lots of times, or a very simple and fast query that you execute really, really too often. Now ask yourself :

* What is this query supposed to do ?
* Do I need this query ?
Examples :
You put your sessions in a database ? => Perhaps put them in the good old filesystem.
Your PHP loads lots of configuration from the database for every page ? => Cache it, generate some PHP code once and include it, put it in the session if it depends on the user, but don't reload the thing on each page !
This feature is useless => Do you really need to display a birthday cake on your forum for those users who have their birthday today ?
UPDATEs... => Do you really need to update the last time a user was online every time ? What about updating it every 5 minutes instead ?
* Is this query inside a loop ? => Use JOIN.
* Do I need all the rows from this query ?
Example : You use pagination and perform the same query changing LIMIT/OFFSET ? => Perform the query once, retrieve the first N pages of results, cache it in the session or in a table.
* You have a website ? => Use lighttpd and fastcgi.
* Do I need all the columns from this query ?
* Do I suffer from locking ?
etc.

Now you should see some easy targets. For the queries that are slow, use EXPLAIN ANALYZE. Question your schema. etc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Question about disk IO an index use and seeking advice
An index scan looks through the index and pulls in each page as it sees it. A bitmap index scan looks through the index and makes a sorted list of all the pages it needs, and then the bitmap heap scan reads all the pages. If your data is scattered then you may as well do the index scan, but if your data is sequential-ish then you should do the bitmap index scan. Is that right? Where can I learn more? I've read That's about it, yes. If your bitmap has large holes, it will seek, but if it has little holes, readahead will work. Hence, fast, and good. On an index scan, readahead doesn't help since the hits are pretty random. If you have N rows in the index with the same date, in which order will they get scanned ? There is no way to know that, and no way to be sure this order corresponds to physical order on disk. About clustering: I know that CLUSTER takes an exclusive lock on the table. At present, users can query the table at any time, so I'm not allowed to take an exclusive lock for more than a few seconds. Then, CLUSTER is out. Could I achieve the same thing by creating a second copy of the table and then swapping the first copy out for the second? I think something like that would fit in my time frames. If the archive table is read-only, then yes, you can do this. About partitioning: I can definitely see how having the data in more manageable chunks would allow me to do things like clustering. It will definitely make vacuuming easier. About IO speeds: The db is always under some kind of load. I actually get scared if the load average isn't at least 2. Could I try to run something like bonnie++ to get some real load numbers? I'm sure that would cripple the system while it is running, but if it only takes a few seconds that would be ok. There were updates running while I was running the test. The WAL log is on the hardware raid 10. Moving it from the software raid 5 almost doubled our insert performance. Normal ; fsync on a RAID5-6 is bad, bad.
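The copy-and-swap alternative to CLUSTER discussed above could look like this for a read-only archive (a sketch; all names hypothetical):

```sql
BEGIN;
-- Build a physically ordered copy; readers of the old table are not blocked.
CREATE TABLE archive_new AS
    SELECT * FROM archive ORDER BY event_date;
-- (recreate the indexes on archive_new here)
-- The renames take an exclusive lock, but only for an instant:
ALTER TABLE archive     RENAME TO archive_old;
ALTER TABLE archive_new RENAME TO archive;
COMMIT;
```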
You have battery backed up cache ? Thanks again, --Nik -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance of the Materialize operator in a query plan
Do you mean that the overhead is an artefact of timing the query? In that case, the query should run faster than its evaluation with EXPLAIN ANALYZE, correct? Is there a way to test this assumption regarding the speed of gettimeofday? I'm on a Macbook and have no idea about the performance of its implementation.

Run EXPLAIN ANALYZE query
Type \timing
Run SELECT count(*) FROM (query) AS foo

\timing gives timings as seen by the client. If you're local, and the result set is one single integer, client timings are not very different from server timings. If the client must retrieve lots of rows, this will be different, hence the fake count(*) above to prevent this. You might want to EXPLAIN the count(*) also, to be sure the same plan is used... And yes, EXPLAIN ANALYZE has overhead, sometimes significant. Think Heisenberg... You will measure it easily with this dumb method ;) Here is a very dumb query :

SELECT count(*) FROM test;
 count
---
 9
(1 ligne)

Temps : 26,924 ms

test=> EXPLAIN ANALYZE SELECT count(*) FROM test;
                                   QUERY PLAN
 Aggregate  (cost=1692.99..1693.00 rows=1 width=0) (actual time=66.314..66.314 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..1442.99 rows=9 width=0) (actual time=0.013..34.888 rows=9 loops=1)
 Total runtime: 66.356 ms
(3 lignes)

Temps : 66,789 ms

Apparently measuring the time it takes to get a row from the table takes 2x as long as actually getting the row from the table. Which is reassuring, in a way, since grabbing rows out of tables isn't such an unusual operation. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Question about disk IO an index use and seeking advice
Our ~600,000,000 row table is changed very infrequently and is on a 12 disk software raid-6 for historical reasons, using an LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS. Our ~50,000,000 row staging table is on a 12 disk hardware raid-10 using a Dell PowerEdge Expandable RAID controller 5. So, my disk IO and index question. When I issue a query on the big table like this:

SELECT column, count(*)
FROM bigtable
GROUP BY column
ORDER BY count DESC

When I run dstat to see my disk IO I see the software raid-6 consistently holding over 70M/sec. This is fine with me, but I generally don't like to do queries that table scan 600,000,000 rows. So I do:

Note that RAID5 or 6 is fine when reading; it's the small random writes that kill it. Is the table being inserted to while you run this query, which would generate small random writes for the index updates ? Or is the table only inserted to during the nightly cron job ? 70 MB/s seems to me quite close to what a single SATA disk can do these days. My software RAID 5 saturates the PCI bus in the machine and pushes more than 120 MB/s. You have PCI-X and 12 disks, so you should get huge disk throughput, really mindboggling figures, not 70 MB/s. Since this seems a high-budget system, perhaps a good fast hardware RAID ? Or perhaps this test was performed under heavy load and it is actually a good result.

All of the rows in the staging table are changed at least once and then deleted and recreated in the bigger table. All of the staging table's indexes are on the raid-10. The postgres data directory itself is on the raid-6. I think all the disks are SATA 10Ks. The setup is kind of a beast.

SELECT column, count(*)
FROM bigtable
WHERE date > '4-24-08'
GROUP BY column
ORDER BY count DESC

When I run dstat I see only around 2M/sec and it is not consistent at all. So my question is, why do I see such low IO load on the index scan version?
First, it is probably choosing a bitmap index scan, which means it needs to grab lots of pages from the index. If your index is fragmented, just scanning the index can take a long time. Then, it is probably taking lots of random bites in the table data. If this is an archive table, the dates should be increasing sequentially. If this is not the case, you will get random IO, which is rather bad on huge data sets. So. If you need the rows to be grouped on-disk by date (or perhaps another field if you more frequently run other types of queries, like grouping by category, or perhaps something else, you decide) : The painful thing will be to reorder the table, either - use CLUSTER - or recreate a table and INSERT INTO it ORDER BY the field you chose. This is going to take a while; set sort_mem to a large value. Then create the indexes. Then every time you insert data in the archive, be sure to insert it in big batches, ORDER BY the field you chose. That way new inserts will still be in the order you want. While you're at it you might think about partitioning the monster on a useful criterion (this depends on your querying). If I could tweak some setting to make more aggressive use of IO, would it actually make the query faster? The field I'm scanning has a .960858 correlation, but I haven't vacuumed since importing any of the data that You have ANALYZEd, at least ? Because if you didn't, and an index scan (not bitmap) comes up on this kind of query and does a million index hits, you have a problem. I'm scanning, though the correlation should remain very high. When I do a similar set of queries on the hardware raid I see similar performance except the numbers are both more than doubled.
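The partitioning idea above, in the 8.x inheritance style (a sketch; partition name and date range hypothetical, "date" as in the queries above):

```sql
-- One child table per month, constrained so the planner can skip it:
CREATE TABLE bigtable_2008_04 (
    CHECK (date >= '2008-04-01' AND date < '2008-05-01')
) INHERITS (bigtable);
-- With constraint_exclusion = on, a query like
--   SELECT ... FROM bigtable WHERE date > '2008-04-24'
-- only touches the partitions whose CHECK constraint can match.
```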
Here is the explain output for the queries:

SELECT column, count(*)
FROM bigtable
GROUP BY column
ORDER BY count DESC

"Sort  (cost=74404440.58..7440.53 rows=1581 width=10)"
"  Sort Key: count(*)"
"  ->  HashAggregate  (cost=74404336.81..74404356.58 rows=1581 width=10)"
"        ->  Seq Scan on bigtable  (cost=0.00..71422407.21 rows=596385921 width=10)"

Plan is OK (nothing else to do, really)

---

SELECT column, count(*)
FROM bigtable
WHERE date > '4-24-08'
GROUP BY column
ORDER BY count DESC

"Sort  (cost=16948.80..16948.81 rows=1 width=10)"
"  Sort Key: count(*)"
"  ->  HashAggregate  (cost=16948.78..16948.79 rows=1 width=10)"
"        ->  Index Scan using date_idx on bigtable  (cost=0.00..16652.77 rows=59201 width=10)"
"              Index Cond: (date > '2008-04-21 00:00:00'::timestamp without time zone)"

Argh. So you got an index scan after all. Is the 59201 rows estimate right ? If it is 10 times that, you really have a problem. Is it ANALYZEd ? So now the asking-for-advice part. I have two questions: What is the fastest way to copy data from the smaller table to the larger table? INSERT INTO SELECT FR
Re: [PERFORM] Optimizer's issue
On Thu, 24 Apr 2008 03:14:54 +0200, Vlad Arkhipov <[EMAIL PROTECTED]> wrote: I found a strange issue in a very simple query. Statistics for all columns is at level 1000, but I also tried other levels.

create table g (
  id bigint primary key,
  isgroup boolean not null);

create table a (
  groupid bigint references g(id),
  id bigint,
  unique(id, groupid));

analyze g;
analyze a;

select count(*) from a  -- 294
select count(*) from g  -- 320

explain analyze select * from g join a on a.groupid = g.id where g.isgroup

Hash Join  (cost=5.35..11.50 rows=11 width=25) (actual time=0.261..1.755 rows=294 loops=1)
  Hash Cond: (a.groupid = g.id)
  ->  Seq Scan on a  (cost=0.00..4.94 rows=294 width=16) (actual time=0.047..0.482 rows=294 loops=1)
  ->  Hash  (cost=5.20..5.20 rows=12 width=9) (actual time=0.164..0.164 rows=12 loops=1)
        ->  Seq Scan on g  (cost=0.00..5.20 rows=12 width=9) (actual time=0.042..0.136 rows=12 loops=1)
              Filter: isgroup
Total runtime: 2.225 ms

You should really post an EXPLAIN ANALYZE of your big query. This little query plan seems OK to me. Two very small tables; OK, hash them, it's the best plan. Now, of course, if it is repeated for every row in your JOIN, you have a problem. The question is, why is it repeated for every row ? This cannot be answered without seeing the whole query. Another question would be, is there a way to structure the tables differently ? Again, this cannot be answered without seeing the whole query, and some explanation of what the data & fields mean. Please provide more information... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] mysterious difference in speed when combining two queries with OR
I should say that this is on postgresql 7.4.16 (debian stable). Whoa. I cannot understand why the following two queries differ so much in execution time (almost ten times) Post EXPLAIN ANALYZE for both, and also post the table definitions (with indexes); use \d table. This will allow people to help you.

$ time psql -o /dev/null -f query-a.sql fektest
real  0m2.016s
user  0m1.532s
sys   0m0.140s

You are measuring the time it takes the server to perform the query, plus this :
- time for the client (psql) to launch itself,
- to read the configuration file,
- to connect to the server, send the query,
- to transfer the results back to the client (is this on a network or local ? what is the amount of data transferred ?),
- to process the results, format them as text, display them,
- to close the connection,
- to exit cleanly.

As you can see from the above numbers, 2.016 seconds elapsed on your wall clock, of which :
- 76% was used as CPU time in the client (therefore of absolutely no relevance to postgresql server performance),
- and the rest (24%) was distributed in unknown proportion between server CPU spent to process your query, network roundtrips, data transfer, server iowait, etcetera.

In order to properly benchmark your query, you should :

1- Ensure the server is not loaded and processing any other query (unless you explicitly intend to test behaviour under load). If you don't do that, your timings will be random, depending on how much load you have, whether someone holds a lock you have to wait on, etc.

2- ssh to your server and use a psql session local to the server, to avoid network roundtrips.

3- Enable statement timing with \timing.

4- EXPLAIN your query. Check the plan. Check the time it took to EXPLAIN; this tells you how much time it takes to parse and plan your query.

5- EXPLAIN ANALYZE your query. Do it several times, note the different timings and understand the query plans. If the data was not cached, the first timing will be much longer than the subsequent ones. This gives you useful information about the behaviour of this query : if it lasts for 1 second (cached) and 5 minutes (not cached), you might not want to execute it at the same time as that huge scheduled backup job. Those timings also provide hints on whether you should CLUSTER the table, etc.

6- EXPLAIN SELECT count(*) FROM (your query) AS foo. Check that the plan is the same.

7- SELECT count(*) FROM (your query) AS foo. The count(*) means very little data is exchanged between client and server, so this doesn't mess with the timing.

Now, compare : the timings displayed by psql (\timing) include query planning, the roundtrip to the server, and result processing (hence the count(*) to reduce this overhead). The timings displayed by EXPLAIN ANALYZE include only query execution time, but EXPLAIN ANALYZE is slower than just executing the query, because it takes time to instrument the query and measure its performance. For instance, on a very simple query that computes an aggregate on lots of rows, more time will be spent measuring than actually executing the query. Hence steps 6 and 7 above. Knowing this, you can deduce the time it takes to parse & plan your query (should you then use PREPAREd statements ? up to you) and the time it takes to execute it.

8- EXPLAIN ANALYZE again, while changing the parameters (trying some very selective or less selective ones) to check for plan changes; mess with enable_ parameters to check for different plans; rewrite the query differently (DISTINCT/GROUP BY, OR/UNION, JOIN or IN(subquery), etc). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] CPU bound at 99%
about 2300 connections in idle (ps auxwww | grep postgres | grep idle) [...] The server that connects to the db is an apache server using persistent connections. MaxClients is 2048, thus the high number of connections needed. The application was written in PHP using the Pear DB class. This is pretty classical. When your number of threads gets out of control, everything gets slower, so more requests pile up, spawning more threads; this is positive feedback, and in seconds all hell breaks loose. That's why I call it imploding, as if it collapses under its own weight. There is a threshold effect, and it goes from working well to a crawl rather quickly once you pass the threshold, as you experienced. Note that the same applies to Apache and PHP as well as Postgres : there is a "sweet spot" in the number of threads, for optimum efficiency, depending on how many cores you have. Too few threads, and it will be waiting for IO or waiting for the database. Too many threads, and CPU cache utilization becomes suboptimal and context switches eat your performance. This sweet spot is certainly not at 500 connections per core, either for Postgres or for PHP. It is much lower, about 5-20 depending on your load. I will copy-paste here an email I wrote to another person with the exact same problem, and the exact same solution. Please read this carefully : Basically there are three classes of websites in my book. 1- Low traffic (i.e. a few hits/s on dynamic pages), when performance doesn't matter. 2- High traffic (i.e. 10-100 hits/s on dynamic pages), when you must read the rest of this email. 3- Monster traffic (lots more than that), when you need to give some of your cash to Akamai, get some load balancers, replicate your databases, use lots of caching, etc. This is yahoo, flickr, meetic, etc. Usually people whose web sites are imploding under load think they are in class 3, but really most of them are in class 2, using inadequate technical solutions like MySQL, etc.
I had a website with 200K members that ran on a Celeron 1200 with 512 MB RAM, perfectly fine, and lighttpd wasn't even visible in the top. Good news for you : the solution to your problem is pretty easy. You should be able to solve it in about 4 hours. Suppose you have some web servers for static content ; obviously you are using lighttpd on those, since it can service an "unlimited" (up to the OS limit, something like 64K sockets) number of concurrent connections. You could also use nginx or Zeus. I think Akamai uses Zeus. But Lighttpd is perfectly fine (and free). For your static content servers you will want to use lots of RAM for caching; if you serve images, put the small files like thumbnails, css, javascript, and html pages on a separate server so that they are all served from RAM; use a cheap CPU, since a Pentium-M with lighttpd will happily push 10K http hits/s if you don't wait for IO. Large files should be on the second static server, to avoid cache thrashing on the server which has all the frequently accessed small files. Then you have some web servers for generating your dynamic content. Let's suppose you have N CPU cores total. With your N cores, the ideal number of threads would be N. However those will also wait for IO and database operations, so you want to fill those wait times with useful work, so maybe you will use something like 2...10 threads per core. This can only be determined by experimentation; it depends on the type and length of your SQL queries, so there is no "one size fits all" answer. Example. You have pages that take 20 ms to generate, and you have 100 requests for them coming up. Let's suppose you have one CPU core. (Note : if your pages take longer than 10 ms, you have a problem. On the previously mentioned website, now running on the cheapest Core 2 we could find, since the torrent tracker eats lots of CPU, pages take about 2-5 ms to generate, even the forum pages with 30 posts on them.
We use PHP with compiled code caching and SQL is properly optimized). And, yes, it uses MySQL. Once I wrote (as an experiment) an extremely simple forum which did 1400 pages/second (which is huge) with a desktop Core2 as the Postgres 8.2 server. - You could use Apache in the old-fashioned way, with 100 threads, so all your pages will take 20 ms x 100 = 2 seconds. But the CPU cache utilisation will suck because of all those context switches, you'll have 100 processes eating your RAM (count 8MB for a PHP process), 100 database connections, 100 postgres processes; the locks will stay on longer, transactions will last longer, you'll get more dead rows to vacuum, etc. And actually, since Apache will not buffer the output of your scripts, the PHP or Perl interpreter will stay in memory (and hog a database connection) until the client at the o
Re: [PERFORM] Oddly slow queries
that's correct, there are nightly (at least at the moment) processes that insert around 2-3 million rows and delete about the same amount. I can see that those 'checkpoints are occurring too frequently' messages are only logged during that timeframe. I assume that it's normal that so many INSERTs and DELETEs cause the background writer to choke a little bit. I guess I really need to adjust the processes to INSERT and DELETE rows at a slower pace if I want to do other queries during the same time. cheers, tom

Perhaps you should increase the quantity of xlog PG is allowed to write between each checkpoint (this is checkpoint_segments). Checkpointing every 10 seconds is going to slow down your inserts also, because of the need to fsync() all those pages, not to mention nuking your IO-bound SELECTs. Increase it till it checkpoints every 5 minutes or something.

Well, also, do you use batch-processing or plpgsql, or issue a huge mass of individual INSERTs via some script? If you use a script, make sure that each INSERT doesn't have its own transaction (I think you know that, since with a few million rows it would take forever... unless you can do thousands of commits/s, in which case either you use 8.3 and have activated the "one fsync every N seconds" feature, or your battery-backed-up cache works, or your disk is lying)... If you use a script and the server is under heavy load you can:

BEGIN
Process N rows (use multi-values INSERT and DELETE WHERE .. IN (...)), or execute a prepared statement multiple times, or copy to a temp table and process with SQL (usually much faster)
COMMIT
Sleep
Wash, rinse, repeat
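The batch recipe above (one transaction per N rows, then a pause) can be sketched like this, using Python's built-in sqlite3 as a stand-in for a Postgres driver; the pattern, not the API, is the point:

```python
import sqlite3
import time

BATCH = 1000                                    # rows per transaction
rows = [(i, "payload %d" % i) for i in range(5000)]

db = sqlite3.connect(":memory:")                # stand-in for the real database
db.execute("CREATE TABLE stats (id INTEGER, data TEXT)")

for start in range(0, len(rows), BATCH):
    batch = rows[start:start + BATCH]
    with db:                                    # BEGIN ... COMMIT around the whole batch
        db.executemany("INSERT INTO stats VALUES (?, ?)", batch)
    time.sleep(0)                               # the "Sleep" step: yield between batches

count = db.execute("SELECT count(*) FROM stats").fetchone()[0]
print(count)  # 5000
```

Five commits instead of five thousand; on a real disk without write caching that is the difference between seconds and hours.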
Re: [PERFORM] Group by more efficient than distinct?
On Sun, 20 Apr 2008 17:15:36 +0200, Francisco Reyes <[EMAIL PROTECTED]> wrote: PFC writes: - If you process up to some percentage of your RAM worth of data, hashing is going to be a lot faster

Thanks for the excellent breakdown and explanation. I will try and get sizes of the tables in question and how much memory the machines have.

Actually, the memory used by the hash depends on the number of distinct values, not the number of rows which are processed... Consider:

SELECT a GROUP BY a
SELECT a,count(*) GROUP BY a

In both cases the hash only holds distinct values. So if you have 1 million rows to process but only 10 distinct values of "a", the hash will only contain those 10 values (and the counts), so it will be very small and fast; it will absorb a huge seq scan without problem. If however you have (say) 100 million distinct values for "a", using a hash would be a bad idea. As usual, divide the size of your RAM by the number of concurrent connections or something. Note that "a" could be a column, several columns, anything; the size of the hash will be proportional to the number of distinct values, ie. the number of rows returned by the query, not the number of rows processed (read) by the query. Same with hash joins etc; that's why when you join a very small table to a large one Postgres likes to use seq scan + hash join on the small table.

- If you need DISTINCT ON, well, you're stuck with the Sort - So, for the time being, you can replace DISTINCT with GROUP BY...

Have seen a few of those already on some code (new job..) so for those it is a matter of having a good disk subsystem?

Depends on your RAM; sorting in RAM is always faster than sorting on disk of course, unless you eat all the RAM and thrash the other processes. Tradeoffs...
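The "hash holds only distinct values" point is easy to see in a toy sketch (pure Python, not PG's implementation): the hash size tracks the number of distinct keys, not the number of rows scanned.

```python
import random
from collections import Counter

# 100K rows, but only 10 distinct values of "a".
rows = [random.randrange(10) for _ in range(100_000)]

# The "hash" built by a HashAggregate: one entry per distinct value.
counts = Counter(rows)

print(len(counts))  # at most 10 entries, regardless of how many rows were scanned
```

Scanning ten times more rows would leave `len(counts)` unchanged; only more distinct values grow the hash, which is why GROUP BY on a low-cardinality column absorbs a huge seq scan so cheaply.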
Re: [PERFORM] Group by more efficient than distinct?
On Fri, 18 Apr 2008 11:36:02 +0200, Gregory Stark <[EMAIL PROTECTED]> wrote: "Francisco Reyes" <[EMAIL PROTECTED]> writes: Is there any disadvantage of using "group by" to obtain a unique list? On a small dataset the difference was about 20 percent.

Group by HashAggregate (cost=369.61..381.12 rows=1151 width=8) (actual time=76.641..85.167 rows=2890 loops=1)

Basically:
- If you process up to some percentage of your RAM worth of data, hashing is going to be a lot faster
- If the size of the hash grows larger than your RAM, hashing will fail miserably and sorting will be much faster, since PG's disksort is really good
- GROUP BY knows this and acts accordingly
- DISTINCT doesn't know this, it only knows sorting, so it sorts
- If you need DISTINCT x ORDER BY x, sorting may be faster too (depending on the % of distinct rows)
- If you need DISTINCT ON, well, you're stuck with the Sort
- So, for the time being, you can replace DISTINCT with GROUP BY...
Re: [PERFORM] seq scan issue...
- why am I still getting a seq scan ?

You'll seq scan tmp1 obviously, and also the other table since you fetch a very large part of it in the process. It's the only way to do this query since there is no WHERE to restrict the number of rows and the DISTINCT applies on columns from both tables. You might want to simplify your query. For instance perhaps you can get pf.segment_id out of the DISTINCT, in which case you can put the distinct in a subquery on tmp1. Thanks in advance.

Explain Plan

explain select distinct tmp1.affiliate_id, tmp1.name, tmp1.description, tmp1.create_dt, tmp1.playback_device_id, pf.segment_id from aff_id_tmp1 tmp1, playback_fragment pf where tmp1.playback_device_id = pf.playback_device_id ;

Unique (cost=2966361.56..3194555.91 rows=10104496 width=97)
  -> Sort (cost=2966361.56..2998960.76 rows=13039677 width=97)
       Sort Key: tmp1.affiliate_id, tmp1.name, tmp1.description, tmp1.create_dt, tmp1.playback_device_id, pf.segment_id
       -> Hash Join (cost=23925.45..814071.14 rows=13039677 width=97)
            Hash Cond: (pf.playback_device_id = tmp1.playback_device_id)
            -> Seq Scan on playback_fragment pf (cost=0.00..464153.77 rows=13039677 width=16)
            -> Hash (cost=16031.31..16031.31 rows=631531 width=89)
                 -> Seq Scan on aff_id_tmp1 tmp1 (cost=0.00..16031.31 rows=631531 width=89)
Re: [PERFORM] Query running slow
On Wed, 16 Apr 2008 23:14:11 +0200, samantha mahindrakar <[EMAIL PROTECTED]> wrote: Hi, I am finding that the following query is working a bit slow:

EXECUTE '(SELECT ARRAY(SELECT DISTINCT date_part(''day'', measurement_start) FROM ' || gettablestring(dates)|| ' WHERE lane_id IN (' || lanesidarr || ')))' INTO temparr;

This function is trying to find all the days in a particular month which have data for the particular lane and put it in an array... which can be used later. gettablestring(dates) returns the partition name from which the data needs to be extracted. These partitions have an index on the measurement_start field. lanesidarr is a lane number. The partition has an index on this field too. Could anyone give me some hints?

OK, so I guess you have one partition per month, since there is no month in your WHERE. If this is a table which hasn't got much write activity (probably the case for last month's partition, for instance), CLUSTER it on something appropriate that you use often in queries, like lane_id here. And you can use SELECT foo GROUP BY foo; this will use a hash, it is faster than a sort.
Example :

CREATE TABLE blop AS SELECT '2008-01-01'::TIMESTAMP + ((n%30)*'1 DAY'::INTERVAL) AS t FROM generate_series(1,100000) AS n;
ALTER TABLE blop ADD d DATE NULL;
UPDATE blop SET d=t;
VACUUM FULL ANALYZE blop;

-- Now blop contains 100K timestamps and 100K dates from the month 2008-01

EXPLAIN ANALYZE SELECT DISTINCT EXTRACT( DAY from t ) FROM blop;
 Unique (cost=10051.82..10551.82 rows=30 width=8) (actual time=221.740..289.801 rows=30 loops=1)
   -> Sort (cost=10051.82..10301.82 rows=100000 width=8) (actual time=221.737..250.911 rows=100000 loops=1)
        Sort Key: (date_part('day'::text, t))
        Sort Method: quicksort Memory: 5955kB
        -> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8) (actual time=0.021..115.254 rows=100000 loops=1)
 Total runtime: 290.237 ms
(6 rows)
Time: 290.768 ms

EXPLAIN ANALYZE SELECT EXTRACT( DAY from t ) AS day FROM blop GROUP BY day;
 HashAggregate (cost=1997.00..1997.38 rows=30 width=8) (actual time=198.375..198.390 rows=30 loops=1)
   -> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8) (actual time=0.021..129.779 rows=100000 loops=1)
 Total runtime: 198.437 ms
(3 rows)
Time: 198.894 ms

==> Hash is faster than Sort

EXPLAIN ANALYZE SELECT d FROM blop GROUP BY d;
 HashAggregate (cost=1747.00..1747.30 rows=30 width=4) (actual time=101.829..101.842 rows=30 loops=1)
   -> Seq Scan on blop (cost=0.00..1497.00 rows=100000 width=4) (actual time=0.012..33.428 rows=100000 loops=1)
 Total runtime: 101.905 ms
(3 rows)
Time: 102.516 ms

==> Not computing the EXTRACT is faster, obviously (actually EXPLAIN ANALYZE adds some overhead; the query really takes 60 ms)

If you have an index on (lane_id, measurement_date), you can always do:

for day in 1..31: find 1 row which has this day
return the days you found
Re: [PERFORM] Oddly slow queries
On Wed, 16 Apr 2008 06:07:04 +0200, Thomas Spreng <[EMAIL PROTECTED]> wrote: On 16.04.2008, at 01:24, PFC wrote: The queries in question (SELECTs) occasionally take up to 5 mins even if they take ~2-3 sec under "normal" conditions; there are no sequential scans done in those queries. There are not many users connected (around 3, maybe) to this database usually since it's still in a testing phase. I tried to hunt down the problem by playing around with resource usage cfg options but it didn't really make a difference.

Could that be caused by a CHECKPOINT ?

actually there are a few log entries (around 12 per day) concerning checkpoints:

LOG: checkpoints are occurring too frequently (10 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".

But wouldn't that only affect write performance? The main problems I'm concerned about affect SELECT queries.

OK, so if you get 12 of those per day, this means your checkpoint interval isn't set to 10 seconds... I hope... Those probably correspond to some large update or insert query that comes from a cron or archive job?... or a developer doing tests or filling a table... So, if it is checkpointing every 10 seconds it means you have a pretty high write load at that time; and having to checkpoint and flush the dirty pages makes it worse, so it is possible that your disk(s) choke on writes, also killing the SELECTs in the process.

-> Set your checkpoint_segments to a much higher value
-> Set your checkpoint timeout to a higher value (5 minutes or something), to be tuned afterwards
-> Tune bgwriter settings to taste (this means you need a realistic load, not a test load)
-> Use separate disk(s) for the xlog
-> For the love of God, don't keep the RAID5 for production ! (RAID5 + 1 small write = N reads + N writes, N=3 in your case)

Since this is a test server I would suggest RAID1 for the OS and database files and the third disk for the xlog; if it dies you just recreate the DB...
Re: [PERFORM] db size
Will this help with performance ? Depends if the bloat is in part of your working set. If debloating can make the working set fit in RAM, or lower your IOs, you'll get a boost. Now, is the bloat in the tables (which tables ?) or in the indexes (which indexes ?), or in the toast tables perhaps, or in the system catalogs or all of the above ? Or perhaps there is a long-forgotten process that got zombified while holding a huge temp table ? (not very likely, but who knows). Use pg_relation_size() and its friends to get an idea of the size of stuff. I'll look into that, thanks -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] db size
If you suspect your tables or indexes are bloated, restore your dump to a test box. Use fsync=off during restore, you don't care about integrity on the test box. This will avoid slowing down your production database. Then look at the size of the restored database. If it is much smaller than your production database, then you have bloat. I have done that, and I get the following: the live one is 113G the restored one is 78G Ah. Good news for you is that you know that you can do something ;) Now, is the bloat in the tables (which tables ?) or in the indexes (which indexes ?), or in the toast tables perhaps, or in the system catalogs or all of the above ? Or perhaps there is a long-forgotten process that got zombified while holding a huge temp table ? (not very likely, but who knows). Use pg_relation_size() and its friends to get an idea of the size of stuff. Perhaps you have 1 extremely bloated table or index, or perhaps everything is bloated. The solution to your problem depends on which case you have. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] db size
Hi We currently have a 16CPU 32GB box running postgres 8.2. When I do a pg_dump with the following parameters "/usr/bin/pg_dump -E UTF8 -F c -b" I get a file of 14GB in size. But the database is 110GB in size on the disk. Why the big difference in size? Does this have anything to do with performance? I have a 2GB database, which dumps to a 340 MB file... Two reasons : - I have lots of big fat but very necessary indexes (not included in dump) - Dump is compressed with gzip which really works well on database data. If you suspect your tables or indexes are bloated, restore your dump to a test box. Use fsync=off during restore, you don't care about integrity on the test box. This will avoid slowing down your production database. Then look at the size of the restored database. If it is much smaller than your production database, then you have bloat. Time to CLUSTER, or REINDEX, or VACUUM FULL (your choice), on the tables that are bloated, and take note to vacuum those more often (and perhaps tune the autovacuum). Judicious use of CLUSTER on that small, but extremely often updated table can also be a very good option. 8.3 and its new HOT feature are also a good idea. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
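The "compression really works well on database data" claim is easy to check with a toy sketch (illustrative data, nothing from the thread): row-structured text is highly repetitive, so gzip routinely shrinks it several-fold, which is a large part of why a dump is so much smaller than the on-disk database.

```python
import gzip

# Fake "table rows": repetitive, structured text, like a COPY-format dump.
data = b"".join(b"row-%06d|status=ok|region=eu\n" % i for i in range(10_000))

packed = gzip.compress(data)
print(len(data), len(packed))  # compressed size is a small fraction of the original
```

The rest of the size gap comes from indexes (not included in the dump at all) and from any bloat in the tables themselves.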
Re: [PERFORM] Cached Query Plans
Well if you're caching per-connection then it doesn't really matter whether you do it on the client side or the server side, it's pretty much exactly the same problem. Actually I thought about doing it on the server since it would then also work with connection pooling. Doing it on the client means the client has to maintain state, which is not possible in a pool... Unsurprisingly most drivers do precisely what you're describing. In Perl DBI for example you just change $dbh->prepare("") into $dbh->prepare_cached("") and it does exactly what you want. I would expect the PHP drivers to have something equivalent. Well, PHP doesn't. Perhaps I should patch PHP instead... Or perhaps this feature should be implemented in pgpool or pgbouncer. But, using prepared statements with persistent connections is messy, because you never know if the connection is new or not, If you were to fix *that* then both this problem and others (such as setting up desired SET-parameter values) would go away. True. Languages that keep a long-running context (like application servers etc) can do this easily. Although in the newer versions of PHP, it's not so bad, pconnect seems to work (ie. it will issue ROLLBACKs when the script dies, reset session variables like enable_indexscan, etc), so the only remaining problem seems to be prepared statements. And again, adding a method for the application to know if the persistent connection is new or not, will not work in a connection pool... Perhaps a GUC flag saying EXECUTE should raise an error but not kill the current transaction if the requested prepared statement does not exist ? Then the application would issue a PREPARE. It could also raise a non-fatal error when the tables have changed (column added, for instance) so the application can re-issue a PREPARE. But I still think it would be cleaner to do it in the server. 
Also, I rethought about what Gregory Stark said : The contention on the shared cache is likely to negate much of the planning savings but I think it would still be a win. If a shared plan cache is implemented, it will mostly be read-only, ie. when the application is started, new queries will come, so the plans will have to be written to the cache, but then once the cache contains everything it needs, it will not be modified that often, so I wouldn't think contention would be such a problem... It's not so easy as all that. Consider search_path. Consider temp tables. Temp tables : I thought plan revalidation took care of this ? (After testing, it does work, if a temp table is dropped and recreated, PG finds it, although of course if a table is altered by adding a column for instance, it logically fails). search_path: I suggested to either put the search_path in the cache key along with the SQL string, or force queries to specify schema.table for all tables. It is also possible to shoot one's foot with the current PREPARE (ie. search_path is used to PREPARE but of course not for EXECUTE), and also with plpgsql functions (ie. the search path used to compile the function is the one that is active when it is compiled, ie at its first call in the current connection, and not the search path that was active when the function was defined)... 
SET search_path TO DEFAULT;
CREATE SCHEMA a;
CREATE SCHEMA b;
CREATE TABLE a.test( v TEXT );
CREATE TABLE b.test( v TEXT );
INSERT INTO a.test VALUES ('This is schema a');
INSERT INTO b.test VALUES ('This is schema b');

CREATE OR REPLACE FUNCTION test_search_path() RETURNS SETOF TEXT LANGUAGE plpgsql AS $$
DECLARE
    x TEXT;
BEGIN
    FOR x IN SELECT v FROM test LOOP
        RETURN NEXT x;
    END LOOP;
END;
$$;

test=> SET search_path TO a,public;
test=> SELECT * FROM test_search_path();
 test_search_path
------------------
 This is schema a

test=> \q
$ psql test
test=> SET search_path TO b,public;
test=> SELECT * FROM test_search_path();
 test_search_path
------------------
 This is schema b

test=> SET search_path TO a,public;
test=> SELECT * FROM test_search_path();
 test_search_path
------------------
 This is schema b
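The cache-key suggestion made earlier (put the search_path in the cache key along with the SQL string) can be sketched in a few lines; `get_plan` and the string "plan" it builds are hypothetical stand-ins for the expensive planning step, not any real PG internals:

```python
# Shared plan cache keyed on (search_path, SQL text), per the suggestion above.
plan_cache = {}

def get_plan(search_path, sql):
    """Return a cached 'plan' for this (search_path, SQL) pair, planning on miss."""
    key = (search_path, sql)
    if key not in plan_cache:
        # Pretend this is the expensive planner call.
        plan_cache[key] = "plan(%s @ search_path=%s)" % (sql, search_path)
    return plan_cache[key]

a = get_plan("a,public", "SELECT v FROM test")
b = get_plan("b,public", "SELECT v FROM test")
print(a != b)  # True: same SQL, different search_path -> different cached plan
```

With the search_path folded into the key, the psql session above would stop being surprising: each schema gets its own entry, and a repeated call is a pure cache hit.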
Re: [PERFORM] Partitioned tables - planner wont use indexes
I tried removing the index from the part_master table and got the same result Since all the data is in the partitions, the part_master table is empty, so the index is not useful for your query. myDB=# explain SELECT min(logdate) FROM part_master; Proposals : 1- Use plpgsql to parse the system catalogs, get the list of partitions, and issue a min() query against each 2- Since dates tend to be incrementing, I guess the minimum date must not be changing that often (unless you delete rows) ; therefore if you need that information often I suggest a trigger that updates a separate table which keeps the min_date (perhaps global or for each client, you choose). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] large tables and simple "= constant" queries using indexes
Thanks a lot, all of you - this is excellent advice. With the data clustered and statistics at a more reasonable value of 100, it now reproducibly takes even less time - 20-57 ms per query. 1000x speedup with proper tuning - always impressive, lol. IO seeks are always your worst enemy. After reading the section on "Statistics Used By the Planner" in the manual, I was a little concerned that, while the statistics sped up the queries that I tried immeasurably, that the most_common_vals array was where the speedup was happening, and that the values which wouldn't fit in this array wouldn't be sped up. Though I couldn't offhand find an example where this occurred, the clustering approach seems intuitively like a much more complete and scalable solution, at least for a read-only table like this. Actually, with statistics set to 100, then 100 values will be stored in most_common_vals. This would mean that the values not in most_common_vals will have less than 1% frequency, and probably much less than that. The choice of plan for these rare values is pretty simple. With two columns, "interesting" stuff can happen, like if you have col1 in [1...10] and col2 in [1...10] and use a condition on col1=const and col2=const, the selectivity of the result depends not only on the distribution of col1 and col2 but also their correlation. As for the tests you did, it's hard to say without seeing the explain analyze outputs. If you change the stats and the plan choice (EXPLAIN) stays the same, and you use the same values in your query, any difference in timing comes from caching, since postgres is executing the same plan and therefore doing the exact same thing. Caching (from PG and from the OS) can make the timings vary a lot. - Trying the same constant a second time gave an instantaneous result, I'm guessing because of query/result caching. PG does not cache queries or results. 
It caches data & index pages in its shared buffers, and then the OS adds another layer of the usual disk cache. A simple query like selecting one row based on PK takes about 60 microseconds of CPU time, but if it needs one seek for the index and one for the data it may take 20 ms waiting for the moving parts to move... Hence, CLUSTER is a very useful tool. Bitmap index scans love clustered tables because all the interesting rows end up being grouped together, so much less pages need to be visited. - I didn't try decreasing the statistics back to 10 before I ran the cluster command, so I can't show the search times going up because of that. But I tried killing the 500 meg process. The new process uses less than 5 megs of ram, and still reproducibly returns a result in less than 60 ms. Again, this is with a statistics value of 100 and the data clustered by gene_prediction_view_gene_ref_key. Killing it or just restarting postgres ? If you let postgres run (not idle) for a while, naturally it will fill the RAM up to the shared_buffers setting that you specified in the configuration file. This is good, since grabbing data from postgres' own cache is faster than having to make a syscall to the OS to get it from the OS disk cache (or disk). This isn't bloat. But what those 500 MB versus 6 MB show is that before, postgres had to read a lot of data for your query, so it stayed in the cache ; after tuning it needs to read much less data (thanks to CLUSTER) so the cache stays empty. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] large tables and simple "= constant" queries using indexes
Perfect - thanks Arjen. Using your value of 200 decreased the time to 15 seconds, and using a value of 800 makes it almost instantaneous. I'm really not concerned about space usage; if having more statistics increases performance this much, maybe I'll just default it to 1000? Strangely, the steps taken in the explain analyze are all the same. The only differences are the predicted costs (and execution times). explain analyze for a statistics of 200:

Actually, since you got the exact same plans and the second one is a lot faster, this can mean that the data is in the disk cache, or that the second query has all the rows it needs contiguous on disk whereas the first one has its rows all over the place. Therefore you are IO-bound. Statistics helped, perhaps (impossible to know since you don't provide the plan with statistics set to 10), but your main problem is IO. Usually setting the statistics to 100 is enough...

Now, here are some solutions to your problem, in random order:

- Install 64 bit Linux, 64 bit Postgres, and get lots of RAM, lol.
- Switch to a RAID10 (4 times the IOs per second, however zero gain if you're single-threaded, but massive gain when concurrent)
- If you just need a count by gene_ref, a simple solution is to keep it in a separate table and update it via triggers; this is a frequently used solution, and it works well unless gene_ref is updated all the time (which is probably not your case). Since you will be vacuuming this count-cache table often, don't put the count as a field in your sgd_annotations table; just create a small table with 2 fields, gene_ref and count (unless you want to use the count for other things and you don't like the join).

From your table definition gene_ref references another table. It would seem that you have many rows in gene_prediction_view with the same gene_ref value.

- If you often query rows with the same gene_ref, consider using CLUSTER to physically group those rows on disk.
This way you can get all rows with the same gene_ref in 1 seek instead of 2000. Clustered tables also make Bitmap scan happy. This one is good since it can also speed up other queries (not just the count). You could also cluster on (gene_ref,go_id) perhaps, I don't know what your columns mean. Only you can decide that because clustering order has to be meaningful (to group rows according to something that makes sense and not at random). * Lose some weight : CREATE INDEX ix_gene_prediction_view_gene_ref ON gene_prediction_view USING btree (gene_ref); - This index is useless since you have an UNIQUE on (gene_ref, go_id) which is also an index. Remove the index on (gene_ref), it will leave space in the disk cache for other things. - Since (gene_ref, go_id) is UNIQUE NOT NULL, you might be able to use that as your primary key, but only if it is never updated of course. Saves another index. - If you often do queries that fetch many rows, but seldom fetch the description, tell PG to always store the description in offline compressed form (read the docs on ALTER TABLE ... SET STORAGE ..., I forgot the syntax). Point being to make the main table smaller. - Also I see a category as VARCHAR. If you have a million different categories, that's OK, but if you have 100 categories for your 15M rows, put them in a separate table and replace that by a category_id (normalize !) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] large tables and simple "= constant" queries using indexes
Hi, I've started my first project with Postgres (after several years of using Mysql), and I'm having an odd performance problem that I was hoping someone might be able to explain the cause of. My query - select count(*) from gene_prediction_view where gene_ref = 523 - takes 26 seconds to execute, and returns 2400 (out of a total of 15 million records in the table) ---My problem--- Using a single-column index to count 2400 records which are exactly one constant value doesn't sound like something that would take 26 seconds. What's the slowdown? Any silver bullets that might fix this? * Please post an EXPLAIN ANALYZE of your query which will allow to choose between these two options : - If Postgres uses a bad plan (like a seq scan), you need to up the statistics for this column - If you get the correct plan (index scan or bitmap index scan) then it is likely that postgres does one disk seek per row that has to be counted. 26 seconds for 2400 rows would be consistent with a 10ms seek time. The unmistakable sign is that re-running the query will result in a very fast runtime (I'd say a couple ms for counting 2400 rows if no disk IO is involved). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
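The back-of-the-envelope seek arithmetic above (one random seek per counted row, ~10 ms per seek on a typical 7200 rpm disk) checks out against the observed runtime:

```python
# One uncached index-scan count: roughly one disk seek per matching row.
rows = 2400
seek_ms = 10                      # typical random-seek latency for a spinning disk

total_s = rows * seek_ms / 1000
print(total_s)  # 24.0 -- in the same ballpark as the observed 26 seconds
```

Which is why the second run is near-instant: once the pages are in cache there are no seeks left, only a few milliseconds of CPU.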
Re: [PERFORM] EXPLAIN detail
Well, this table has a primary key index on the first column, called acctid, which is an integer; the calldate column is a TIMESTAMPTZ, and in fact I'm using (calldate)::date in the ON clause because, since the time part of that column is always different, in the nesting I have to check that the date is the same... the other two columns (src and lastdata) are both VARCHAR(80), and the query is this one:

Tip for getting answers from this list: you should just post the output of "\d yourtable" from psql, it would be quicker than writing a paragraph... Be lazy, lol.

So, basically if I understand, you are doing a self-join on your table, you want all rows from the same day, and you're doing something with the dates, and... Tip for getting answers from this list: explain (in English) what your query actually does; someone might come up with a better idea on HOW to do it. Snip:

EXPLAIN ANALYZE SELECT (a.calldate)::date, a.src, a.dst, MIN(e.calldate) - a.calldate FROM cdr a INNER JOIN cdr e ON ((e.calldate)::date = (a.calldate)::date AND e.src = a.src AND e.lastdata = '/dati/ita/logoutok' AND e.calldate >= a.calldate) WHERE (a.calldate)::date = '2008-04-09' AND a.src = '410' AND substr(a.dst, 1, 4) = '*100' AND a.lastdata = '/dati/ita/loginok' GROUP BY a.calldate, a.src, a.dst

OK, I assume you have an index on calldate, which is a TIMESTAMPTZ? (In that case, why is it called calldate, and not calltimestamp?...) Bad news: the index is useless for this condition:

(a.calldate)::date = '2008-04-09'

There, you are asking postgres to scan the entire table, convert the column to date, and test. Bad. In order to use the index, you could rewrite it as something like:

a.calldate >= '2008-04-09' AND a.calldate < ('2008-04-09'::DATE + '1 DAY'::INTERVAL)

This is a RANGE query (just like BETWEEN), which is index-friendly.
Personally, I wouldn't do it that way: since you use the date (and not the time; I presume you only use the time for display purposes) I would just store the timestamptz in "calltimestamp" and the date in "calldate", with a trigger to ensure the date is set to calltimestamp::date every time a row is inserted/updated. This is better than a function index since you use that column a lot in your query; it will be slightly faster, and it will save a lot of timestamptz->date casts, hence CPU cycles. Try this last option (separate date column), and repost EXPLAIN ANALYZE of your query so it can be optimized further. Also, PLEASE don't use substr(); use a.dst LIKE '*100%', look in the manual. LIKE 'foo%' is indexable if you create the proper index.
Re: [PERFORM] bulk insert performance problem
I have a performance problem with a script that does massive bulk inserts into 6 tables. When the script starts, performance is really good, but it degrades minute after minute, and the script takes almost a day to finish!

Looks like foreign key checks slow you down.
- Batch INSERTs in transactions (1000-10000 per transaction).
- Run ANALYZE once in a while, so the FK checks use indexes.
- Are there any DELETEs in your script which might hit nonindexed REFERENCES... columns to cascade?
- Do you really need to check the FKs on the fly while inserting? ie. do you handle FK violations? Or perhaps your data is already consistent? In this case, load the data without any constraints (and without any indexes), and add the indexes and foreign key constraints after the loading is finished.
- Use COPY instead of INSERT.

If you use your script to process data, perhaps you could import the raw unprocessed data into a table (with COPY) and process it with SQL. This is usually much faster than doing a zillion inserts.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
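The "load first, constrain later" pattern above can be sketched like this (table names, file path and columns are hypothetical; it assumes your input data is already consistent):

```sql
-- Load the raw data in one streamed COPY instead of a zillion INSERTs.
CREATE TABLE staging_orders (LIKE orders INCLUDING DEFAULTS);
COPY staging_orders FROM '/tmp/orders.csv' WITH CSV;

-- Optional in-database processing: set-based SQL, not row-by-row logic.
INSERT INTO orders SELECT * FROM staging_orders;
DROP TABLE staging_orders;

-- Only now add indexes and FK constraints, then refresh statistics.
CREATE INDEX orders_customer_idx ON orders (customer_id);
ALTER TABLE orders
    ADD FOREIGN KEY (customer_id) REFERENCES customers (id);
ANALYZE orders;
```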
Re: [PERFORM] recommendations for web/db connection pooling or DBD::Gofer reviews
When traffic to our PostgreSQL-backed website spikes, the first resource we see being exhausted is the DB slots on the master server (currently set to about 400). I expect that as new Apache/mod_perl children are being put to use, they are creating new database connections. I'm interested in recommendations to funnel more of that traffic through fewer DB slots, if that's possible. (We could also consider increasing the handles available, since the DB server has some CPU and memory to spare.) I'm particularly interested in a review of DBD::Gofer, which seems like it would help with this in our Perl application: http://search.cpan.org/dist/DBI/lib/DBD/Gofer.pm I realize it has limitations, like "no transactions", but I think we would still be able to use it selectively in our application.

Under heavy load, Apache has the usual failure mode of spawning so many threads/processes and database connections that it just exhausts all the memory on the webserver and also kills the database. As usual, I would use lighttpd as a frontend (also serving static files) to handle the large number of concurrent connections to clients, and then have it funnel this to a reasonable number of perl backends, something like 10-30. I don't know if fastcgi works with perl, but with PHP it certainly works very well. If you can't use fastcgi, use lighttpd as an HTTP proxy with Apache and mod_perl behind it.

The recipe for good handling of heavy load is an asynchronous server (which by design can handle any number of concurrent connections, up to the OS's limit) in front of a small number of dynamic-webpage-generating threads/processes.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Forcing more agressive index scans for BITMAP AND
On Fri, 4 Apr 2008, Ow Mun Heng wrote:

select * from table where A=X and B = Y and C = Z and D = AA and E = BB

With that kind of WHERE condition, Postgres will use a Bitmap Index Scan to combine your indices. If, however, Postgres notices, while looking at the statistics gathered during ANALYZE, that for one of your columns you request a value that occurs in a large percentage of the rows (like 20%), and this value has a rather random distribution, Postgres will not bother scanning that index: it is very likely that all the pages would contain a row satisfying your condition anyway, so the time taken to scan this huge index and mark the bitmap would be lost, because it would not improve selectivity, since all the pages would get selected for scan anyway. I would guess that Postgres uses a Bitmap Index Scan only on those of your columns that have good selectivity (ie. lots of different values).

So: if you use conditions on (a,b) or (a,b,c) or (a,b,c,d) etc., you will benefit GREATLY from a multicolumn index on (a,b,c,d...). However, even if Postgres can use some clever tricks, a multicolumn index on (a,b,c,d) will not be optimal for a condition on (b,c,d), for instance. So, if you mostly use conditions on a left-anchored subset of (a,b,c,d,e), the multicolumn index will be a great tool. A multicolumn index on (a,b,c,d,e) is always slightly slower than an index on (a) alone if you only use a condition on (a), but it is immensely faster when you use a multicolumn condition.

Can you tell us more about what those columns mean and what you store in them, how many distinct values, etc.?

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
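To illustrate the left-anchored point (hypothetical table, the point is only the column order of the index):

```sql
CREATE TABLE t (a INT, b INT, c INT, d INT, e INT);
CREATE INDEX t_abcd_idx ON t (a, b, c, d);

-- Can use the index efficiently: the condition is a left-anchored
-- prefix (a,b,c) of the indexed column list (a,b,c,d).
SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;

-- Cannot constrain the leading column a, so the multicolumn index
-- is far less effective here; a separate index on (b) would help.
SELECT * FROM t WHERE b = 2 AND c = 3 AND d = 4;
```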
Re: [PERFORM] Planning a new server - help needed
Why do you claim that 'More platters also means slower seeks and generally slower performance.'?

More platters -> more heads -> heavier head assembly -> slower seek time. But... more platters -> higher density -> less seek distance (in mm of head movement) -> faster seek time. As usual, no clear-cut case; a real-life test would tell more interesting things.

I'm not entirely sure why the extra platters should really count as more moving parts, since I think the platter assembly and head assembly are both single parts in effect, albeit more massive with more platters. I'm not sure how much extra bearing friction that will mean, but it is reasonable that some extra energy is going to be needed. Since the bearings are only on one side of the axle (not both), a heavier platter assembly would put more stress on the bearing if the disk is subject to vibrations (like all those RAID disks seeking together), which would perhaps shorten its life. Everything with conditionals, of course ;)

I remember reading a paper on vibration from many RAID disks somewhere a year or so ago: vibration from other disks seeking at the exact same time and in the same direction would cause resonances in the housing chassis and disturb the heads of disks, slightly worsening seek times and reliability. But, on the other hand, the 7 disks raided in my home storage server never complained, even though the $30 computer case vibrates all over the place when they seek. Perhaps if they were subject to 24/7 heavy torture, a heavier/better-damped chassis would be a good investment.

It may be worth considering an alternative approach. I suspect that a good RAID1 or RAID1+0 is worthwhile for WAL, but...

Actually, now that 8.3 can sync to disk every second instead of at every commit, I wonder: did someone do some enlightening benchmarks?
I remember benchmarking 8.2 on a forum-style load: using a separate disk for WAL (SATA, write cache off) made a huge difference (as expected) versus one disk for everything (SATA, also write cache off). Postgres beat the crap out of MyISAM, lol. Seems like Postgres is one of the rare apps which gets faster and meaner with every release, instead of getting slower and more bloated like everyone else.

Also, there is a thing called write barriers, which supposedly could be used to implement fsync-like behaviour without the penalty, if the disk, the OS, the controller, and the filesystem all support it (that's a lot of ifs)... I haven't done this, so YMMV.

But the prices are getting interesting for OLTP, where most disks are massively oversized. The latest Samsung and SanDisk are expensive in the UK, but the Transcend 16GB TS16GSSD25S-S SATA is about $300 equiv, though it can do 'only' 'up to' 28MB/s write.

And Gigabyte should revamp their i-RAM to use ECC RAM of a larger capacity... and longer-lasting battery backup... I wonder how many write cycles those Flash drives can take before reliability becomes a problem.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] "Slow" query or just "Bad hardware"?
Also, sometimes offline TOASTing is evil. Say you have a forum: you want the posts table to be CLUSTER'ed on (topic_id, post_id), so displaying 1 page with 30 posts on it uses 1 seek, not 30 seeks. But CLUSTER doesn't touch the data that has been pushed offline into the toast table. So, in that case, it can pay (big time, actually) to disable toasting, store the data inline, and benefit from CLUSTER.

So basically:
- Data that is seldom used, or used only in queries returning/examining 1 row, but otherwise eats cache -> push it away (toast or partition).
- Data that is used very often in queries that return/examine lots of rows, especially if said rows are in sequence (cluster...) -> keep it inline.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
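A sketch of the forum case (hypothetical schema; SET STORAGE MAIN asks Postgres to keep the column inline, compressed, instead of pushing it to the toast table, so CLUSTER actually moves the post bodies too):

```sql
-- posts(topic_id INT, post_id INT, body TEXT, ...)
ALTER TABLE posts ALTER COLUMN body SET STORAGE MAIN;

-- Physically order the heap so one topic page = adjacent heap pages.
CREATE INDEX posts_topic_idx ON posts (topic_id, post_id);
CLUSTER posts USING posts_topic_idx;
ANALYZE posts;

-- Now fetching 30 posts of one topic page reads sequential pages:
SELECT * FROM posts WHERE topic_id = 42 ORDER BY post_id LIMIT 30;
```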
Re: [PERFORM] "Slow" query or just "Bad hardware"?
Hm, so this table has 10 million entries and it does not fit in 32GB of RAM? Could you investigate:
- the average size of rows in both tables
- a quick description of your table columns, especially the average size of your TEXT fields, especially the large one(s) like comments etc. (don't bother about INTs unless you have like 50 int columns)
- which fields get toasted, which don't; the number of accesses to TOASTed fields in this query could add 1 seek per field per fetched row if they're not cached
- other stuff in your database that is using those gigabytes of RAM? (indexes which are used often do count)

I would tend to think that you are not going to display 200 kilobytes of text on your listing webpage; most likely something like 100 or 200 bytes of text from each row, right? If that is the case, 10M rows * 200 bytes = 2G to keep cached in RAM, plus overhead, so it should work fast.

You may want to partition your table in two: one table holding the fields which are often used in bulk, search, and listings, especially when you list 200 rows, and the other table holding the large fields which are only displayed on the "show details" page. Note that one (or several) large text fields will not kill your performance; Postgres will store them offline (TOAST) for you without you needing to ask, so your main table stays small and well cached. Of course, if you grab that large 10 kB text field 200 times to display the first 80 characters of it followed by "..." on your listing page, then you're screwed ;) that's one of the things to avoid.

However, if your "comments" field is small enough that PG doesn't want to TOAST it offline (say, 500 bytes), but it still represents the bulk of your table size (for instance you have just a few INTs beside it that you want to quickly search on), then you may tell Postgres to store the large fields offline (EXTERNAL; check the docs), and also please enable automatic compression.
If, however, you have something like 200 INT columns, or a few dozen small TEXTs, or just way lots of columns, TOAST is no help, and in this case you must fight bloat by identifying which columns of your table need to be accessed often (for searches, listing, reporting, etc.) and which are not accessed often (ie. details page only, monthly reports, etc.). If you are lucky, the columns in the first group will form a much smaller subset of your gigabytes of data. Then, you partition your table in two (vertically), so the small table stays small.

EXAMPLE on a community site:
- members table, huge; search is slow; the join to the forum tables to get a user's name is horribly slow, because the cache is full and it seeks.
- push members' profiles and other data that only shows up on the details page to a second table: the main members table is much smaller, fits in RAM now, search is fast, and joins to members are also fast.

Word to remember: working set ;)

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
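The community-site example above, sketched with hypothetical tables (the split itself is the point, not the exact columns):

```sql
-- Hot table: only what listings, searches and joins actually touch.
CREATE TABLE members_hot (
    member_id   SERIAL PRIMARY KEY,
    login       TEXT NOT NULL,
    displayname TEXT NOT NULL
);

-- Cold table: everything that only the details page shows.
CREATE TABLE members_details (
    member_id  INT PRIMARY KEY REFERENCES members_hot (member_id),
    bio        TEXT,
    signature  TEXT,
    homepage   TEXT
);

-- Listings join only the small, well-cached table:
SELECT p.post_id, m.displayname
FROM posts p JOIN members_hot m USING (member_id);

-- The details page pays one extra fetch, which is fine:
SELECT * FROM members_hot JOIN members_details USING (member_id)
WHERE member_id = 42;
```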
Re: [PERFORM] how can a couple of expensive queries drag my system down?
is this 'normal'? (loaded question, I know)

Depends. If you are on the edge, disk-wise, yes, a big fat query can push it over and make it fall.

Should I be looking to offload expensive reporting queries to read-only replicants of my database?

You could do this, especially if the heavy queries involve reading gigabytes of data from disk (as reporting queries like to do). In that case, you can even use a cheap machine with cheap disks for the slave (even striped RAID), since data is duplicated anyway and all that matters is megabytes/second, not IOs/second.

Is this a symptom of slow disk?

vmstat will tell you this. If iowait time goes through the roof, yes, it's disk-bound. If CPU use goes to 100%, then it's CPU-bound.

improperly tuned postgres settings?

Also possible; you can try EXPLAIN on the problematic queries.

bad choice of OS, hardware, storage?

Depends on how your SAN handles load. No idea about that.

Is this a sign of disk contention?

Most probable.

How does CPU load come into play?

With 8 CPUs, less likely. (Your problem query can swamp at most 1 CPU, so if the machine grinds with 7 other cores still available for the usual work, it probably isn't CPU-bound.)

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 1-/2-dimensional indexes for common columns, rationale?
create index foo1 on bla (a); create index foo2 on bla (b); create index foo3 on bla (a,b); You say you need faster INSERT performance. Getting rid of some indexes is a way, but can you tell a bit more about your hardware setup ? For instance, if you only have one HDD, put an extra HDD in the machine, and put the database on it, but leave the pg_xlog on the OS's disk. Or the reverse, depending on which disk is faster, and other factors. Since heavy INSERTs mean heavy log writing traffic, this almost doubles your write bandwidth for the cost of a disk. Cheap and efficient. You can also put the indexes on a third disk, but separating database and log on 2 disks will give you the most benefits. If you already have a monster hardware setup, though... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
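If the hardware route is taken, the index-on-a-third-disk idea can be expressed with tablespaces (paths and names hypothetical; moving pg_xlog itself is done at the filesystem level, e.g. with a symlink, not via SQL):

```sql
-- A tablespace on the extra spindle, dedicated to indexes.
CREATE TABLESPACE idxdisk LOCATION '/mnt/disk3/pg_indexes';

-- Move the heavy indexes there; heap stays on the data disk.
ALTER INDEX foo1 SET TABLESPACE idxdisk;
ALTER INDEX foo3 SET TABLESPACE idxdisk;
```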
[PERFORM] Preparing statements on connection startup
Hello, I am using Postgres with PHP and persistent connections. For simple queries, parsing & preparing time is often longer than the actual query execution time... I would like to execute a bunch of PREPARE statements to prepare my most often used small queries on connection startup, then reuse these prepared statements during the whole life of the persistent connection. (Simple queries in PG are actually faster than in MySQL if prepared, lol.) How do I achieve this?

The best way would of course be a "PERSISTENT PREPARE", which would record the information (name, SQL, params, not the plan) about the prepared statement in a system catalog shared by all connections; when issuing EXECUTE, if the prepared statement does not exist in the current connection, pg would look there, and if it finds the name of the statement and the corresponding SQL, issue a PREPARE, so the current connection would then have this statement in its store and be able to execute it faster all the times this connection is reused. Is such a feature planned someday?

I tried to write a function which is called by my PHP script just after establishing the connection; it is a simple function which looks in pg_prepared_statements, and if it is empty, it issues the PREPARE statements I need. It works, no problem, but it is less elegant and needs one extra query per page.

I also tried to issue a dummy EXECUTE of a prepared "SELECT 1" just after establishing the connection: if it fails, we prepare the plans (by issuing queries from PHP); if it succeeds, this means we are reusing a connection with all the plans already prepared. This also works well.

What do you think? Regards, Pierre

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
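The first workaround described above might be sketched like this (statement and table names are hypothetical; the function is called once per page, right after the connection is obtained):

```sql
CREATE OR REPLACE FUNCTION prepare_my_statements() RETURNS void AS $$
BEGIN
    -- Only prepare if this (possibly reused persistent) connection
    -- hasn't already done so; prepared statements are per-session.
    IF NOT EXISTS (SELECT 1 FROM pg_prepared_statements
                   WHERE name = 'get_user') THEN
        EXECUTE 'PREPARE get_user (INT) AS
                 SELECT * FROM users WHERE user_id = $1';
        -- ... more PREPAREs for the other hot queries ...
    END IF;
END;
$$ LANGUAGE plpgsql;

-- From PHP, once per page: SELECT prepare_my_statements();
-- then: EXECUTE get_user (42);
```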
Re: [PERFORM] waiting for harddisk
i am using postgresql 8.1.8, with the following configuration: shared_buffers = 5000, work_mem = 65536, maintenance_work_mem = 65536, effective_cache_size = 16000, random_page_cost = 0.1. The CPU iowait percentage goes up to 50%, and the query result comes late; i am using a normal select query (select * from table_name). The table has more than 6 million records.

When you say SELECT *, are you selecting the WHOLE 6 million records? Without a WHERE? Or just a few rows? Please post the EXPLAIN ANALYZE of your query.

- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Simple query showing 270 hours of CPU time
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours of CPU time. Considering this is a very simple query, I was surprised to say the least. I was about to just kill the pid, but I figured I'd try and see exactly what it was stuck doing for so long. If you are using connection pooling, or if your client keeps the connections for a long time, this backend could be very old... With PHP's persistent connections, for instance, backends restart when you restart the webserver, which isn't usually very often. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] insert vs select into performance
It's the time to parse statements, plan, execute, do roundtrips with the client, context switches, time for your client library to escape the data and encode it and for postgres to decode it, etc. In a word: OVERHEAD.

I know there is some overhead, but that much when running it batched...?

Well, yeah ;)

Unfortunately it's not fast enough; it needs to be done in no more than 1-2 seconds (and in production it will be maybe 20-50 columns of data, perhaps divided over 5-10 tables). Additionally, it needs to scale to perhaps three times as many columns and perhaps 2-3 times as many rows in some situations, within 1 second. Further on, it needs to allow for about 20-50 clients reading much of that data before the next batch of data arrives.

Wow. What is the application?

Test run on a desktop PC, Athlon 64 3200+, 2 IDE disks in RAID1 (pretty slow):

test=> CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Temps : 11,463 ms
test=> INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;
INSERT 0 100000
Temps : 721,579 ms

OK, so you see, insert speed is pretty fast. With a better CPU and faster disks, you can get a lot more.

test=> TRUNCATE TABLE test;
TRUNCATE TABLE
Temps : 30,010 ms
test=> ALTER TABLE test ADD PRIMARY KEY (f);
INFO: ALTER TABLE / ADD PRIMARY KEY créera un index implicite «test_pkey» pour la table «test»
ALTER TABLE
Temps : 100,577 ms
test=> INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;
INSERT 0 100000
Temps : 1915,928 ms

This includes the time to update the index.
test=> DROP TABLE test;
DROP TABLE
Temps : 28,804 ms
test=> CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Temps : 1,626 ms
test=> CREATE OR REPLACE FUNCTION test_insert( ) RETURNS VOID LANGUAGE plpgsql AS $$ DECLARE _i INTEGER; BEGIN FOR _i IN 0..100000 LOOP INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i); END LOOP; END; $$;
CREATE FUNCTION
Temps : 51,948 ms
test=> SELECT test_insert();
test_insert
-
(1 ligne)
Temps : 1885,382 ms

Now you see, performing 100K individual INSERTs inside a plpgsql function is also fast. The postgres engine is pretty damn fast; it's the communication overhead that you feel, especially the switching between client and server processes.

Another example:

=> INSERT INTO test (a,b,c,d,e,f) VALUES (... 100000 integer tuples)
INSERT 0 100000
Temps : 1836,458 ms

VALUES is actually pretty fast. Here, there is no context switch; everything is done in 1 INSERT.

However, COPY is much faster, because the parsing overhead and the de-escaping of data are faster. COPY is optimized for throughput.

So, advice: for optimum throughput, have your application build chunks of data into text files and use COPY. Or, if your client lib supports the copy interface, use it. You will need a fast disk system with xlog and data on separate disks, several CPU cores (1 insert thread will max out 1 core, use the others for selects), lots of RAM so index updates don't need to seek, and tuning of bgwriter and checkpoints to avoid load spikes.

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] insert vs select into performance
I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds. (fyi, the inserts were batched, autocommit was turned off, and it all happened on the local machine)

Did you use prepared statements? Did you use INSERT INTO ... VALUES () with a long list of values, or just 100K insert statements?

It's the time to parse statements, plan, execute, do roundtrips with the client, context switches, time for your client library to escape the data and encode it and for postgres to decode it, etc. In a word: OVERHEAD. By the way, which language and client library are you using? FYI, 14s / 100k = 140 microseconds per individual SQL query. That ain't slow at all.

Does the select into translate into a specially optimised function in C that can cut corners which an insert can not do (e.g. lazy copying), or is it some other reason?

Yeah: instead of your client having to encode 100K * 8 values, send them over a socket, and postgres decoding them, INSERT INTO ... SELECT just takes the data and writes it. Same thing as writing a file one byte at a time versus using a big buffer.

The reason I am asking is that select into shows that a number of rows can be inserted into a table quite a lot faster than one would think was possible with ordinary sql. If that is the case, it means that if I write a pl/pgsql insert function in C instead of sql, then I can have my db perform an order of magnitude faster.

Fortunately this is already done for you: there is the PREPARE statement, which will remove the parsing overhead. If you must insert many rows, use VALUES (),(),()...

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
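The two suggestions above, side by side (the table is hypothetical; PREPARE removes per-statement parse/plan cost, multi-row VALUES removes per-row roundtrips):

```sql
-- Parse and plan once, execute many times:
PREPARE ins (INT, INT, INT) AS
    INSERT INTO test (a, b, c) VALUES ($1, $2, $3);
EXECUTE ins (1, 2, 3);
EXECUTE ins (4, 5, 6);
DEALLOCATE ins;

-- Or batch many rows into a single statement:
INSERT INTO test (a, b, c) VALUES
    (1, 2, 3),
    (4, 5, 6),
    (7, 8, 9);
```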
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
On Tue, 17 > We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration, and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof, as I won't be able to attend the tests. Our application (java + perl) and Postgres will run on the same server, whereas the application activity is low when Postgres has large transactions to process.

Please, can you be more specific about your application:
- what does it do?
- what kind of workload does it generate? [ie: many concurrent small queries (website); few huge queries; reporting; warehousing; all of the above; something else?]
- percentage and size of update queries?
- how many concurrent threads / connections / clients do you serve on a busy day? (I don't mean online users on a website, but ACTIVE concurrent database connections)

I assume you find your current server too slow, or foresee that it will become too slow soon, and want to upgrade, so:
- what makes the current server's performance inadequate? is it IO, CPU, RAM, a mix? in which proportions?

This is very important. If you go to the dealer and ask "I need a better vehicle", he'll sell you a Porsche. But if you say "I need a better vehicle to carry two tons of cinderblocks", he'll sell you something else, I guess. Same with database servers. You could need some humongous CPU power, but you might as well not. Depends.

There is a large gap between our current production server (Linux, 4GB RAM, 4 cpus) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

Define scalability. (No, this isn't a joke; I mean, you know your application: how would you like it to "scale"? How do you think it will scale? Why? What did you do so it would scale well? etc.)

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] PREPARE and stuff
Well, that's not completely trivial => the plan might depend upon the concrete value of $1,$2 and $3. When you use PREPARE, it doesn't. I could live with that. The purpose of this would be to have a library of "persistent prepared statements" (just like lightweight functions) for your application, and maximize the performance of persistent connections. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] PREPARE and stuff
Suppose a web application with persistent database connections. I have some queries which take longer to plan than to execute! I wish there was a way to issue a PREPARE (like a "PERSISTENT PREPARE"): all Postgres connections would then know that the prepared statement foo( $1, $2, $3 ) corresponds to some SQL query, but it wouldn't be planned yet, just like a SQL function. When invoking EXECUTE foo( 1,2,3 ) on any given connection, the statement would get prepared and planned; then on subsequent invocations I'd just get the previously prepared plan. Is this planned?

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Data transfer very slow when connected via DSL
I did not find a solution so far, and for bulk data transfers I have now programmed a workaround.

But that is surely based on some component installed on the server, isn't it?

Correct. I use a pyro remote server. On request, this remote server copies the relevant rows into a temporary table, uses a copy_to call to push them into a StringIO object (that's Python's version of an "in-memory file"), serializes that StringIO object, does bz2 compression, and transfers the whole block via VPN. I read on in this thread, and I scheduled a check on psycopg2 and what it is doing with cursors.

What about an SSH tunnel using data compression? If you fetch all rows from a query in one go, would it be fast? Also, PG can now COPY from a query, so you don't really need the temp table...

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
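The in-memory serialize-and-compress step of that workaround can be sketched roughly like this (a simplification: the real setup uses psycopg's copy_to against a temporary table on the server; here hypothetical helper names and plain CSV stand in for that):

```python
import bz2
import csv
import io


def compress_rows(rows):
    """Serialize rows to CSV in an in-memory file, then bz2-compress
    the whole block for transfer over the (slow) link."""
    buf = io.StringIO()          # the "in-memory file"
    csv.writer(buf).writerows(rows)
    return bz2.compress(buf.getvalue().encode("utf-8"))


def decompress_rows(blob):
    """Client side: decompress and parse the CSV back into rows."""
    text = bz2.decompress(blob).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))
```

Whether this beats a compressed SSH tunnel depends mostly on how chatty the client library is per row; the point of both approaches is to send one big compressed block instead of many small packets.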
[PERFORM] Short row header
I have this "poll results" table with just 3 integer fields, which is never updated, only inserted/deleted... Did the devs consider an option to have VACUUM reduce the row header sizes for tuples that are long committed and currently visible to all transactions? (Even if this made the tuples non-updateable: as long as they can be deleted, it would be OK for this type of table.)

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Volunteer to build a configuration tool
To me, the worst catch-22 we face in this area is that we'd like the optimizer's choices of plan to be stable and understandable, but the real-world costs of queries depend enormously on short-term conditions, such as how much of the table has been sucked into RAM recently by other queries. I have no good answer to that one.

Yeah, there is currently no way to tell the optimizer things like:
- this table/portion of a table is not frequently accessed, so it won't be in the cache, so please use low-seek plans (like bitmap index scan)
- this table/portion of a table is used all the time, so high-seek-count plans can be used, like index scan or nested loops, since everything is in RAM

Short of planner hints (argh), I see no way to give this information to the machine, since it's mostly in the mind of the DBA. Maybe a per-table "cache temperature" param (hot, warm, cold)? But what about the log table, the end of which is cached, but not the old records? It's messy. Still, PG does a pretty excellent job most of the time.

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
What version of PostgreSQL are you using?

I think newbies should be pushed a bit to use the latest versions, maybe with some advice on how to set up the apt sources (in debian/ubuntu) to get them.

How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL?

I also think Postgres newbies using PHP should be encouraged to use something like lighttpd/fastcgi instead of Apache. The fastcgi model permits the use of very few database connections and working PHP processes, since lighttpd handles all the slow transfers to the client asynchronously. You can do the same with two Apache instances, one serving static pages and acting as a proxy for the second Apache serving dynamic pages. With this setup, even low-end servers hold up well: for our personal sites, a friend and I share a dedicated server with 256MB of RAM, which we rent for 20€ a month. This thing will never run 200 Apache processes, but we have no problem with lighttpd/php-fcgi and postgres.

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Question: Does anyone have any idea what bottleneck I am hitting? An index's performance should in theory scale as the log of the number of rows; what am I missing here?

These can help the people on the list to help you:
- Your hardware config (CPU, RAM, disk)?
- EXPLAIN ANALYZE from slow queries?
- VACUUM and ANALYZE: yes? how often?
- VACUUM VERBOSE output? (For huge bits of text with long line lengths, mail sucks; upload it to a web host or something.)

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq