[PERFORM] Different query plans on same servers
I have a fairly simple query: SELECT FROM "tubesite_image" INNER JOIN "tubesite_object" ON ("tubesite_image"."object_ptr_id" = "tubesite_object"."id") WHERE "tubesite_object"."site_id" = 8 ORDER BY "tubesite_object"."pub_date" ASC LIMIT 21; That query is having a bad query plan on production server: Limit (cost=0.00..1938.67 rows=21 width=275) (actual time=3270.000..3270.000 rows=0 loops=1) -> Nested Loop (cost=0.00..792824.51 rows=8588 width=275) (actual time=3269.997..3269.997 rows=0 loops=1) -> Index Scan using tubesite_object_pub_date_idx on tubesite_object (cost=0.00..789495.13 rows=9711 width=271) (actual time=0.011..3243.629 rows=9905 loops=1) Filter: (site_id = 8) -> Index Scan using tubesite_image_pkey on tubesite_image (cost=0.00..0.33 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=9905) Index Cond: (tubesite_image.object_ptr_id = tubesite_object.id) Total runtime: 3270.071 ms But, when I turn off nested loops, the query flies: QUERY PLAN -- Limit (cost=31384.35..31384.40 rows=21 width=275) (actual time=37.988..37.988 rows=0 loops=1) -> Sort (cost=31384.35..31405.82 rows=8588 width=275) (actual time=37.986..37.986 rows=0 loops=1) Sort Key: tubesite_object.pub_date Sort Method: quicksort Memory: 25kB -> Hash Join (cost=857.00..31152.80 rows=8588 width=275) (actual time=37.968..37.968 rows=0 loops=1) Hash Cond: (tubesite_object.id = tubesite_image.object_ptr_id) -> Bitmap Heap Scan on tubesite_object (cost=596.77..30685.30 rows=9711 width=271) (actual time=7.414..25.132 rows=9905 loops=1) Recheck Cond: (site_id = 8) -> Bitmap Index Scan on tubesite_object_site_id (cost=0.00..594.34 rows=9711 width=0) (actual time=4.943..4.943 rows=9905 loops=1) Index Cond: (site_id = 8) -> Hash (cost=152.88..152.88 rows=8588 width=4) (actual time=4.620..4.620 rows=8588 loops=1) -> Seq Scan on tubesite_image (cost=0.00..152.88 rows=8588 width=4) (actual time=0.005..2.082 rows=8588 loops=1) Total runtime: 38.071 ms I have rsynced the database from the prod server to the test server, that has same configuration (shared buffers, work mem, estimated cache size, and so on), and there it chooses bitmap heap scan with hash join without disabling the nested loops. I have 8.4.8 on producion and 8.4.9 on test, could that explain the difference in plans chosen? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Different query plans on same servers
Mario Splivalo writes: > I have 8.4.8 on producion and 8.4.9 on test, could that explain the > difference in plans chosen? I'd wonder first if you have the same statistics settings on both. The big problem here is that the estimation of the join size is bad (8588 versus 0). regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Different query plans on same servers
Tom Lane wrote: > I'd wonder first if you have the same statistics settings on both. > The big problem here is that the estimation of the join size is > bad (8588 versus 0). But both servers develop that estimate for the join size. I was wondering more about whether the costing factors were really the same: slow: -> Nested Loop (cost=0.00..792824.51 rows=8588 width=275) (actual time=3269.997..3269.997 rows=0 loops=1) versus fast: -> Hash Join (cost=857.00..31152.80 rows=8588 width=275) (actual time=37.968..37.968 rows=0 loops=1) The hash join path must look more expensive on the first machine, for some reason. Mario, could you post the result of running this query from both servers?: http://wiki.postgresql.org/wiki/Server_Configuration -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Different query plans on same servers
"Kevin Grittner" wrote: > But both servers develop that estimate for the join size. [sigh] Those *were* both from the production server. Please show us the EXPLAIN ANALYZE from the other server. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Response time increases over time
Hi all, I am running a load simulation on Debian with PostgreSQL 8.4.9 (standard Debian package). Certain number of clients do the following stepsin a transaction (read commited level) periodically (about 1.1 transaction per second / client) and concurrently: -reads a record of table Machine and State (they each have about 300 records, read size is about 1.4 KB) -reads a record of table Card (it has about 1200 records) -reads some other records from other tables, all these are straightforward, single line queries (here there are even less records in the tables) -updates Machine (1 record, updates 2-3 integer values) -updates State (1 record, updates a bytea field, about 1,3KB) -updates Card (1 record, updates an integer) -inserts 1-1 record into 2 log tables Its important, that each client updates different records, so there is no update conflict. There are no triggers or rules. Tables have simple indexes, 2 at most. As I run the simulation with more and more clients, I can observe, that at the beginning of the simulation the transaction times are quite acceptable (20-30 ms) and quite uniform/smooth, but as the simultion progresses, it becomes higher (30-40-50-60 ms) and more and more non-uniform, but the tps interestingly remains the same during the simulation. With 100 clients this kind of behaviour can be seen very well. The simulation's duration is 500 sec. I wonder why this happens on this server, and how I can keep the response time as low as at the beginning. Just for comparison, I ran the same simulation on a Windows 7 notebook machine but with PostgreSQL 9.1.2 (downloaded from EnterpriseDB's site, not PostgreSQL Plus), and it did not show this problem even with 120 clients. It's transaction times were surprisingly smooth and consistent. The client code was the same in the 2 cases. Actually I ran first the test on the Windows machine, and after that on the better Debian. I expected that it would be even better there. Network latency is quite minimal, because the clients and the database server run on VMs on a server machine in the Linux case. Here is some important config variables from the 8.4 (9.1.2 is configured similarly): ssl=false shared_buffers=24MB (OS max currently, but should not be a problem because 9.1.2 performed quite well on Windows with 24 MB) work_mem=1MB maintainance_work_mem=16MB fsync=on sync_commit=on wal_sync_method=fsync full_page_writes=on wal_buffers=1MB commit_delay=0 checkpoint segments=8 effective_cache_size=256MB vacuum: default bgwriter: default I suspected that due to the lot of update, the tables get bloated with dead rows, but vacuum analyze verbose did not show that. It seems that something cannot keep up with the load, but tps does not change, just the response time gets higher. Could you please help me with what can cause this kind of behaviour on Linux? What setting should I change perhaps? Is there so much difference between 8.4 and 9.1, or is this something else? Please tell me if any other info is needed. Thanks in advance, Otto
Re: [PERFORM] Different query plans on same servers
On 12/06/2011 09:00 PM, Tom Lane wrote: > Mario Splivalo writes: >> I have 8.4.8 on producion and 8.4.9 on test, could that explain the >> difference in plans chosen? > > I'd wonder first if you have the same statistics settings on both. > The big problem here is that the estimation of the join size is bad > (8588 versus 0). They do, I guess. I did rsync postgres datadir from the prod server to the test server. The only difference is that prod server was a bit more loaded than the test server. Mario -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Different query plans on same servers
On 12/06/2011 09:17 PM, Kevin Grittner wrote: > > The hash join path must look more expensive on the first machine, > for some reason. > > Mario, could you post the result of running this query from both > servers?: > > http://wiki.postgresql.org/wiki/Server_Configuration Sure. Here is from the prod server: name | current_setting -+ version | PostgreSQL 8.4.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2, 64-bit checkpoint_segments | 64 default_statistics_target | 2000 effective_cache_size| 36GB external_pid_file | /var/run/postgresql/8.4-main.pid lc_collate | en_US.UTF-8 lc_ctype| en_US.UTF-8 listen_addresses| * log_autovacuum_min_duration | 0 log_checkpoints | on log_line_prefix | %t [%p]: [%l-1] [%d] log_min_duration_statement | 1s maintenance_work_mem| 256MB max_connections | 1500 max_stack_depth | 3MB port| 5432 server_encoding | UTF8 shared_buffers | 4GB statement_timeout | 30min temp_buffers| 4096 TimeZone| localtime track_activity_query_size | 2048 unix_socket_directory | /var/run/postgresql wal_buffers | 128MB work_mem| 64MB And here is from the test server: name| current_setting +-- version| PostgreSQL 8.4.9 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit checkpoint_segments| 64 default_statistics_target | 2000 effective_cache_size | 36GB external_pid_file | /var/run/postgresql/8.4-main.pid lc_collate | en_US.UTF-8 lc_ctype | en_US.UTF-8 listen_addresses | * log_connections| on log_disconnections | on log_line_prefix| %t [%p]: [%l-1] [%d] log_min_duration_statement | 0 maintenance_work_mem | 256MB max_connections| 40 max_stack_depth| 3MB port | 5432 server_encoding| UTF8 shared_buffers | 4GB ssl| on temp_buffers | 4096 TimeZone | localtime unix_socket_directory | /var/run/postgresql wal_buffers| 128MB work_mem | 64MB (24 rows) At the time of doing 'explain analyze' on the prod server there were cca 80 connections on the server. Mario -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Different query plans on same servers
On 12/06/2011 09:29 PM, Kevin Grittner wrote: > "Kevin Grittner" wrote: > >> But both servers develop that estimate for the join size. > > [sigh] Those *were* both from the production server. Please show > us the EXPLAIN ANALYZE from the other server. Huh, right... missed that one. Here is the 'explain analyze' from the other server: QUERY PLAN -- Limit (cost=31531.75..31531.80 rows=21 width=275) (actual time=45.584..45.584 rows=0 loops=1) -> Sort (cost=31531.75..31531.84 rows=36 width=275) (actual time=45.579..45.579 rows=0 loops=1) Sort Key: tubesite_object.pub_date Sort Method: quicksort Memory: 25kB -> Hash Join (cost=866.34..31530.82 rows=36 width=275) (actual time=45.544..45.544 rows=0 loops=1) Hash Cond: (tubesite_object.id = tubesite_image.object_ptr_id) -> Bitmap Heap Scan on tubesite_object (cost=606.11..31146.68 rows=9884 width=271) (actual time=6.861..37.497 rows=9905 loops=1) Recheck Cond: (site_id = 8) -> Bitmap Index Scan on tubesite_object_site_id (cost=0.00..603.64 rows=9884 width=0) (actual time=4.792..4.792 rows=9905 loops=1) Index Cond: (site_id = 8) -> Hash (cost=152.88..152.88 rows=8588 width=4) (actual time=3.816..3.816 rows=8588 loops=1) -> Seq Scan on tubesite_image (cost=0.00..152.88 rows=8588 width=4) (actual time=0.003..1.740 rows=8588 loops=1) Total runtime: 45.798 ms This is also a query from the prod server, but without LIMIT: Sort (cost=31713.95..31735.42 rows=8588 width=275) (actual time=60.311..60.311 rows=0 loops=1) Sort Key: tubesite_object.pub_date Sort Method: quicksort Memory: 25kB -> Hash Join (cost=857.00..31152.80 rows=8588 width=275) (actual time=60.255..60.255 rows=0 loops=1) Hash Cond: (tubesite_object.id = tubesite_image.object_ptr_id) -> Bitmap Heap Scan on tubesite_object (cost=596.77..30685.30 rows=9711 width=271) (actual time=8.682..49.721 rows=9905 loops=1) Recheck Cond: (site_id = 8) -> Bitmap Index Scan on tubesite_object_site_id (cost=0.00..594.34 rows=9711 width=0) (actual time=5.705..5.705 rows=9905 loops=1) Index Cond: (site_id = 8) -> Hash (cost=152.88..152.88 rows=8588 width=4) (actual time=4.281..4.281 rows=8588 loops=1) -> Seq Scan on tubesite_image (cost=0.00..152.88 rows=8588 width=4) (actual time=0.005..1.437 rows=8588 loops=1) Total runtime: 60.483 ms (12 rows) I will try to rsync prod database to 8.4.8 on test server tomorrow, and see what happens. Hopefully upgrade to 8.4.9 (or even 8.4.10 if Debian packages is by tomorrow) will solve the issue... Mario -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Response time increases over time
On 12/6/11 4:30 PM, Havasvölgyi Ottó wrote: > Is there so much difference between 8.4 and 9.1, or is this something else? > Please tell me if any other info is needed. It is fairly likely that the difference you're seeing here is due to improvements made in checkpointing and other operations made between 8.4 and 9.1. Is there some reason you didn't test 9.1 on Linux to compare the two? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Question about VACUUM
On 12/5/11 1:36 PM, Kevin Grittner wrote: > I understand the impulse to run autovacuum less frequently or less > aggressively. When we first started running PostgreSQL the default > configuration was very cautious. The default settings are deliberately cautious, as default settings should be. But yes, anyone with a really large/high-traffic database will often want to make autovac more aggressive. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance