Re: [PERFORM] Slow Delete : Seq scan instead of index scan
On 10/16/2012 04:41 PM, Filippos Kalamidas wrote: the first thing you should probably do is run an 'analyze' on one of these tables and then run again the delete statement. if there are no stats for these tables, it's normal not to have very good plans. Yep, and the fact that the stats are that bad suggests that autovaccum probably isn't running, or isn't running often enough. If you have a high INSERT/UPDATE/DELETE load, then turn autovacuum up on that table. See: http://www.postgresql.org/docs/current/static/routine-vacuuming.html http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html If the table is badly bloated it might be worth running VACUUM FULL on it or (if you're on PostgreSQL 8.4 or below) instead CLUSTER the table on an index, as VACUUM FULL is very inefficient in 8.4 and older (I think; I might be misremembering the versions). Please specify your PostgreSQL version in all questions. See https://wiki.postgresql.org/wiki/Slow_Query_Questions -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] pgbounce max_client_conn and default_pool_size
What is the adequate *pgbounce* *max_client_conn ,default_pool_size* values for a postgres config which has *max_connections = 400*. We want to move to pgbouncer to let postgres do the only db job but it confused us. We have over 120 databases in a single postgres engine with as i said max_connections = 400 . 120+ databases are driven from 3 separate web servers, we are planning to install pgbounce to three of them. PS: in source code of pgbouncer it looks like opens separate fd's for max_client_conn * default_pool_size value. Regards, Yetkin Öztürk
[PERFORM] have: seq scan - want: index scan
Hi guys, PG = 9.1.5 OS = winDOS 2008R8 I have a table that currently has 207 million rows. there is a timestamp field that contains data. more data gets copied from another database into this database. How do I make this do an index scan instead? I did an analyze audittrailclinical to no avail. I tested different indexes - no same behavior. The query does this: SELECT audittrailclinical.pgid, audittrailclinical.timestamp, mmuser.logon, audittrailclinical.entityname, audittrailclinical.clinicalactivity, audittrailclinical.audittraileventcode, account.accountnumber, patient.dnsortpersonnumber FROM public.account, public.audittrailclinical, public.encounter, public.entity, public.mmuser, public.patient, public.patientaccount WHERE audittrailclinical.encountersid = encounter.encountersid and audittrailclinical.timestamp = '2008-01-01'::timestamp without time zone and audittrailclinical.timestamp = '2012-10-05'::timestamp without time zone AND encounter.practiceid = patient.practiceid AND encounter.patientid = patient.patientid AND encounter.staffid = patient.staffid AND entity.entitysid = audittrailclinical.entitysid AND mmuser.mmusersid = audittrailclinical.mmusersid AND patient.practiceid = patientaccount.practiceid AND patient.patientid = patientaccount.patientid AND patientaccount.accountsid = account.accountsid AND patientaccount.defaultaccount = 'Y' AND patient.dnsortpersonnumber = '347450' ; The query plan says: - Seq Scan on audittrailclinical (cost=0.00..8637598.76 rows=203856829 width=62) Filter: ((timestamp = '2008-01-01 00:00:00'::timestamp without time zone) AND (timestamp = '2012-10-05 00:00:00'::timestamp without time zone)) which takes forever. How do I make this do an index scan instead? I did an analyze audittrailclinical to no avail. the table definitions are (the createstamp field is empty - I know, bad data): CREATE TABLE audittrailclinical ( audittrailid text, audittraileventcode text, clinicalactivity text, eventsuccessful text, externalunique text, recordstamp timestamp without time zone, recorddescription text, encountersid integer, eventuserlogon text, computername text, applicationcode text, practiceid integer, mmusersid integer, entitysid integer, entityname text, timestamp timestamp without time zone, lastuser integer, createstamp timestamp without time zone, pgid bigint DEFAULT nextval(('bravepoint_seq'::text)::regclass) ) WITH ( OIDS=FALSE ); ALTER TABLE audittrailclinical OWNER TO intergy; GRANT ALL ON TABLE audittrailclinical TO intergy; GRANT SELECT ON TABLE audittrailclinical TO rb; -- Index: atc_en_time CREATE INDEX atc_en_time ON audittrailclinical USING btree (entitysid , timestamp ); -- Index: atc_id -- DROP INDEX atc_id; CREATE INDEX atc_id ON audittrailclinical USING btree (audittrailid COLLATE pg_catalog.default ); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
We've run into a perplexing issue with a customer database. He moved from a 9.1.5 to a 9.1.6 and upgraded from an EC2 m1.medium (3.75GB RAM, 1.3 GB shmmax), to an m2.xlarge (17GB RAM, 5.7 GB shmmax), and is now regularly getting constant errors regarding running out of shared memory (there were none on the old system in the recent couple of days' logs from before the upgrade): ERROR: out of shared memory HINT: You might need to increase max_pred_locks_per_transaction. The query causing this has structurally identical plans on both systems: old: http://explain.depesz.com/s/Epzq new: http://explain.depesz.com/s/WZo The settings ( select name, setting from pg_settings where source 'default' and name not like 'log%' and name not like 'ssl%' and name not like 'syslog%' ) are almost identical (max_pred_locks_per_transaction itself is at the default): 17c17 effective_cache_size | 153 --- effective_cache_size | 337500 38c38 shared_buffers | 424960 --- shared_buffers | 93696 The kernels are both 2.6.32. The workload has not changed significantly. Could something in 9.1.6 be to blame here? Looking at the changelog, this seems vanishingly unlikely. Any ideas? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
We've run into a perplexing issue with a customer database. He moved from a 9.1.5 to a 9.1.6 and upgraded from an EC2 m1.medium (3.75GB RAM, 1.3 GB shmmax), to an m2.xlarge (17GB RAM, 5.7 GB shmmax), and is now regularly getting constant errors regarding running out of shared memory (there were none on the old system in the recent couple of days' logs from before the upgrade): ERROR: out of shared memory HINT: You might need to increase max_pred_locks_per_transaction. The query causing this has structurally identical plans on both systems: old: http://explain.depesz.com/s/Epzq new: http://explain.depesz.com/s/WZo The settings ( select name, setting from pg_settings where source 'default' and name not like 'log%' and name not like 'ssl%' and name not like 'syslog%' ) are almost identical (max_pred_locks_per_transaction itself is at the default): 17c17 effective_cache_size | 153 --- effective_cache_size | 337500 38c38 shared_buffers | 424960 --- shared_buffers | 93696 The kernels are both 2.6.32. The workload has not changed significantly. Could something in 9.1.6 be to blame here? Looking at the changelog, this seems vanishingly unlikely. Any ideas? What are the settings for: work_mem maintenance_work_mem How many concurrent connections are there? Have you ran explain analyze on the query that doesn't crash (i.e the old box) to get the exact execution plan? Has the DB been vacuum analyzed? Cheers = Romax Technology Limited Rutherford House Nottingham Science Technology Park Nottingham, NG7 2PZ England Telephone numbers: +44 (0)115 951 88 00 (main) For other office locations see: http://www.romaxtech.com/Contact = === E-mail: i...@romaxtech.com Website: www.romaxtech.com = Confidentiality Statement This transmission is for the addressee only and contains information that is confidential and privileged. Unless you are the named addressee, or authorised to receive it on behalf of the addressee you may not copy or use it, or disclose it to anyone else. If you have received this transmission in error please delete from your system and contact the sender. Thank you for your cooperation. =
Re: [PERFORM] have: seq scan - want: index scan
On Tue, Oct 16, 2012 at 4:45 PM, Chris Ruprecht ch...@cdrbill.com wrote: Hi guys, PG = 9.1.5 OS = winDOS 2008R8 I have a table that currently has 207 million rows. there is a timestamp field that contains data. more data gets copied from another database into this database. How do I make this do an index scan instead? I did an analyze audittrailclinical to no avail. I tested different indexes - no same behavior. The query plan says: - Seq Scan on audittrailclinical (cost=0.00..8637598.76 rows=203856829 width=62) Filter: ((timestamp = '2008-01-01 00:00:00'::timestamp without time zone) AND (timestamp = '2012-10-05 00:00:00'::timestamp without time zone)) which takes forever. How do I make this do an index scan instead? I did an analyze audittrailclinical to no avail. analyze says 203 million out of 207 million rows are matched by your timestamp filter, so it is definitely going to favour a sequential scan, since an index scan that matches that many rows will inevitably be slower than simply scanning the table, since it will have to both do the lookups and load the actual records from the table (all of them, basically) in order to determine their visibility to you, so your index scan will just turn sequential access of the table pages into random access and require index lookups as well. You can possibly verify this by setting enable_seqscan to false and running your analyze again and see how the plan changes, though I don't believe that will necessarily remove all sequential scans, it just reduces their likelihood, so you may see that nothing changes. If the estimate for the number of matching rows is incorrect, you'll want to increase the statistics gathering for that table or just that column. ALTER TABLE table ALTER COLUMN column SET STATISTICS number where number is between 10 and 1000 and I think the default is 100. Then re-analyze the table and see if the query plan shows better estimates. I think 9.2 also supports index only scans which eliminate the need to load the matched records in certain circumstances. However, all of the columns used by the query would need to be in the index, and you are using an awful lot of columns between the select clause and the table joins. Are you lacking indexes on the columns used for joins that would allow more selective index scans on those columns which could then just filter by timestamp? I'm not much of an expert on the query planner, so I'm not sure what exactly will cause that behaviour, but I'd think that good statistics and useful indexes should allow the rest of the where clause to be more selective of the rows from audittrailclinical unless patientaccount.defaultaccount = 'Y' and patient.dnsortpersonnumber = '347450' are similarly non-selective, though patient.dnsortpersonnumber would seem like it is probably the strong filter, so make sure you've got indexes and accurate stats on all of the foreign keys that connect patient table and audittrailclinical table. It'd be useful to see the rest of the explain analyze output so we could see how it is handling the joins and why. Note that because you have multiple composite foreign keys joining tables in your query, you almost certainly won't those composite keys in a single index. If you have indexes on those columns but they are single-column indexes, that may be what is causing the planner to try to filter the atc table on the timestamp rather than via the joins. I'm sure someone more knowledgable than I will be along eventually to correct any misinformation I may have passed along. Without knowing anything about your schema or the rest of the explain analyze output, I'm mostly just guessing. There is an entire page devoted to formulating useful mailing list questions, incidentally. Yours really isn't. Or if the atc table definition is complete, you are definitely missing potentially useful indexes, since you are joining to that table via encountersid and you don't show an index on that column - yet that is the column that eventually joins out to the patient and patientaccount tables, which have the stronger filters on them. Incidentally, why the join to the entity table via entitysid? No columns from that table appear to be used anywhere else in the query. --sam
Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
On Wed, Oct 17, 2012 at 1:53 AM, Martin French martin.fre...@romaxtech.com wrote: Thanks for your response. What are the settings for: work_mem 100MB maintenance_work_mem 64MB How many concurrent connections are there? ~20 Have you ran explain analyze on the query that doesn't crash (i.e the old box) to get the exact execution plan? I can try that in the morning, but I didn't think this was relevant. I know cost estimates can be off, but can the plan actually change between a vanilla explain and an explain analyze? Has the DB been vacuum analyzed? Not outside of autovacuum, no, but it's actually a former replica of the first database (sorry I neglected to mention this earlier). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
On Wed, Oct 17, 2012 at 1:53 AM, Martin French martin.fre...@romaxtech.com wrote: Thanks for your response. What are the settings for: work_mem 100MB This is a little higher than I would ordinarily set. I tend to cap at about 64MB maintenance_work_mem 64MB In Contrast, this is a little low for me, but I guess that table size is a big factor here. How many concurrent connections are there? ~20 Have you ran explain analyze on the query that doesn't crash (i.e the old box) to get the exact execution plan? I can try that in the morning, but I didn't think this was relevant. I know cost estimates can be off, but can the plan actually change between a vanilla explain and an explain analyze? The explain analyze gives a more detailed output. Has the DB been vacuum analyzed? Not outside of autovacuum, no, but it's actually a former replica of the first database (sorry I neglected to mention this earlier). This may be worthwhile. Even with autovacuum on, I still Vac Analyze manually during quiet periods. whether it's actually necessary or not, figure it's belt and braces. Looking at the explain, It'd suggest the tables aren't very large, so I can't see really why there'd be a problem. Notwithstanding the fact that you are only relatively small shared_buffers. Are there no other messages in the log files re: out of memory. There should be a dump which will show you where the memory usage is occurring. Other than that, you may want to consider increasing the shared buffers and see if that has any effect. Alternately, you may want to increase max_pred_locks_per_transaction beyond the default of 64, although this is not a parameter I've had to yet adjust. Cheers = Romax Technology Limited Rutherford House Nottingham Science Technology Park Nottingham, NG7 2PZ England Telephone numbers: +44 (0)115 951 88 00 (main) For other office locations see: http://www.romaxtech.com/Contact = === E-mail: i...@romaxtech.com Website: www.romaxtech.com = Confidentiality Statement This transmission is for the addressee only and contains information that is confidential and privileged. Unless you are the named addressee, or authorised to receive it on behalf of the addressee you may not copy or use it, or disclose it to anyone else. If you have received this transmission in error please delete from your system and contact the sender. Thank you for your cooperation. =
Re: [PERFORM] SELECT AND AGG huge tables
Hi all, Thanks for your advice and the link about posting my question in an appropriate form. Here are the info. I thank all of you in advance. Best regards Houman Postgres version: 9.1.4 = Postgres.conf max_connections = 100 shared_buffers = 8192MB work_mem = 500MB log_statement = 'none' datestyle = 'iso, mdy' lc_messages = 'en_US.UTF-8' lc_monetary = 'en_US.UTF-8' lc_numeric = 'en_US.UTF-8' lc_time = 'en_US.UTF-8' default_text_search_config = 'pg_catalog.english' max_locks_per_transaction = 256 = Hardware: CPU Quad Core Intel CPU processor : 0-7 vendor_id : GenuineIntel cpu family : 6 model : 45 model name : Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz Memory: MemTotal: 32927920 kB HDD: OCZ VeloDrive - Solid-State-Disk - 600 GB - intern - PCI Express 2.0 x8 Multi-Level-Cell (MLC) PCI Express 2.0 x8 IO/stat=== iostat sdb1 1 Linux 3.2.0-23-generic (regula2)10/17/2012 _x86_64_(8 CPU) Device:tpskB_read/skB_wrtn/skB_readkB_wrtn sdb1 6.44 217.91 240.45 1956400373 2158777589 sdb1 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0 sdb1 0.00 0.00 0.00 0 0 =vmstat== procs ---memory-- ---swap-- -io -system-- cpu r b swpd free buff cache si sobibo in cs us sy id wa 1 0 44376 2417096 210784 2866402400303500 0 0 100 0 0 0 44376 2416964 210784 2866402400 0 0 80 138 0 0 100 0 1 0 44376 2416592 210784 2866402400 0 0 278 228 7 0 93 0 1 0 44376 2416592 210784 2866428000 0 0 457 305 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 472 303 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 462 296 13 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 478 293 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 470 317 12 0 87 0 1 0 44376 2416716 210784 2866428000 0 0 455 299 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 459 301 12 0 87 0 1 0 44376 2416716 210784 2866428000 0 0 370 291 7 5 88 0 1 0 44376 2416716 210784 2866428000 029 459 319 12 1 88 0 1 0 44376 2416716 210784 2866428000 0 0 453 295 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 449 284 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 8 462 304 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 459 307 12 0 88 0 2 0 44376 2416716 210784 2866428000 0 0 461 300 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 457 299 12 0 87 0 1 0 44376 2416716 210784 2866428000 0 0 439 295 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 439 306 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 448 305 12 0 88 0 1 0 44376 2416716 210784 2866428000 0 0 457 289 12 0 88 0 0 0 44376 2416716 210784 2866428000 0 0 174 179 3 0 97 0 0 0 44376 2416716 210784 2866428000 0 0 73 133 0 0 100 0 0 0 44376 2416716 210784 2866428000 0 0 75 133 0 0 100 0 0 0 44376 2416716 210784 2866428000 0 0 70 127 0 0 100 0 Column |Type | Modifiers ---+-+--- modifying_action | integer | modifying_client | integer | modification_time | timestamp without time zone | instance_entity | integer | id| integer | not null default nextval('enigma.fact_seq'::regclass) successor | integer | reporting_date| integer | legal_entity | integer | client_system | integer | customer | integer | customer_type | integer | borrower | integer | nace | integer | lsk | integer | review_date
[PERFORM] High cost estimates when n_distinct is set
Hello I am working on a potentially large database table, let's call it observation, that has a foreign key to table measurement. Each measurement is associated with either none or around five observations. In this kind of situation, it is well known that the statistics on the foreign key column in observation table can get arbitrarily bad as the row count increases. Especially, the estimate of the number of distinct values in the foreign key column can be completely off. To combat this issue I have set n_distinct=-0.2 on the foreign key column. With this, the query planner gets good estimates of row counts, but it would appear that this setting does not affect the cost estimate of an index scan. With this, I get odd cost estimates, as if fetching these approximately five rows using index scan would take hundreds of random disk reads. Due to this high cost estimate, when joining these two tables, PostgreSQL changes from using multiple small index scans to scanning the whole table a lot earlier than would be beneficial. So, in more detail: I am using PostgreSQL 9.2.1 on Windows 7 SP 1 installed with EnterpriseDB one-click installer and with default settings. SELECT version(); PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit I have set up a simple testing database as follows (running these commands takes around an hour on my DB. It's slow, sorry, but I need lots of rows to show the issue): CREATE TABLE observation ( id bigserial NOT NULL, measurement_id bigint NOT NULL, CONSTRAINT observation_pkey PRIMARY KEY (id) ); CREATE INDEX observation_measurement_id_idx ON observation USING btree (measurement_id); INSERT INTO observation SELECT x as id, CASE WHEN (x - 1) % 15 3 THEN ((x - 1) / 15) * 4 WHEN (x - 1) % 15 8 THEN ((x - 1) / 15) * 4 + 1 ELSE ((x - 1) / 15) * 4 + 2 END AS measurement_id FROM (SELECT generate_series(1, 1) AS x) AS series; ANALYZE observation; Here the measurement_id stands for the foreign key to table measurement. Actually having that table is not required to show the issue, though. Each number from range 1...2666 (1e8 * 4 / 15) appears 0, 3, 5 or 7 times in measurement_id column. After this, the statistics on measurement_id column look something like this: Distinct Values 1.23203e+006 Most Common Values {6895590, 8496970, 23294094, 75266, 128877, 150786, 175001, 192645, 216918, 262742, ... Most Common Frequencies {0.0001, 0.0001, 0.0001, 6.7e-005, 6.7e-005, 6.7e-005, 6.7e-005, 6.7e-005, 6.7e-005, 6.7e-005, ... Histogram Bounds {14, 208364, 511400, 840725, 1091642, 1392585, 1713998, 1945482, 2204897, 2476654, ... Correlation 1 As there are 20e6 distinct values, the 1.2e6 estimate is already somewhat off and will keep on going worse if more rows are added. Let's have a look on the query plan of fetching all observations of a single measurement: EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS on, TIMING on ) SELECT id, measurement_id FROM observation WHERE measurement_id = 21; Index Scan using observation_measurement_id_idx on public.observation (cost=0.00..119.36 rows=82 width=16) (actual time=0.060..0.062 rows=5 loops=1) Output: id, measurement_id Index Cond: (observation.measurement_id = 21) Buffers: shared read=5 Total runtime: 0.081 ms The row estimate is off by a factor of 10, so let's help the planner and tell how many rows it is likely to find: ALTER TABLE observation ALTER COLUMN measurement_id SET (n_distinct=-0.2); ANALYZE observation; This doesn't change the row statistics noticeably, except for changing the number of distinct values. EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS on, TIMING on ) SELECT id, measurement_id FROM observation WHERE measurement_id = 21; Index Scan using observation_measurement_id_idx on public.observation (cost=0.00..118.01 rows=5 width=16) (actual time=0.060..0.061 rows=5 loops=1) Output: id, measurement_id Index Cond: (observation.measurement_id = 21) Buffers: shared read=5 Total runtime: 0.073 ms So, the row count estimate is now good, but the cost estimate has not changed much. If I halve the random_page_cost (from 4 to 2), it nearly exactly halves the estimated cost, so it would appear this cost is almost completely from random page accesses, approximately 29 of them. In the original plan this made sense, as it expected to fetch 81 rows, but for five rows it would seem quite excessive. Enlarging the table from 100 million to 200 million rows almost doubles the estimated cost (from 118.01 to 227.69). Since the estimated and actual count of returned rows stay the same and the query is using a B tree, I'd expect the cost to rise only slightly. I have a distinct feeling that this is either a bug in the cost estimator or there's a quite valid reason which I have missed. Regardless, any insights you might have to this issue would be appreciated.
Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
Maciek Sakrejda m.sakre...@gmail.com writes: We've run into a perplexing issue with a customer database. He moved from a 9.1.5 to a 9.1.6 and upgraded from an EC2 m1.medium (3.75GB RAM, 1.3 GB shmmax), to an m2.xlarge (17GB RAM, 5.7 GB shmmax), and is now regularly getting constant errors regarding running out of shared memory (there were none on the old system in the recent couple of days' logs from before the upgrade): ERROR: out of shared memory HINT: You might need to increase max_pred_locks_per_transaction. This has nothing to do with work_mem nor maintenance_work_mem; rather, it means you're running out of space in the database-wide lock table. You need to take the hint's advice. The query causing this has structurally identical plans on both systems: old: http://explain.depesz.com/s/Epzq new: http://explain.depesz.com/s/WZo The query in itself doesn't seem very exceptional. I wonder whether you recently switched your application to use serializable mode? But anyway, a query's demand for predicate locks can depend on a lot of not-very-visible factors, such as how many physical pages the tuples it accesses are spread across. I don't find it too hard to credit that yesterday you were just under the limit and today you're just over even though nothing changed. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 - 9.1.6
On Wed, Oct 17, 2012 at 7:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: ERROR: out of shared memory HINT: You might need to increase max_pred_locks_per_transaction. This has nothing to do with work_mem nor maintenance_work_mem; rather, it means you're running out of space in the database-wide lock table. You need to take the hint's advice. Sure, just trying to understand why this happened in the first place. The query in itself doesn't seem very exceptional. I wonder whether you recently switched your application to use serializable mode? The change (for some transactions) was relatively recent, but predated the switch to the replica by several days. Before the switch, everything was running fine. But anyway, a query's demand for predicate locks can depend on a lot of not-very-visible factors, such as how many physical pages the tuples it accesses are spread across. I don't find it too hard to credit that yesterday you were just under the limit and today you're just over even though nothing changed. Interesting, thanks for the input. So it could be just a coincidence that the errors occurred in lock-step with the promotion? Or does a replica have a different (or different enough) physical layout that this could have been a factor (my understanding of replication is relatively high level--read: vague)? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Two identical systems, radically different performance
On 10/15/2012 05:34 PM, Scott Marlowe wrote: On Mon, Oct 15, 2012 at 9:28 AM, Claudio Freire klaussfre...@gmail.com wrote: On Mon, Oct 15, 2012 at 12:24 PM, Andrea Suisani sick...@opinioni.net wrote: sure you're right. It's just that my bet was on a higher throughput when HT was isabled from the BIOS (as you stated previously in this thread). Yes, mine too. It's bizarre. If I were you, I'd look into it more deeply. It may be a flaw in your test methodology (maybe you disabled the wrong cores?). If not, it would be good to know why the extra TPS to replicate elsewhere. I'd recommend more synthetic benchmarks when trying to compare systems like this. bonnie++, you were right. bonnie++ (-f -n 0 -c 4) show that there's very little (if any) difference in terms of sequential input whether or not cache is enabled on the RAID1 (SAS 15K, sdb). I've run 2 bonnie++ test with both cache enabled and disabled and what I get (see attachments for more details) it's a 400MB/s sequential input (cache) vs 390MBs (nocache). I dunno why but I would have expected a higher delta (due to the 512MB cache) not a mere 10MB/s, but this is only based on my gut feeling. I've also tried to test RAID1 array where the OS is installed (2 SATA 7.2Krpm, sda) just to verify if cache effect is comparable with the one I get from SAS disks. Well it seems that there's no cache effects or if it's is there is so small as to be confused with the noise. Both array are configured with this params Read Policy : Adaptive Read Ahead Write Policy : Write Back Stripe Element Size : 64 KB Disk Cache Policy : Disabled those tests are performed with HT disable from the BIOS, but without using noht kernel boot param. the scheduler for sdb was setted to deadline while the default cfq for sda. the memory stream test that Greg Smith was working on, and so on. this one https://github.com/gregs1104/stream-scaling, right? I've executed the test with HT enabled, HT disabled from the BIOS and HT disable using sys interface. Attached 3 graphs and related text files Get an idea what core differences the machines display under such testing. I'm trying... hard :) Andrea Version 1.96 --Sequential Output-- --Sequential Input- --Random- Concurrency 4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP cloud32000M 180829 23 112581 15 401611 19 917.7 10 Latency 445ms 954ms 360ms 64788us 1.96,1.96,cloud,4,1350463530,32000M180829,23,112581,15,,,401611,19,917.7,10,,,445ms,954ms,,360ms,64788us,, Version 1.96 --Sequential Output-- --Sequential Input- --Random- Concurrency 4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP cloud32000M 179259 23 112093 15 400021 18 600.9 26 Latency 670ms 514ms 99640us 86081us 1.96,1.96,cloud,4,1350465025,32000M179259,23,112093,15,,,400021,18,600.9,26,,,670ms,514ms,,99640us,86081us,, Version 1.96 --Sequential Output-- --Sequential Input- --Random- Concurrency 4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP cloud32000M 174212 22 108909 15 387598 18 928.9 11 Latency 960ms1397ms 967ms 71495us 1.96,1.96,cloud,4,1350465002,32000M174212,22,108909,15,,,387598,18,928.9,11,,,960ms,1397ms,,967ms,71495us,, Version 1.96 --Sequential Output-- --Sequential Input- --Random- Concurrency 4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP cloud32000M 175185 22 109229 15 391409 18 912.9 10 Latency1002ms1225ms 1059ms 91373us 1.96,1.96,cloud,4,1350466614,32000M175185,22,109229,15,,,391409,18,912.9,10,,,1002ms,1225ms,,1059ms,91373us,, attachment: stream-ht_disabled_bios.png=== CPU cache information === [288/809] CPU /sys/devices/system/cpu/cpu0 Level 1 Cache: 32K (Data) CPU /sys/devices/system/cpu/cpu0 Level 1 Cache: 32K (Instruction) CPU /sys/devices/system/cpu/cpu0 Level 2 Cache: 256K (Unified) CPU /sys/devices/system/cpu/cpu0 Level 3 Cache: 15360K (Unified) CPU /sys/devices/system/cpu/cpu1 Level 1 Cache: 32K (Data) CPU /sys/devices/system/cpu/cpu1 Level 1
Re: [PERFORM] Two identical systems, radically different performance
On Wed, Oct 17, 2012 at 9:45 AM, Andrea Suisani sick...@opinioni.net wrote: On 10/15/2012 05:34 PM, Scott Marlowe wrote: I'd recommend more synthetic benchmarks when trying to compare systems like this. bonnie++, you were right. bonnie++ (-f -n 0 -c 4) show that there's very little (if any) difference in terms of sequential input whether or not cache is enabled on the RAID1 (SAS 15K, sdb). I'm mainly wanting to know the difference between the two systems, so if you can run it on the old and new machine and compare that that's the real test. I've run 2 bonnie++ test with both cache enabled and disabled and what I get (see attachments for more details) it's a 400MB/s sequential input (cache) vs 390MBs (nocache). I dunno why but I would have expected a higher delta (due to the 512MB cache) not a mere 10MB/s, but this is only based on my gut feeling. Well the sequential throughput doesn't really rely on caching. It's the random writes that benefit from caching, and the other things (random reads and seq read/write) that indirectly benefit because the random writes are so much faster that they no longer get in the way. So mostly compare random access between the old and new machines and look for differences there. the memory stream test that Greg Smith was working on, and so on. this one https://github.com/gregs1104/stream-scaling, right? Yep. I've executed the test with HT enabled, HT disabled from the BIOS and HT disable using sys interface. Attached 3 graphs and related text files Well it's pretty meh. I'd like to see the older machine compared to the newer one here tho. I'm trying... hard :) You're doing great. These problems take effort to sort out. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] LIKE op with B-Tree Index?
On Tue, Oct 16, 2012 at 8:01 PM, Sam Wong s...@hellosam.net wrote: On Wednesday, October 17, 2012 4:30, Merlin Moncure wrote, On Tue, Oct 16, 2012 at 3:15 AM, Sam Wong s...@hellosam.net wrote: Hi communities, I am investigating a performance issue involved with LIKE '%' on an index in a complex query with joins. The problem boils down into this simple scenario---: Scenario My database locale is C, using UTF-8 encoding. I tested this on 9.1.6 and 9. 2.1. Q1. SELECT * FROM shipments WHERE shipment_id LIKE '12345678%' Q2. SELECT * FROM shipments WHERE shipment_id = '12345678' AND shipment_id '12345679' ...snip... Question Is Q1 and Q2 equivalent? From what I see and the result they seems to be the same, or did I miss something? (Charset: C, Encoding: UTF-8) If they are equivalent, is that a bug of the planner? They are most certainly not equivalent. What if the shipping_id is 12345678Z? merlin But '12345678Z' is indeed = '12345678' AND '12345679'. Just like 'apple' 'apples' 'apply' in a dictionary. Right -- I didn't visualize it properly. Still, you're asking the server to infer that since you're looking between to adjacent textual characters range bounded [) it convert the 'between' to a partial string search. That hold up logically but probably isn't worth spending cycles to do, particularly in cases of non-ascii mappable unicode characters. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] LIKE op with B-Tree Index?
Moncure wrote on Thursday, October 18, 2012 1:45 On Tue, Oct 16, 2012 at 8:01 PM, Sam Wong s...@hellosam.net wrote: On Wednesday, October 17, 2012 4:30, Merlin Moncure wrote, On Tue, Oct 16, 2012 at 3:15 AM, Sam Wong s...@hellosam.net wrote: Hi communities, I am investigating a performance issue involved with LIKE '%' on an index in a complex query with joins. The problem boils down into this simple scenario---: Scenario My database locale is C, using UTF-8 encoding. I tested this on 9.1.6 and 9. 2.1. Q1. SELECT * FROM shipments WHERE shipment_id LIKE '12345678%' Q2. SELECT * FROM shipments WHERE shipment_id = '12345678' AND shipment_id '12345679' ...snip... Question Is Q1 and Q2 equivalent? From what I see and the result they seems to be the same, or did I miss something? (Charset: C, Encoding: UTF-8) If they are equivalent, is that a bug of the planner? They are most certainly not equivalent. What if the shipping_id is 12345678Z? merlin But '12345678Z' is indeed = '12345678' AND '12345679'. Just like 'apple' 'apples' 'apply' in a dictionary. Right -- I didn't visualize it properly. Still, you're asking the server to infer that since you're looking between to adjacent textual characters range bounded [) it convert the 'between' to a partial string search. That hold up logically but probably isn't worth spending cycles to do, particularly in cases of non-ascii mappable unicode characters. merlin Postgresql did that already. Refer to the analyze result of Q1 and Q2, it gives Index Cond: ((shipment_id = '12345678'::text) AND (shipment_id '12345679'::text)) (I also just realized they did it just now) Yet, with additional Filter (ref Q1 analyze), it's surprisingly that it estimates Q1 will have more rows that Q2. FYI, I made a self-contained test case and submitted a bug #7610. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance