[PERFORM] Explicit joins
I referred chapter 14.3 of postgres document version 9.0. explicit joins help the planner in planninng thus improve performance. On what relations are explicit joins to be added?? I am getting data from 10 tables in a view. I don't know on which pair of tables I have to add explicit joins to improve performance. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Explicit-joins-tp4372000p4372000.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres 9.0.4 + Hot Standby + FusionIO Drive + Performance = Query failed ERROR: catalog is missing 1 attribute(s) for relid 172226
I did the hot standby configured earlier and at that time I started using(querying) the standby database. May be something missed on the archive command. On Wed, May 4, 2011 at 8:44 PM, Josh Berkus j...@agliodbs.com wrote: While doing so I got the *Query failed ERROR: catalog is missing 1 attribute(s) for relid 172226*. Any idea on this error? Is that combination PG + HotSB + Fusion IO Drive is not advisable?! What were you doing when you got this error? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] amazon ec2
On Thu, May 5, 2011 at 1:02 AM, Josh Berkus j...@agliodbs.com wrote: FWIW, EnterpriseDB's InfiniCache provides the same caching benefit. The way that works is when PG goes to evict a page from shared buffers that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. So memcached basically replaces the filesystem? No, it sits in between shared buffers and the filesystem, effectively providing an additional layer of extremely large, compressed cache. Even on a single server there can be benefits over larger shared buffers due to the compression. That sounds cool, but I'm wondering if it's actually a performance speedup. Seems like it would only be a benefit for single-row lookups; any large reads would be a mess. Depends on the database and the workload - if you can fit your entire 100GB database in cache, and your workload is read intensive then the speedups are potentially huge (I've seen benchmarks showing 20x+). Write intensive workloads, less so, similarly if the working set is far larger than your cache size. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Explicit joins
Rishabh Kumar Jain rishabh.w...@yahoo.com wrote: I am getting data from 10 tables in a view. I don't know on which pair of tables I have to add explicit joins to improve performance. There's usually some fairly natural order in terms of understanding the request. I find it's often good to try to state in words what data I want to see (*not* how I think I could get that data, but describe logically which set of data I want), and list the tables in the order the appear in that description. The description will naturally tend to include or imply your join conditions and other selection criteria. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7
We have a generated query from our web application which takes far longer to complete in 9.0.4, than in 8.3.7 (60sec in 9.0.4 ~10sec in 8.3.7) The query plan generated in 9.0, includes a Materialize step which takes the bulk of the time for the query. If I disable materialize (by running set enable_material='off';) the query takes ~6 seconds to run (NOTE: the query plan chosen in this case is very similar to the plan chosen in 8.3.7). From reading http://rhaas.blogspot.com/2010/04/materialization-in-postgresql-90.html, I realize that the planner will be more aggressive in choosing Materialize in 9.0. Is there a way to modify the planner cost settings to minimize it's use in cases like this? Below, I have included the query in question and the explain analyze output for the case where enable_material is on. . NOTE: I will send explain analyze output for when enable_material is 'off' and information on the postgresql version, settings and server configuration in a follow up email due the length of email restrictions on the mailing list. (A vacuum analyze was run prior to running the queries for this email) Any help that you can provide would be greatly appreciated. Query SELECT * FROM ( SELECT Peter_SizeByType.RowId AS RowId, Peter_SizeByType.Name AS Name, Peter_SizeByType.Type AS Type, Peter_SizeByType.ZAve AS ZAve, Peter_SizeByType.TimeLabel AS TimeLabel, Peter_SizeByType.StorageTemperature AS StorageTemperature, Peter_SizeByType.Pdl AS Pdl, Peter_SizeByType.Cumulants AS Cumulants, Peter_SizeByType.AnalysisTool AS AnalysisTool, Peter_SizeByType.meanCountRate AS meanCountRate, Peter_SizeByType.ExtractionNumber AS ExtractionNumber, Peter_SizeByType.TestNumber AS TestNumber, Peter_SizeByType.Sort AS Sort FROM (SELECT PS_2.rowid AS RowId, F_3.Name AS Name, F_3.Type AS Type, PS_2.zave AS ZAve, PS_2.timelabel AS TimeLabel, PS_2.storagetemperature AS StorageTemperature, PS_2.pdl AS Pdl, PS_2.cumulants AS Cumulants, PS_2.analysistool AS AnalysisTool, PS_2.meancountrate AS meanCountRate, PS_2.extractionnumber AS ExtractionNumber, PS_2.testnumber AS TestNumber, T_4.sort AS Sort FROM (SELECT c69d129_particle_size_result_fields_5.analysistool AS analysistool, c69d129_particle_size_result_fields_5.cumulants AS cumulants, c69d129_particle_size_result_fields_5.extractionnumber AS extractionnumber, c69d129_particle_size_result_fields_5.pdl AS pdl, c69d129_particle_size_result_fields_5.rowid AS rowid, (SELECT RunId FROM exp.Data WHERE RowId = c69d129_particle_size_result_fields_5.DataId) AS Run, c69d129_particle_size_result_fields_5$Run$.container AS Run_Folder, c69d129_particle_size_result_fields_5$Run$Folder$.entityid AS Run_Folder_EntityId, c69d129_particle_size_result_fields_5$Run$.name AS Run_Name, c69d129_particle_size_result_fields_5.storagetemperature AS storagetemperature, c69d129_particle_size_result_fields_5.testnumber AS testnumber, c69d129_particle_size_result_fields_5.timelabel AS timelabel, c69d129_particle_size_result_fields_5.zave AS zave, c69d129_particle_size_result_fields_5.meancountrate AS meancountrate, c69d129_particle_size_result_fields_5.rowid AS rowid1 FROM (SELECT * FROM assayresult.c69d129_particle_size_result_fields WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN ('d938da12-1b43-102d-a8a2-78911b79dd1c' c69d129_particle_size_result_fields_5 LEFT OUTER JOIN (SELECT * FROM exp.experimentrun WHERE (((protocollsid LIKE 'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'))) AND (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c'))) c69d129_particle_size_result_fields_5$Run$ ON ((SELECT RunId FROM exp.Data WHERE RowId = c69d129_particle_size_result_fields_5.DataId) = c69d129_particle_size_result_fields_5$Run$.rowid) LEFT OUTER JOIN core.containers c69d129_particle_size_result_fields_5$Run$Folder$ ON (c69d129_particle_size_result_fields_5$Run$.container = c69d129_particle_size_result_fields_5$Run$Folder$.entityid)) PS_2 INNER JOIN (SELECT Formulations_6.container AS Folder, Formulations_6$Folder$.entityid AS Folder_EntityId, Formulations_6.name AS Name, Formulations_6.rowid AS RowId, CAST((SELECT StringValue FROM exp.ObjectProperty WHERE exp.ObjectProperty.PropertyId = 560 AND exp.ObjectProperty.ObjectId = Formulations_6$LSID$_C.ObjectId) AS VARCHAR(4000)) AS Type FROM (SELECT * FROM exp.material WHERE (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')) AND ((cpastype = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'))) Formulations_6 LEFT OUTER JOIN core.containers Formulations_6$Folder$ ON (Formulations_6.container = Formulations_6$Folder$.entityid) LEFT OUTER JOIN exp.object Formulations_6$LSID$_C ON (Formulations_6.lsid = Formulations_6$LSID$_C.objecturi) AND Formulations_6.Container = 'd938da12-1b43-102d-a8a2-78911b79dd1c') F_3 ON PS_2.Run_Name=F_3.Name || '.xls' AND
Re: [PERFORM] VX_CONCURRENT flag on vxfs( 5.1 or later) for performance for postgresql?
On Sat, Apr 30, 2011 at 4:51 AM, Hsien-Wen Chu chu.hsien@gmail.com wrote: since the block size is 8k for the default, and it consisted with many tuple/line; as my understand, if any tuple/line is changed(maybe update, insert, delete). the block will be marked as dirty block. and then it will be flashed to disk by bgwriter. True... so my question is if the data block(8k) is aligned with the file system block? if it is aligned with file system block, so what's the potential issue make it is not safe for direct io. (please assume vxfs, vxvm and the disk sector is aligned ).please correct me if any incorrect. It's not about safety - it's about performance. On a machine with 64GB of RAM, a typical setting for shared_buffers set to 8GB. If you start reading blocks into the PostgreSQL cache - or writing them out of the cache - in a way that bypasses the filesystem cache, you're going to have only 8GB of cache, instead of some much larger amount. More cache = better performance. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] ask the database engine tuning on the server
Didik Prasetyo prasetyodidi...@yahoo.com wrote: I had a problem with performance engine database, I use the server with the following specifications 1. its storage configuration? Storage SCSI 15K RAID 5 2. how his network? 2 gigabit bonding. 3. type / behavior of applications that connect to the db? Direct connects one segment. 4. use the machine? DB only 5. its os? * nix or windows? Linux 6. how large the existing data? growth? growth 100M / day I had a problem with memory is always exhausted, causing database stack or very slowly, I wanted to ask how to do the tuning postgresql.conf settings for the above case, or whether there is another solution I'm not entirely sure what problem you are seeing. If you post again, please show your version and configuration. An easy way to do this is to run the query on this page and pastte the output into your post: http://wiki.postgresql.org/wiki/Server_Configuration For general tuning advice you should read this page: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server If there is a particular query which is causing problems, please post detail related to that query. See this page: http://wiki.postgresql.org/wiki/SlowQueryQuestions If it is some other sort of problem, it helps to provide more detail and to copy and paste any error messages: http://wiki.postgresql.org/wiki/Guide_to_reporting_problems A couple general points which might apply: (1) RAID 5 is OK for reads, but is slow for random writes in heavy loads. This is mitigated somewhat by having a good RAID controller with a battery backed RAM cache configured for write-back. (2) You need to be running autovacuum and you need to avoid long-running transactions (including those which show as idle in transaction) to avoid bloat, which can cause your database to grow rapidly. I hope this helps, -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7
Here is the explain analyze output for when enable_material is 'off' and information on the postgresql version, settings and server configuration (I had to send a follow up email due the length of email restrictions on the mailing list.) (I apologize for the length of these email messages. And if this message does not get threaded properly as I did not receive a copy of my original message to mailing list) Explain Analyze output with enable_material='off' Limit (cost=231785.69..231785.94 rows=101 width=71) (actual time=6616.943..6617.130 rows=101 loops=1) - Sort (cost=231785.69..231785.95 rows=104 width=71) (actual time=6616.940..6617.003 rows=101 loops=1) Sort Key: material.name Sort Method: top-N heapsort Memory: 39kB - Nested Loop Left Join (cost=239.46..231782.21 rows=104 width=71) (actual time=3.840..6484.883 rows=67044 loops=1) Join Filter: ((material.container)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) - Hash Join (cost=239.46..230812.42 rows=104 width=155) (actual time=3.785..2676.643 rows=67044 loops=1) Hash Cond: ((experimentrun.name)::text = (( material.name)::text || '.xls'::text)) - Nested Loop (cost=3.27..230557.06 rows=93 width=129) (actual time=0.170..2502.106 rows=67044 loops=1) - Nested Loop (cost=0.00..229848.97 rows=93 width=137) (actual time=0.153..1368.990 rows=67044 loops=1) - Nested Loop (cost=0.00..2.33 rows=1 width=74) (actual time=0.016..0.025 rows=1 loops=1) - Seq Scan on containers c69d129_particle_size_result_fields_5$run$folder$ (cost=0.00..1.16 rows=1 width=37) (actual time=0.010..0.013 rows=1 loops=1) Filter: ((entityid)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) - Seq Scan on containers formulations_6$folder$ (cost=0.00..1.16 rows=1 width=37) (actual time=0.003..0.007 rows=1 loops=1) Filter: ((formulations_6$folder$.entityid)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) - Nested Loop (cost=0.00..229845.71 rows=93 width=63) (actual time=0.133..1244.558 rows=67044 loops=1) - Seq Scan on c69d129_particle_size_result_fields (cost=0.00..229742.02 rows=348 width=59) (actual time=0.018..572.402 rows=69654 loops=1) Filter: (((SubPlan 3))::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) SubPlan 3 - Index Scan using pk_data on data (cost=0.00..3.27 rows=1 width=37) (actual time=0.004..0.005 rows=1 loops=69654) Index Cond: (rowid = $2) - Index Scan using pk_indexvarchar on indexvarchar (cost=0.00..0.29 rows=1 width=10) (actual time=0.004..0.006 rows=1 loops=69654) Index Cond: ((indexvarchar.listid = 17) AND ((indexvarchar.key)::text = (c69d129_particle_size_result_fields.timelabel)::text)) - Index Scan using pk_experimentrun on experimentrun (cost=3.27..4.33 rows=1 width=74) (actual time=0.005..0.006 rows=1 loops=67044) Index Cond: (experimentrun.rowid = (SubPlan 4)) Filter: (((experimentrun.protocollsid)::text ~~ 'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'::text) AND ((experimentrun.container)::text = 'd938da12-1b43-102d-a8a2-7891 1b79dd1c'::text)) SubPlan 4 - Index Scan using pk_data on data (cost=0.00..3.27 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=67044) Index Cond: (rowid = $2) SubPlan 4 - Index Scan using pk_data on data (cost=0.00..3.27 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=67044) Index Cond: (rowid = $2) - Hash (cost=226.51..226.51 rows=774 width=96) (actual time=3.587..3.587 rows=1303 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 158kB - Seq Scan on material (cost=0.00..226.51 rows=774 width=96) (actual time=0.071..2.354 rows=1303 loops=1) Filter: (((container)::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) AND ((cpastype)::text = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'::text)) - Index Scan using uq_object on object
Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7
Hey Brian, Brian Connolly wrote: (I had to send a follow up email due the length of email restrictions on the mailing list.) A tip for when you have this problem in the future -- turn off html mail. It will reduce your email message length by 50% - 90%. HTH Bosco. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7
Brian Connolly bc...@labkey.com writes: Any help that you can provide would be greatly appreciated. I'd suggest trying to get rid of the weird little subselects, like this one: ... SELECT * FROM assayresult.c69d129_particle_size_result_fields WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN ('d938da12-1b43-102d-a8a2-78911b79dd1c'))) ... If you turned that into a regular join between c69d129_particle_size_result_fields and Data, the planner probably wouldn't be nearly as confused about how many rows would result. It's the way-off rowcount estimate for this construct that's causing most of the problem, AFAICS: - Seq Scan on c69d129_particle_size_result_fields (cost=0.00..229742.02 rows=348 width=59) (actual time=0.018..572.402 rows=69654 loops=1) Filter: (((SubPlan 3))::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) SubPlan 3 - Index Scan using pk_data on data (cost=0.00..3.27 rows=1 width=37) (actual time=0.004..0.005 rows=1 loops=69654) Index Cond: (rowid = $2) regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance