[PERFORM] How to read query plan
Hi all, I am new to PostgreSQL and query optimizations. We have recently moved our project from MySQL to PostgreSQL and we are having performance problem with one of our most often used queries. On MySQL the speed was sufficient but PostgreSQL chooses time expensive query plan. I would like to optimize it somehow but the query plan from EXPLAIN ANALYZE is little bit cryptic to me. So the first thing I would like is to understand the query plan. I have read performance tips and FAQ but it didn't move me too much further. I would appreciate if someone could help me to understand the query plan and what are the possible general options I can test. I think at this moment the most expensive part is the Sort. Am I right? If so, how could I generally avoid it (turning something on or off, using parentheses for JOINs etc.) to force some more efficient query plan? Thank you for any suggestions. QUERY PLAN Merge Right Join (cost=9868.84..9997.74 rows=6364 width=815) (actual time=9982.022..10801.216 rows=6364 loops=1) Merge Cond: (outer.idpk = inner.cadastralunitidfk) - Index Scan using cadastralunits_pkey on cadastralunits (cost=0.00..314.72 rows=13027 width=31) (actual time=0.457..0.552 rows=63 loops=1) - Sort (cost=9868.84..9884.75 rows=6364 width=788) (actual time=9981.405..10013.708 rows=6364 loops=1) Sort Key: addevicessites.cadastralunitidfk - Hash Left Join (cost=5615.03..7816.51 rows=6364 width=788) (actual time=3898.603..9884.248 rows=6364 loops=1) Hash Cond: (outer.addevicessitepartnerstickeridfk = inner.idpk) - Hash Left Join (cost=5612.27..7718.29 rows=6364 width=762) (actual time=3898.243..9104.791 rows=6364 loops=1) Hash Cond: (outer.addevicessitepartnermaintaineridfk = inner.idpk) - Hash Left Join (cost=5609.51..7620.06 rows=6364 width=736) (actual time=3897.996..8341.965 rows=6364 loops=1) Hash Cond: (outer.addevicessitepartnerelectricitysupplieridfk = inner.idpk) - Hash Left Join (cost=5606.74..7521.84 rows=6364 width=710) (actual time=3897.736..7572.182 rows=6364 loops=1) Hash Cond: (outer.addevicessitepartneridentificationoperatoridfk = inner.idpk) - Nested Loop Left Join (cost=5603.98..7423.62 rows=6364 width=684) (actual time=3897.436..6821.713 rows=6364 loops=1) Join Filter: (outer.addevicessitestatustypeidfk = inner.idpk) - Nested Loop Left Join (cost=5602.93..6706.61 rows=6364 width=657) (actual time=3897.294..6038.976 rows=6364 loops=1) Join Filter: (outer.addevicessitepositionidfk = inner.idpk) - Nested Loop Left Join (cost=5601.89..6276.01 rows=6364 width=634) (actual time=3897.158..5303.575 rows=6364 loops=1) Join Filter: (outer.addevicessitevisibilityidfk = inner.idpk) - Merge Right Join (cost=5600.85..5702.21 rows=6364 width=602) (actual time=3896.963..4583.749 rows=6364 loops=1) Merge Cond: (outer.idpk = inner.addevicessitesizeidfk) - Index Scan using addevicessitesizes_pkey on addevicessitesizes (cost=0.00..5.62 rows=110 width=14) (actual time=0.059..0.492 rows=110 loops=1) - Sort (cost=5600.85..5616.76 rows=6364 width=592) (actual time=3896.754..3915.022 rows=6364 loops=1) Sort Key: addevicessites.addevicessitesizeidfk - Hash Left Join (cost=2546.59..4066.81 rows=6364 width=592) (actual time=646.162..3792.310 rows=6364 loops=1) Hash Cond: (outer.addevicessitedistrictidfk = inner.idpk) - Hash Left Join (cost=2539.29..3964.05 rows=6364 width=579) (actual time=645.296..3142.128 rows=6364 loops=1) Hash Cond: (outer.addevicessitestreetdescriptionidfk = inner.idpk) - Hash Left Join (cost=2389.98..2724.64 rows=6364 width=544) (actual time=632.806..2466.030 rows=6364 
loops=1) Hash Cond: (outer.addevicessitestreetidfk = inner.idpk) - Hash Left Join (cost=2324.25..2515.72 rows=6364 width=518) (actual time=626.081..1822.137 rows=6364 loops=1)
Re: [PERFORM] How to read query plan
Miroslav Šulc wrote: Hi all, I am new to PostgreSQL and query optimizations. We have recently moved our project from MySQL to PostgreSQL and we are having performance problem with one of our most often used queries. On MySQL the speed was sufficient but PostgreSQL chooses time expensive query plan. I would like to optimize it somehow but the query plan from EXPLAIN ANALYZE is little bit cryptic to me. So the first thing I would like is to understand the query plan. I have read performance tips and FAQ but it didn't move me too much further. I would appreciate if someone could help me to understand the query plan and what are the possible general options I can test. I think at this moment the most expensive part is the Sort. Am I right? If so, how could I generally avoid it (turning something on or off, using parentheses for JOINs etc.) to force some more efficient query plan? Thank you for any suggestions.

You really need to post the original query, so we can see *why* postgres thinks it needs to run the plan this way. Also, the final sort actually isn't that expensive. In the numbers (cost=xxx..yyy), xxx is the estimated cost before the step can return its first row (the startup cost) and yyy is the estimated cost for the step to finish (the total cost); the actual time=xxx..yyy pair reports the same two points in milliseconds. A lot of steps can start running while their sub-steps are still feeding back more data; others have to wait for the sub-steps to finish. The first thing to look for is to make sure the estimated number of rows is close to the actual number of rows. If they are off, then postgres may be mis-estimating the optimal plan. (If postgres thinks it is only going to need 10 rows, it may use an index scan, but when 1000 rows are returned, a seq scan might have been faster.) You seem to be doing a lot of outer joins. Is that necessary? I don't really know what you are looking for, but you are joining against enough tables that I think this query is always going to be slow. From what I can tell, you have 1 table which has 6364 rows, and you are grabbing all of those rows and then outer joining it with about 11 other tables. I would actually guess that the most expensive parts of the plan are the NESTED LOOPs, which have to do a sequential scan of the inner table each time, and they get executed 6364 times. It looks like the other tables are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan; the problem is that because you are doing it 6k times, it ends up taking about 300 ms of your time. You could try setting enable_nestloop to off. I don't know that it will be faster, but it could be. In general, though, it seems like you should be asking a different question, rather than trying to optimize the query that you have. Can you post the original SQL statement, and maybe describe what you are trying to do? John =:-
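The experiment John suggests can be tried per session, with no config file changes; a minimal sketch (the table here is just a stand-in for the real query, which is posted later in the thread):

    SET enable_nestloop = off;                       -- discourage nested-loop joins for this session
    EXPLAIN ANALYZE SELECT * FROM addevicessites;    -- stand-in; substitute the real query
    SET enable_nestloop = on;                        -- restore the default afterwards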
Re: [PERFORM] How to read query plan
On Sun, 2005-03-13 at 16:32 +0100, Miroslav Šulc wrote: Hi all, I am new to PostgreSQL and query optimizations. We have recently moved our project from MySQL to PostgreSQL and we are having performance problem with one of our most often used queries. On MySQL the speed was sufficient but PostgreSQL chooses time expensive query plan. I would like to optimize it somehow but the query plan from EXPLAIN ANALYZE is little bit cryptic to me. [snip output of EXPLAIN ANALYZE]

for those of us who have not yet reached the level where one can infer it from the query plan, how about showing us the actual query too?

but as an example of what to look for, consider the first few lines (reformatted):

Merge Right Join (cost=9868.84..9997.74 rows=6364 width=815) (actual time=9982.022..10801.216 rows=6364 loops=1)
  Merge Cond: ("outer".idpk = "inner".cadastralunitidfk)
  -> Index Scan using cadastralunits_pkey on cadastralunits (cost=0.00..314.72 rows=13027 width=31) (actual time=0.457..0.552 rows=63 loops=1)
  -> Sort (cost=9868.84..9884.75 rows=6364 width=788) (actual time=9981.405..10013.708 rows=6364 loops=1)

notice that the index scan is expected to return 13027 rows, but actually returns 63. this might influence the choice of plan.

gnari
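One common cause of such estimate/actual gaps is stale planner statistics (here the gap can also simply come from the merge join stopping early), and checking costs little; a small sketch against the table in question:

    SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'cadastralunits';
    ANALYZE cadastralunits;    -- refresh the statistics the planner uses
    -- then re-run EXPLAIN ANALYZE and compare the estimated rows= with the actual rows=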
Re: [PERFORM] How to read query plan
Hi John, thank you for your response. John Arbash Meinel wrote: You really need to post the original query, so we can see *why* postgres thinks it needs to run the plan this way. Here it is: SELECT AdDevicesSites.IDPK, AdDevicesSites.AdDevicesSiteSizeIDFK, AdDevicesSites.AdDevicesSiteRegionIDFK, AdDevicesSites.AdDevicesSiteCountyIDFK, AdDevicesSites.AdDevicesSiteCityIDFK, AdDevicesSites.AdDevicesSiteDistrictIDFK, AdDevicesSites.AdDevicesSiteStreetIDFK, AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK, AdDevicesSites.AdDevicesSitePositionIDFK, AdDevicesSites.AdDevicesSiteVisibilityIDFK, AdDevicesSites.AdDevicesSiteStatusTypeIDFK, AdDevicesSites.AdDevicesSitePartnerIdentificationOperatorIDFK, AdDevicesSites.AdDevicesSitePartnerElectricitySupplierIDFK, AdDevicesSites.AdDevicesSitePartnerMaintainerIDFK, AdDevicesSites.AdDevicesSitePartnerStickerIDFK, AdDevicesSites.CadastralUnitIDFK, AdDevicesSites.MediaType, AdDevicesSites.Mark, AdDevicesSites.Amount, AdDevicesSites.Distance, AdDevicesSites.OwnLightening, AdDevicesSites.LocationDownTown, AdDevicesSites.LocationSuburb, AdDevicesSites.LocationBusinessDistrict, AdDevicesSites.LocationResidentialDistrict, AdDevicesSites.LocationIndustrialDistrict, AdDevicesSites.LocationNoBuildings, AdDevicesSites.ParkWayHighWay, AdDevicesSites.ParkWayFirstClassRoad, AdDevicesSites.ParkWayOtherRoad, AdDevicesSites.ParkWayStreet, AdDevicesSites.ParkWayAccess, AdDevicesSites.ParkWayExit, AdDevicesSites.ParkWayParkingPlace, AdDevicesSites.ParkWayPassangersOnly, AdDevicesSites.ParkWayCrossRoad, AdDevicesSites.PositionStandAlone, AdDevicesSites.NeighbourhoodPublicTransportation, AdDevicesSites.NeighbourhoodInterCityTransportation, AdDevicesSites.NeighbourhoodPostOffice, AdDevicesSites.NeighbourhoodNewsStand, AdDevicesSites.NeighbourhoodAmenities, AdDevicesSites.NeighbourhoodSportsSpot, AdDevicesSites.NeighbourhoodHealthServiceSpot, AdDevicesSites.NeighbourhoodShops, AdDevicesSites.NeighbourhoodShoppingCenter, AdDevicesSites.NeighbourhoodSuperMarket, AdDevicesSites.NeighbourhoodPetrolStation, AdDevicesSites.NeighbourhoodSchool, AdDevicesSites.NeighbourhoodBank, AdDevicesSites.NeighbourhoodRestaurant, AdDevicesSites.NeighbourhoodHotel, AdDevicesSites.RestrictionCigarettes, AdDevicesSites.RestrictionPolitics, AdDevicesSites.RestrictionSpirits, AdDevicesSites.RestrictionSex, AdDevicesSites.RestrictionOther, AdDevicesSites.RestrictionNote, AdDevicesSites.SpotMapFile, AdDevicesSites.SpotPhotoFile, AdDevicesSites.SourcePhotoTimeStamp, AdDevicesSites.SourceMapTimeStamp, AdDevicesSites.Price, AdDevicesSites.WebPrice, AdDevicesSites.CadastralUnitCode, AdDevicesSites.BuildingNumber, AdDevicesSites.ParcelNumber, AdDevicesSites.GPSLatitude, AdDevicesSites.GPSLongitude, AdDevicesSites.GPSHeight, AdDevicesSites.MechanicalOpticalCoordinates, AdDevicesSites.Deleted, AdDevicesSites.Protected, AdDevicesSites.DateCreated, AdDevicesSites.DateLastModified, AdDevicesSites.DateDeleted, AdDevicesSites.CreatedByUserIDFK, AdDevicesSites.LastModifiedByUserIDFK, AdDevicesSites.DeletedByUserIDFK, AdDevicesSites.PhotoLastModificationDate, AdDevicesSites.MapLastModificationDate, AdDevicesSites.DateLastImported, AdDevicesSiteRegions.Name AS AdDevicesSiteRegionName, AdDevicesSiteCounties.Name AS AdDevicesSiteCountyName, AdDevicesSiteCities.Name AS AdDevicesSiteCityName, AdDevicesSiteStreets.Name AS AdDevicesSiteStreetName, AdDevicesSiteDistricts.Name AS AdDevicesSiteDistrictName, AdDevicesSiteStreetDescriptions.Name_cs AS AdDevicesSiteStreetDescriptionName_cs, AdDevicesSiteStreetDescriptions.Name_en 
AS AdDevicesSiteStreetDescriptionName_en, AdDevicesSiteSizes.Name AS AdDevicesSiteSizeName, SUBSTRING(AdDevicesSiteVisibilities.Name_cs, 3) AS AdDevicesSiteVisibilityName_cs, SUBSTRING(AdDevicesSiteVisibilities.Name_en, 3) AS AdDevicesSiteVisibilityName_en, AdDevicesSitePositions.Name_cs AS AdDevicesSitePositionName_cs, AdDevicesSitePositions.Name_en AS AdDevicesSitePositionName_en, AdDevicesSiteStatusTypes.Name_cs AS AdDevicesSiteStatusTypeName_cs, AdDevicesSiteStatusTypes.Name_en AS AdDevicesSiteStatusTypeName_en, PartnerIdentificationsOperator.Name AS PartnerIdentificationOperatorName, PartnersElectricitySupplier.Name AS PartnerElectricitySupplierName, PartnersMaintainer.Name AS PartnerMaintainerName, PartnersSticker.Name AS PartnerStickerName, CadastralUnits.Code AS CadastralUnitCodeNative, CadastralUnits.Name AS CadastralUnitName FROM AdDevicesSites LEFT JOIN AdDevicesSiteRegions ON AdDevicesSites.AdDevicesSiteRegionIDFK = AdDevicesSiteRegions.IDPK LEFT JOIN AdDevicesSiteCounties ON AdDevicesSites.AdDevicesSiteCountyIDFK = AdDevicesSiteCounties.IDPK LEFT JOIN AdDevicesSiteCities ON AdDevicesSites.AdDevicesSiteCityIDFK = AdDevicesSiteCities.IDPK LEFT JOIN AdDevicesSiteStreets ON AdDevicesSites.AdDevicesSiteStreetIDFK = AdDevicesSiteStreets.IDPK LEFT JOIN AdDevicesSiteStreetDescriptions ON AdDevicesSites.AdDevicesSiteStreetDescriptionIDFK
Re: [PERFORM] One tuple per transaction
-Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Sunday, March 13, 2005 12:05 AM To: Tambet Matiisen Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] One tuple per transaction Tambet, In one of our applications we have a database function, which recalculates COGS (cost of good sold) for certain period. This involves deleting bunch of rows from one table, inserting them again in correct order and updating them one-by-one (sometimes one row twice) to reflect current state. The problem is, that this generates an enormous amount of tuples in that table. Sounds like you have an application design problem ... how about re-writing your function so it's a little more sensible? I agree, that I have violated the no 1 rule of transactions - don't make the transaction last too long. But imagine a situation, where a table is updated twice in transaction. Why? Perhaps programmer felt, that the code is more modular in this way. Now if you have tons of those transactions, the I/O throughput is twice as big as it could be, because every transaction creates two tuples instead of one. One tuple per transaction could allow the programmer to keep his modular code and benefit from the increased performance. Tambet ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] How to read query plan
Hi Ragnar,

Ragnar Hafstað wrote: [snip output of EXPLAIN ANALYZE] for those of us who have not yet reached the level where one can infer it from the query plan, how about showing us the actual query too?

I thought it would be sufficient to show where the main bottleneck is. And in fact, the query is rather lengthy, but I have included it in the response to John. So sorry for the incompleteness.

but as an example of what to look for, consider the first few lines (reformatted): Merge Right Join (cost=9868.84..9997.74 rows=6364 width=815) (actual time=9982.022..10801.216 rows=6364 loops=1) Merge Cond: ("outer".idpk = "inner".cadastralunitidfk) -> Index Scan using cadastralunits_pkey on cadastralunits (cost=0.00..314.72 rows=13027 width=31) (actual time=0.457..0.552 rows=63 loops=1) -> Sort (cost=9868.84..9884.75 rows=6364 width=788) (actual time=9981.405..10013.708 rows=6364 loops=1) notice that the index scan is expected to return 13027 rows, but actually returns 63. this might influence the choice of plan.

Yes, the situation in this scenario is that the CadastralUnits table contains all units from the country, but the AdDevices in this case come from only 63 CadastralUnits. So the result - 63 rows - is just this little subset. On top of that, not all AdDevices have CadastralUnitIDFK set to an IDPK that exists in CadastralUnits; some have it set to zero (= no CadastralUnit set).

Miroslav Šulc
Re: [PERFORM] How to read query plan
Miroslav Šulc wrote: Hi John, thank you for your response. How about a quick side track: have you played around with your shared_buffers, maintenance_work_mem, and work_mem settings? What version of postgres are you using? The above names changed in 8.0, and 8.0 also has some performance improvements over the 7.4 series. What is your hardware? Are you testing this while there is load on the system, or under no load? Are you re-running the query multiple times and reporting the later speeds, or just the first time? (If nothing is loaded into memory, the first run is easily 10x slower than later ones.) Just some background info. If you have set these to reasonable values, we probably don't need to spend much time here, but it's just one of those things to check. John =:-
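A quick way to answer these questions from psql, rather than digging through postgresql.conf (these are the 8.0 parameter names; before 8.0, work_mem was called sort_mem):

    SELECT version();
    SHOW shared_buffers;
    SHOW work_mem;
    SHOW maintenance_work_mem;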
Re: [PERFORM] How to read query plan
Miroslav ulc wrote: Hi John, thank you for your response. I will comment on things separately. John Arbash Meinel wrote: ... These external tables contain information that are a unique parameter of the AdDevice (like Position, Region, County, City etc.), in some containing localized description of the property attribute. Some of them could be moved into the main table but that would create a redundancy, some of them cannot be moved into the main table (like information about Partners which is definitely another object with respect to AdDevices). I think the names of the tables are self-explanatory so it should be clear what each table stores. Is this design incorrect? It's actually more of a question as to why you are doing left outer joins, rather than simple joins. Are the tables not fully populated? If so, why not? How are you using this information? Why is it useful to get back rows that don't have all of their information filled out? Why is it useful to have so many columns returned? It seems like it most cases, you are only going to be able to use *some* of the information, why not create more queries that are specialized, rather than one get everything query. In fact, we only need about 30 records at a time but LIMIT can speed-up the query only when looking for the first 30 records. Setting OFFSET slows the query down. Have you thought about using a cursor instead of using limit + offset? This may not help the overall time, but it might let you split up when the time is spent. BEGIN; DECLARE cursor_name CURSOR FOR SELECT ... FROM ...; FETCH FORWARD 30 FROM cursor_name; FETCH FORWARD 30 FROM cursor_name; ... END; I don't really know what you are looking for, but you are joining against enough tables, that I think this query is always going to be slow. In MySQL the query was not so slow and I don't see any reason why there should be large differences in SELECT speed. But if the design of the tables is incorrect, we will correct it. In the other post I asked about your postgres settings. The defaults are pretty stingy, so that *might* be an issue. From what I can tell, you have 1 table which has 6364 rows, and you are grabbing all of those rows, and then outer joining it with about 11 other tables. Here are the exact numbers: AdDevicesSites - 6364 AdDevicesSiteRegions - 15 AdDevicesSiteCounties - 110 AdDevicesSiteCities - 124 AdDevicesSiteStreets - 2858 AdDevicesSiteStreetDescriptions - 4585 AdDevicesSiteDistricts - 344 AdDevicesSiteSizes - 110 AdDevicesSiteVisibilities - 4 AdDevicesSitePositions - 3 AdDevicesSiteStatusTypes - 5 PartnerIdentifications - 61 Partners - 61 CadastralUnits - 13027 And if I understand correctly, you consider all of these to be outer joins. Meaning you want *all* of AdDevicesSites, and whatever info goes along with it, but there are no restrictions as to what rows you want. You want everything you can get. Do you actually need *everything*? You mention only needing 30, what for? I would actually guess that the most expensive parts of the plan are the NESTED LOOPS which when they go to materialize have to do a sequential scan, and they get executed 6364 times. It looks like the other tables are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan, the problem is that because you are doing it 6k times, it ends up taking about 300ms of your time. You could try setting set enable_nestloop to off. I don't know that it will be faster, but it could be. 
I have tried that and it resulted in about 2 sec slowdown :-( Generally, the optimizer *does* select the best query plan. As long as it has accurate statistics, which it seems to in this case. In general, though, it seems like you should be asking a different question, rather than trying to optimize the query that you have. You mean how should I improve the design to make the query faster? There is one possibility if we don't find anything nicer. Which is to create a lazy materialized view. Basically, you run this query, and store it in a table. Then when you want to do the SELECT, you just do that against the unrolled table. You can then create triggers, etc to keep the data up to date. Here is a good documentation of it: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html It is basically a way that you can un-normalize data, in a safe way. Also, another thing that you can do, is instead of using a cursor, you can create a temporary table with the results of the query, and create a primary key which is just a simple counter. Then instead of doing limit + offset, you can select * where id 0 and id 30; ... select * where id 30 and id 60; etc. It still requires the original query to be run, though, so it is not necessarily optimal for you. Can you post the original SQL statement, and maybe describe what you are trying to do? I hope the explanation above is clear and sufficient :-) John =:- Unfortunately, I don't really see any obvious problems with your query in the way that you are using it.
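A minimal sketch of the counter-keyed temporary table John describes; the table and column list are illustrative, the inner SELECT stands in for the big join, and it assumes the serial counter is assigned in the order the rows come out of the ORDER BY:

    CREATE TEMP TABLE page_cache (rn serial, idpk integer, mark varchar);
    INSERT INTO page_cache (idpk, mark)
        SELECT idpk, mark FROM addevicessites ORDER BY idpk;          -- stand-in for the real query
    SELECT * FROM page_cache WHERE rn > 0  AND rn <= 30 ORDER BY rn;  -- page 1
    SELECT * FROM page_cache WHERE rn > 30 AND rn <= 60 ORDER BY rn;  -- page 2
    DROP TABLE page_cache;

Being a temp table it is private to the session and vanishes when the session ends, which fits the per-user paging scenario, but as noted the expensive query still has to run once to fill it.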
Re: [PERFORM] How to read query plan
John Arbash Meinel wrote: How about a quick side track. Have you played around with your shared_buffers, maintenance_work_mem, and work_mem settings? I have tried to set shared_buffers to 48000 now but no speedup (11,098.813 ms third try). The others are still default. I'll see documentation and will play with the other parameters. What version of postgres are you using? 8.0.1 The above names changed in 8.0, and 8.0 also has some perfomance improvements over the 7.4 series. What is your hardware? My dev notebook Acer TravelMate 292LMi $ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 9 model name : Intel(R) Pentium(R) M processor 1500MHz stepping: 5 cpu MHz : 1495.485 cache size : 1024 KB fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe est tm2 bogomips: 2957.31 $ cat /proc/meminfo MemTotal: 516136 kB MemFree: 18024 kB Buffers: 21156 kB Cached: 188868 kB SwapCached: 24 kB Active: 345596 kB Inactive: 119344 kB HighTotal: 0 kB HighFree:0 kB LowTotal: 516136 kB LowFree: 18024 kB SwapTotal: 1004020 kB SwapFree: 1003996 kB Dirty: 4 kB Writeback: 0 kB Mapped: 343676 kB Slab:18148 kB CommitLimit: 1262088 kB Committed_AS: 951536 kB PageTables: 2376 kB VmallocTotal: 516056 kB VmallocUsed: 90528 kB VmallocChunk: 424912 kB IDE disc. # hdparm -Tt /dev/hda /dev/hda: Timing cached reads: 1740 MB in 2.00 seconds = 870.13 MB/sec Timing buffered disk reads: 40 MB in 3.30 seconds = 12.10 MB/sec Are you testing this while there is load on the system, or under no load. The load is low. This is few seconds after I have run the EXPLAIN ANALYZE. # cat /proc/loadavg 0.31 0.51 0.33 1/112 6909 Are you re-running the query multiple times, and reporting the later speeds, or just the first time? (If nothing is loaded into memory, the first run is easily 10x slower than later ones.) The times changes only little. First run was about 13 sec, second about 10 sec, third about 11 sec etc. Just some background info. If you have set these to reasonable values, we probably don't need to spend much time here, but it's just one of those things to check. Sure you are right. I'll try the other parameters. John =:- Miroslav begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika email;internet:[EMAIL PROTECTED] title:CEO tel;work:+420 257 225 602 tel;cell:+420 603 711 413 x-mozilla-html:TRUE url:http://www.startnet.cz version:2.1 end:vcard ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to read query plan
John Arbash Meinel wrote: It's actually more of a question as to why you are doing left outer joins, rather than simple joins. Are the tables not fully populated? If so, why not? Some records do not consist of full information (they are collected from different sources which use different approach to the data collection) so using INNER JOIN would cause some records wouldn't be displayed which is unacceptable. How are you using this information? Why is it useful to get back rows that don't have all of their information filled out? Each row contains main information which are important. The other information are also important but may be missing. Information are display on lists of 30 rows or on a card. When using filter the query is much faster but the case without filter has these results. Why is it useful to have so many columns returned? It seems like it most cases, you are only going to be able to use *some* of the information, why not create more queries that are specialized, rather than one get everything query. Many of the columns are just varchar(1) (because of the migration from MySQL enum field type) so the record is not so long as it could seem. These fields are just switches (Y(es) or N(o)). The problem is users can define their own templates and in different scenarios there might be displayed different information so reducing the number of fields would mean in some cases it wouldn't work as expected. But if we couldn't speed the query up, we will try to improve it other way. Is there any serious reason not to use so much fields except memory usage? It seems to me that it shouldn't have a great impact on the speed in this case. Have you thought about using a cursor instead of using limit + offset? This may not help the overall time, but it might let you split up when the time is spent. .. No. I come from MySQL world where these things are not common (at least when using MyISAM databases). The other reason (if I understand it well) is that the retrieval of the packages of 30 records is not sequential. Our app is web based and we use paging. User can select page 1 and then page 10, then go backward to page 9 etc. And if I understand correctly, you consider all of these to be outer joins. Meaning you want *all* of AdDevicesSites, and whatever info goes along with it, but there are no restrictions as to what rows you want. You want everything you can get. Do you actually need *everything*? You mention only needing 30, what for? For display of single page consisting of 30 rows. The reason I query all rows is that this is one of the filters users can use. User can display just bigboards or billboards (or specify more advanced filters) but he/she can also display AdDevices without any filter (page by page). Before I select the 30 row, I need to order them by a key and after that select the records, so this is also the reason why to ask for all rows. The key for sorting might be different for each run. There is one possibility if we don't find anything nicer. Which is to create a lazy materialized view. Basically, you run this query, and store it in a table. Then when you want to do the SELECT, you just do that against the unrolled table. You can then create triggers, etc to keep the data up to date. Here is a good documentation of it: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html It is basically a way that you can un-normalize data, in a safe way. 
Also, another thing that you can do, is instead of using a cursor, you can create a temporary table with the results of the query, and create a primary key which is just a simple counter. Then instead of doing limit + offset, you can select * where id 0 and id 30; ... select * where id 30 and id 60; etc. It still requires the original query to be run, though, so it is not necessarily optimal for you. These might be the other steps in case we cannot speed-up the query. I would prefer to speed the query up :-) Unfortunately, I don't really see any obvious problems with your query in the way that you are using it. The problem is that you are not applying any selectivity, so postgres has to go to all the tables, and get all the rows, and then try to logically merge them together. It is doing a hash merge, which is generally one of the faster ones and it seems to be doing the right thing. I would be curious to see how mysql was handling this query, to see if there was something different it was trying to do. I'm also curious how much of a difference there was. In fact, on MySQL I didn't see any slow reactions so I didn't measure and inspect it. But I can try it if I figure out how to copy the database from PostgreSQL to MySQL. John =:- Thank you for your time and help. Miroslav begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika
Re: [PERFORM] How to read query plan
John Arbash Meinel [EMAIL PROTECTED] writes: How about a quick side track. Have you played around with your shared_buffers, maintenance_work_mem, and work_mem settings? Indeed. The hash joins seem unreasonably slow considering how little data they are processing (unless this is being run on some ancient toaster...). One thought that comes to mind is that work_mem may be set so small that the hashes are forced into multiple batches. Another question worth asking is what are the data types of the columns being joined on. If they are character types, what locale and encoding is the database using? Are you re-running the query multiple times, and reporting the later speeds, or just the first time? (If nothing is loaded into memory, the first run is easily 10x slower than later ones.) That cost would be paid during the bottom-level scans though. The thing that strikes me here is that nearly all of the cost is being spent joining. What version of postgres are you using? And what's the platform (hardware and OS)? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
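Tom's questions can be answered from a psql session; a small sketch (the catalog query is generic, the table name is the main fact table from the thread):

    SHOW lc_collate;          -- locale used for string comparisons
    SHOW server_encoding;
    SELECT attname, format_type(atttypid, atttypmod) AS data_type
    FROM pg_attribute
    WHERE attrelid = 'addevicessites'::regclass AND attnum > 0 AND NOT attisdropped;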
Re: [PERFORM] How to read query plan
Miroslav ulc wrote: John Arbash Meinel wrote: ... Many of the columns are just varchar(1) (because of the migration from MySQL enum field type) so the record is not so long as it could seem. These fields are just switches (Y(es) or N(o)). The problem is users can define their own templates and in different scenarios there might be displayed different information so reducing the number of fields would mean in some cases it wouldn't work as expected. But if we couldn't speed the query up, we will try to improve it other way. Is there any serious reason not to use so much fields except memory usage? It seems to me that it shouldn't have a great impact on the speed in this case. Is there a reason to use varchar(1) instead of char(1). There probably is 0 performance difference, I'm just curious. Have you thought about using a cursor instead of using limit + offset? This may not help the overall time, but it might let you split up when the time is spent. .. No. I come from MySQL world where these things are not common (at least when using MyISAM databases). The other reason (if I understand it well) is that the retrieval of the packages of 30 records is not sequential. Our app is web based and we use paging. User can select page 1 and then page 10, then go backward to page 9 etc. Well, with cursors you can also do FETCH ABSOLUTE 1 FROM cursor_name, which sets the cursor position, and then you can FETCH FORWARD 30. I honestly don't know how the performance will be, but it is something that you could try. And if I understand correctly, you consider all of these to be outer joins. Meaning you want *all* of AdDevicesSites, and whatever info goes along with it, but there are no restrictions as to what rows you want. You want everything you can get. Do you actually need *everything*? You mention only needing 30, what for? For display of single page consisting of 30 rows. The reason I query all rows is that this is one of the filters users can use. User can display just bigboards or billboards (or specify more advanced filters) but he/she can also display AdDevices without any filter (page by page). Before I select the 30 row, I need to order them by a key and after that select the records, so this is also the reason why to ask for all rows. The key for sorting might be different for each run. How are you caching the information in the background in order to support paging? Since you aren't using limit/offset, and you don't seem to be creating a temporary table, I assume you have a layer inbetween the web server and the database (or possibly inside the webserver) which keeps track of current session information. Is that true? These might be the other steps in case we cannot speed-up the query. I would prefer to speed the query up :-) Naturally fast query comes first. I just have the feeling it is either a postgres configuration problem, or an intrinsic problem to postgres. Given your constraints, there's not much that we can change about the query itself. In fact, on MySQL I didn't see any slow reactions so I didn't measure and inspect it. But I can try it if I figure out how to copy the database from PostgreSQL to MySQL. I figured you still had a copy of the MySQL around to compare to. You probably don't need to spend too much time on it yet. John =:- signature.asc Description: OpenPGP digital signature
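A sketch of the cursor paging John outlines, for a 30-row page size; the SELECT is again a stand-in for the full query, and the cursor has to live inside a transaction unless it is declared WITH HOLD:

    BEGIN;
    DECLARE site_cur SCROLL CURSOR FOR
        SELECT * FROM addevicessites ORDER BY idpk;   -- stand-in for the big join
    MOVE ABSOLUTE 270 IN site_cur;    -- position at the end of page 9 without returning rows
    FETCH FORWARD 30 FROM site_cur;   -- rows 271-300, i.e. page 10
    COMMIT;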
Re: [PERFORM] How to read query plan
Tom Lane wrote: John Arbash Meinel [EMAIL PROTECTED] writes: How about a quick side track. Have you played around with your shared_buffers, maintenance_work_mem, and work_mem settings? Indeed. The hash joins seem unreasonably slow considering how little data they are processing (unless this is being run on some ancient toaster...). One thought that comes to mind is that work_mem may be set so small that the hashes are forced into multiple batches. I've just tried to uncomment the settings for these parameters with with no impact on the query speed. shared_buffers = 48000 # min 16, at least max_connections*2, 8KB each work_mem = 1024 # min 64, size in KB maintenance_work_mem = 16384# min 1024, size in KB max_stack_depth = 2048 # min 100, size in KB Another question worth asking is what are the data types of the columns being joined on. If they are character types, what locale and encoding is the database using? I have checked this and there are some JOINs smallint against integer. Is that problem? I would use smallint for IDPKs of some smaller tables but the lack of SMALLSERIAL and my laziness made me use SERIAL instead which is integer. That cost would be paid during the bottom-level scans though. The thing that strikes me here is that nearly all of the cost is being spent joining. What version of postgres are you using? And what's the platform (hardware and OS)? I've already posted the hardware info. OS is Linux (Gentoo) with kernel 2.6.11. regards, tom lane Miroslav begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika email;internet:[EMAIL PROTECTED] title:CEO tel;work:+420 257 225 602 tel;cell:+420 603 711 413 x-mozilla-html:TRUE url:http://www.startnet.cz version:2.1 end:vcard ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] How to read query plan
John Arbash Meinel wrote: Is there a reason to use varchar(1) instead of char(1). There probably is 0 performance difference, I'm just curious. No, not at all. I'm just not used to char(). Well, with cursors you can also do FETCH ABSOLUTE 1 FROM cursor_name, which sets the cursor position, and then you can FETCH FORWARD 30. I honestly don't know how the performance will be, but it is something that you could try. This is science for me at this moment :-) For display of single page consisting of 30 rows. The reason I query all rows is that this is one of the filters users can use. User can display just bigboards or billboards (or specify more advanced filters) but he/she can also display AdDevices without any filter (page by page). Before I select the 30 row, I need to order them by a key and after that select the records, so this is also the reason why to ask for all rows. The key for sorting might be different for each run. How are you caching the information in the background in order to support paging? Since you aren't using limit/offset, and you don't seem to be creating a temporary table, I assume you have a layer inbetween the web server and the database (or possibly inside the webserver) which keeps track of current session information. Is that true? I just need three information: 1) used filter (stored in session, identified by filter index in query string) 2) page length (static predefined) 3) what page to display (in query string) In fact, on MySQL I didn't see any slow reactions so I didn't measure and inspect it. But I can try it if I figure out how to copy the database from PostgreSQL to MySQL. I figured you still had a copy of the MySQL around to compare to. You probably don't need to spend too much time on it yet. It's not so simple because there are some differences between MySQL and PostgreSQL in how they handle case sensitivity etc. The database table structures are not the same too because of different data types support and data values support. John =:- Miroslav begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika email;internet:[EMAIL PROTECTED] title:CEO tel;work:+420 257 225 602 tel;cell:+420 603 711 413 x-mozilla-html:TRUE url:http://www.startnet.cz version:2.1 end:vcard ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to read query plan
=?windows-1250?Q?Miroslav_=8Aulc?= [EMAIL PROTECTED] writes: I've just tried to uncomment the settings for these parameters with with no impact on the query speed. shared_buffers = 48000 # min 16, at least max_connections*2, 8KB each work_mem = 1024 # min 64, size in KB maintenance_work_mem = 16384# min 1024, size in KB max_stack_depth = 2048 # min 100, size in KB Hmm. Given the small size of the auxiliary tables, you'd think they'd fit in 1MB work_mem no problem. But try bumping work_mem up to 10MB just to see if it makes a difference. (BTW, you do know that altering the .conf file doesn't in itself do anything? You have to SIGHUP the postmaster to make it notice the change ... and for certain parameters such as shared_buffers, you actually have to stop and restart the postmaster. You can use the SHOW command to verify whether a change has taken effect.) I have checked this and there are some JOINs smallint against integer. Is that problem? That probably explains why some of the joins are merges instead of hashes --- hash join doesn't work across datatypes. Doesn't seem like it should be a huge problem though. I was more concerned about the possibility of slow locale-dependent string comparisons. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
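The 10MB experiment can also be done for a single session, which avoids editing postgresql.conf and signalling the postmaster at all; a sketch (the value is in KB on 8.0):

    SET work_mem = 10240;   -- 10 MB, for this session only
    SHOW work_mem;          -- verify the change took effect
    -- now re-run EXPLAIN ANALYZE on the problem query in the same session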
Re: [PERFORM] How to read query plan
Tom Lane wrote: =?windows-1250?Q?Miroslav_=8Aulc?= [EMAIL PROTECTED] writes: shared_buffers = 48000 # min 16, at least max_connections*2, 8KB each work_mem = 1024 # min 64, size in KB maintenance_work_mem = 16384# min 1024, size in KB max_stack_depth = 2048 # min 100, size in KB Hmm. Given the small size of the auxiliary tables, you'd think they'd fit in 1MB work_mem no problem. But try bumping work_mem up to 10MB just to see if it makes a difference. (BTW, you do know that altering the .conf file doesn't in itself do anything? You have to SIGHUP the postmaster to make it notice the change ... and for certain parameters such as shared_buffers, you actually have to stop and restart the postmaster. You can use the SHOW command to verify whether a change has taken effect.) I've tried to set work_mem to 10240, restarted postmaster and tried the EXPLAIN ANALYZE but there is only cca 200 ms speedup. I have checked this and there are some JOINs smallint against integer. Is that problem? That probably explains why some of the joins are merges instead of hashes --- hash join doesn't work across datatypes. Doesn't seem like it should be a huge problem though. I was more concerned about the possibility of slow locale-dependent string comparisons. There are only JOINs number against number. I've tried to change one of the fields from smallint to integer but there was no speedup. regards, tom lane Miroslav begin:vcard fn;quoted-printable:Miroslav =C5=A0ulc n;quoted-printable:=C5=A0ulc;Miroslav org:StartNet s.r.o. adr;quoted-printable;quoted-printable:;;Vrchlick=C3=A9ho 161/5;Praha 5;;150 00;=C4=8Cesk=C3=A1 republika email;internet:[EMAIL PROTECTED] title:CEO tel;work:+420 257 225 602 tel;cell:+420 603 711 413 x-mozilla-html:TRUE url:http://www.startnet.cz version:2.1 end:vcard ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] How to read query plan
=?windows-1250?Q?Miroslav_=8Aulc?= [EMAIL PROTECTED] writes: Is the data confidential? If you'd be willing to send me a pg_dump off-list, I'd like to replicate this test and try to see where the time is going. Thank you very much for your offer. The data are partially confidental so I hashed some of the text information and changed some values (not the values for the JOINs) so I could send it to you. I've checked the EXPLAIN ANALYZE if anything changed and the result is merely the same (maybe cca 1 sec slower - maybe because the hash caused the text data to be longer). No problem; thank you for supplying the test case. What I find is rather surprising: most of the runtime can be blamed on disassembling and reassembling tuples during the join steps. Here are the hot spots according to gprof: --- 1.277.388277/103737 ExecScan [16] 2.93 17.02 19092/103737 ExecNestLoop cycle 2 [14] 3.91 22.70 25456/103737 ExecMergeJoin cycle 2 [13] 7.81 45.40 50912/103737 ExecHashJoin cycle 2 [12] [9] 86.3 15.92 92.50 103737 ExecProject [9] 7.65 76.45 8809835/9143692 ExecEvalVar [10] 3.424.57 103737/103775 heap_formtuple [17] 0.030.24 12726/143737 ExecMakeFunctionResultNoSets [24] 0.020.12 103737/290777 ExecStoreTuple [44] 0.010.00 2/2 ExecEvalFunc [372] 0.000.00 2/22 ExecMakeFunctionResult [166] --- 0.000.00 42/9143692 ExecEvalFuncArgs [555] 0.050.51 59067/9143692 ExecHashGetHashValue [32] 0.242.38 274748/9143692 ExecMakeFunctionResultNoSets [24] 7.65 76.45 8809835/9143692 ExecProject [9] [10]69.57.94 79.34 9143692 ExecEvalVar [10] 79.340.00 8750101/9175517 nocachegetattr [11] --- I think the reason this is popping to the top of the runtime is that the joins are so wide (an average of ~85 columns in a join tuple according to the numbers above). Because there are lots of variable-width columns involved, most of the time the fast path for field access doesn't apply and we end up going to nocachegetattr --- which itself is going to be slow because it has to scan over so many columns. So the cost is roughly O(N^2) in the number of columns. As a short-term hack, you might be able to improve matters if you can reorder your LEFT JOINs to have the minimum number of columns propagating up from the earlier join steps. In other words make the later joins add more columns than the earlier, as much as you can. This is actually good news, because before 8.0 we had much worse problems than this with extremely wide tuples --- there were O(N^2) behaviors all over the place due to the old List algorithms. Neil Conway's rewrite of the List code got rid of those problems, and now we can see the places that are left to optimize. The fact that there seems to be just one is very nice indeed. Since ExecProject operations within a nest of joins are going to be dealing entirely with Vars, I wonder if we couldn't speed matters up by having a short-circuit case for a projection that is only Vars. Essentially it would be a lot like execJunk.c, except able to cope with two input tuples. Using heap_deformtuple instead of retail extraction of fields would eliminate the O(N^2) penalty for wide tuples. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
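A compressed sketch of the reordering Tom suggests, using a few of the tables from the posted query: keep the narrow lookup tables early and push the wide, text-heavy tables (street descriptions, cadastral units) as late as possible, so fewer columns are carried through the early join steps. The column list is shortened here; the real query selects far more. Since 8.0 keeps explicit outer joins in the order they are written, the change is purely a matter of reordering the LEFT JOIN clauses.

    SELECT s.IDPK, v.Name_cs AS visibility, d.Name_cs AS street_description, c.Name AS cadastral_unit
    FROM AdDevicesSites s
    LEFT JOIN AdDevicesSiteVisibilities v ON s.AdDevicesSiteVisibilityIDFK = v.IDPK                -- 4 rows, narrow: join early
    LEFT JOIN AdDevicesSitePositions p ON s.AdDevicesSitePositionIDFK = p.IDPK                     -- 3 rows, narrow: join early
    LEFT JOIN AdDevicesSiteStreetDescriptions d ON s.AdDevicesSiteStreetDescriptionIDFK = d.IDPK   -- wide text table: join late
    LEFT JOIN CadastralUnits c ON s.CadastralUnitIDFK = c.IDPK;                                    -- wide text table: join late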
Re: [PERFORM] cpu_tuple_cost
I have forgotten this:

CREATE OR REPLACE FUNCTION date_to_yearmonth_dec(TIMESTAMP) RETURNS VARCHAR AS '
BEGIN
    RETURN extract(year FROM $1) || extract(month FROM $1) - 1;
END' LANGUAGE plpgsql IMMUTABLE;
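Presumably the function is declared IMMUTABLE so it can be used in an expression index; a sketch of how that would look, where the table and column names are assumptions for illustration and not from the original post:

    -- assumes a hypothetical orders table with a timestamp column order_date
    CREATE INDEX orders_yearmonth_idx ON orders (date_to_yearmonth_dec(order_date));
    -- queries repeating the same expression can then use the index:
    SELECT count(*) FROM orders WHERE date_to_yearmonth_dec(order_date) = '20052';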
Re: [PERFORM] Postgres on RAID5
Hi Arshavir Grigorian,

0. If possible, move to 8.0.1 - the bgwriter will help you.
1. Create a RAID1 for the redo (WAL) and place the drives on a separate SCSI channel.
2. Update postgresql.conf:

shared_buffers = 1-5
work_mem = 10-30
maintenance_work_mem = 10-30
max_fsm_pages = 150
max_fsm_relations = 16000
wal_buffers = 32
checkpoint_segments = 32 # 16MB each !!
checkpoint_timeout = 600
checkpoint_warning = 60
effective_cache_size = 128000
random_page_cost = 3
default_statistics_target = 100
log_min_error_statement = warning
log_min_duration_statement = 1000 # for logging long SQL

3. If possible, migrate from RAID5 to RAID10.
4. Add (if needed) 2 new drives for the OS and use ALL 20x9GB drives for DB storage.
5. Remove the CDROM from the working configuration and start using this SCSI channel.

Best regards, Alexander Kirpa
[PERFORM] Postgres on RAID5
Hi, I have a RAID5 array (mdadm) with 14 disks + 1 spare. This partition has an Ext3 filesystem which is used by Postgres. Currently we are loading a 50G database on this server from a Postgres dump (copy, not insert) and are experiencing very slow write performance (35 records per second). Top shows that the Postgres process (postmaster) is being constantly put into D state for extended periods of time (2-3 seconds) which I assume is because it's waiting for disk io. I have just started gathering system statistics and here is what sar -b shows: (this is while the db is being loaded - pg_restore) tpsrtps wtps bread/s bwrtn/s 01:35:01 PM275.77 76.12199.66709.59 2315.23 01:45:01 PM287.25 75.56211.69706.52 2413.06 01:55:01 PM281.73 76.35205.37711.84 2389.86 02:05:01 PM282.83 76.14206.69720.85 2418.51 02:15:01 PM284.07 76.15207.92707.38 2443.60 02:25:01 PM265.46 75.91189.55708.87 2089.21 02:35:01 PM285.21 76.02209.19709.58 2446.46 Average: 280.33 76.04204.30710.66 2359.47 This is a Sun e450 with dual TI UltraSparc II processors and 2G of RAM. It is currently running Debian Sarge with a 2.4.27-sparc64-smp custom compiled kernel. Postgres is installed from the Debian package and uses all the configuration defaults. I am also copying the pgsql-performance list. Thanks in advance for any advice/pointers. Arshavir Following is some other info that might be helpful. /proc/scsi# mdadm -D /dev/md1 /dev/md1: Version : 00.90.00 Creation Time : Wed Feb 23 17:23:41 2005 Raid Level : raid5 Array Size : 123823616 (118.09 GiB 126.80 GB) Device Size : 8844544 (8.43 GiB 9.06 GB) Raid Devices : 15 Total Devices : 17 Preferred Minor : 1 Persistence : Superblock is persistent Update Time : Thu Feb 24 10:05:38 2005 State : active Active Devices : 15 Working Devices : 16 Failed Devices : 1 Spare Devices : 1 Layout : left-symmetric Chunk Size : 64K UUID : 81ae2c97:06fa4f4d:87bfc6c9:2ee516df Events : 0.8 Number Major Minor RaidDevice State 0 8 640 active sync /dev/sde 1 8 801 active sync /dev/sdf 2 8 962 active sync /dev/sdg 3 8 1123 active sync /dev/sdh 4 8 1284 active sync /dev/sdi 5 8 1445 active sync /dev/sdj 6 8 1606 active sync /dev/sdk 7 8 1767 active sync /dev/sdl 8 8 1928 active sync /dev/sdm 9 8 2089 active sync /dev/sdn 10 8 224 10 active sync /dev/sdo 11 8 240 11 active sync /dev/sdp 12 650 12 active sync /dev/sdq 13 65 16 13 active sync /dev/sdr 14 65 32 14 active sync /dev/sds 15 65 48 15 spare /dev/sdt # dumpe2fs -h /dev/md1 dumpe2fs 1.35 (28-Feb-2004) Filesystem volume name: none Last mounted on: not available Filesystem UUID: 1bb95bd6-94c7-4344-adf2-8414cadae6fc Filesystem magic number: 0xEF53 Filesystem revision #:1 (dynamic) Filesystem features: has_journal dir_index needs_recovery large_file Default mount options:(none) Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 15482880 Block count: 30955904 Reserved block count: 1547795 Free blocks: 28767226 Free inodes: 15482502 First block: 0 Block size: 4096 Fragment size:4096 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 16384 Inode blocks per group: 512 Filesystem created: Wed Feb 23 17:27:13 2005 Last mount time: Wed Feb 23 17:45:25 2005 Last write time: Wed Feb 23 17:45:25 2005 Mount count: 2 Maximum mount count: 28 Last checked: Wed Feb 23 17:27:13 2005 Check interval: 15552000 (6 months) Next check after: Mon Aug 22 18:27:13 2005 Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 128 Journal inode:8 Default directory hash: tea Directory 
Hash Seed: c35c0226-3b52-4dad-b102-f22feb773592 Journal backup: inode blocks # lspci | grep SCSI :00:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14) :00:03.1 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14) :00:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875
Re: [PERFORM] adding 'limit' leads to very slow query
I continue to be stumped by this. You are right that I should have listed the estimates provided by explain... basically for the select where bar = 41, it's estimating there will be 40,000 rows instead of 7, out of what's actuallly 5 million records in the table. So far I've tried increase statistics for the bar column from the default 10 to 100 (vacuum analyzing after) and the explain-plan hasn't changed. I also notice that afterward, the pg_stats record for the bar column still only lists the top 5 values of bar (out of 68 unique values in the table). Are there any other settings I could try to improve the detail of the statistics? By the way, I think I do have a workaround for this particular query: select * from (select * from foo where barId = 412 order by id desc) as tempview limit 25; This query uses the bar index and completes instantly. However, I feel like I should find the heart of the problem, since bad statistics could end up affecting other plans, right? - Mike On Mon, 7 Mar 2005 23:03:43 -0800 (PST), Stephan Szabo [EMAIL PROTECTED] wrote: On Mon, 7 Mar 2005, Michael McFarland wrote: I'm trying to understand why a particular query is slow, and it seems like the optimizer is choosing a strange plan. See this summary: * I have a large table, with an index on the primary key 'id' and on a field 'foo'. select count(*) from foo; 1,000,000 select count(*) from foo where bar = 41; 7 * This query happens very quickly. explain select * from foo where barId = 412 order by id desc; Sort () Sort key= id - Index scan using bar_index on foo () Index cond: barId = 412 But this query takes forever explain select * from foo where barId = 412 order by id desc limit 25; Limit () - Index scan backward using primarykey_index Filter: barID = 412 You didn't show the row estimates, but I'd guess that it's expecting either that ther are more rows that match barId=412 than there actually are (which may be solvable by raising the statistics target on the column and re-analyzing) such that going backwards on id in order to make 25 matching rows isn't a bad plan or that barId and id are correlated which is unfortunately not going to be recognized right now. -- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
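For reference, the statistics-target change mentioned above is done per column; a sketch using the thread's illustrative table and column names:

    ALTER TABLE foo ALTER COLUMN bar SET STATISTICS 100;
    ANALYZE foo;    -- or VACUUM ANALYZE foo;
    -- check what the planner now tracks for the column:
    SELECT n_distinct, most_common_vals FROM pg_stats
    WHERE tablename = 'foo' AND attname = 'bar';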
Re: [PERFORM] One tuple per transaction
Tambet Matiisen [EMAIL PROTECTED] writes Hi! In one of our applications we have a database function, which recalculates COGS (cost of good sold) for certain period. This involves deleting bunch of rows from one table, inserting them again in correct order and updating them one-by-one (sometimes one row twice) to reflect current state. The problem is, that this generates an enormous amount of tuples in that table. If I'm correct, the dead tuples must be scanned also during table and index scan, so a lot of dead tuples slows down queries considerably, especially when the table doesn't fit into shared buffers any more. And as I'm in transaction, I can't VACUUM to get rid of those tuples. In one occasion the page count for a table went from 400 to 22000 at the end. Not exactly. The dead tuple in the index will be scanned the first time (and its pointed heap tuple as well), then we will mark it dead, then next time we came here, we will know that the index tuple actually points to a uesless tuple, so we will not scan its pointed heap tuple. All this made me wonder, why is new tuple created after every update? One tuple per transaction should be enough, because you always commit or rollback transaction as whole. And my observations seem to indicate, that new index tuple is created after column update even if this column is not indexed. This is one cost of MVCC. A good thing of MVCC is there is no conflict between read and write - maybe some applications need this. A reference could be found here: http://www.postgresql.org/docs/8.0/static/storage-page-layout.html#HEAPTUPLEHEADERDATA-TABLE One tuple per transaction would save a loads of I/O bandwidth, so I believe there must be a reason why it isn't implemented as such. Or were my assumptions wrong, that dead tuples must be read from disk? Tambet ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
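A small, purely illustrative way to see the effect being discussed: each UPDATE leaves behind a dead row version that only VACUUM can reclaim, and VACUUM VERBOSE reports how many such versions have piled up. The table and values here are made up for the demonstration.

    CREATE TABLE cogs_test (id integer PRIMARY KEY, amount numeric);
    INSERT INTO cogs_test VALUES (1, 100);
    UPDATE cogs_test SET amount = 110 WHERE id = 1;   -- first dead row version left behind
    UPDATE cogs_test SET amount = 120 WHERE id = 1;   -- second dead row version
    VACUUM VERBOSE cogs_test;                         -- reports the row versions removed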
[PERFORM] more execution time
why this query needs more time? Its very slow thx //QUERY select coalesce(personaldetails.masterid::numeric,personaldetails.id) + (coalesce(personaldetails.id::numeric,0)/100) as sorting, floor(coalesce(personaldetails.masterid::numeric,personaldetails.id) + (coalesce(personaldetails.id::numeric,0)/100)) as ppid, personaldetails.id as pid, personaldetails.masterid, coalesce(personaldetails.prefix,'') || '' || coalesce(personaldetails.firstname,' ') || ' ' || coalesce(personaldetails.lastname,) as fullname, personaldetails.regtypeid, personaldetails.regdate, personaldetails.regprice, coalesce(regtypes.regtype,' ') || ' ' || coalesce(regtypes.subregtype,' ') as regtypetitle, regtypes.regtype, regtypes.subregtype, regtypedates.title, balance('MASTER-REGISTRATION',personaldetails.id) as balance, coalesce(pd2.prefix,' ') || ' ' || coalesce(pd2.firstname,' ') || ' ' || coalesce(pd2.lastname,' ') as accfullname, coalesce(rt2.regtype,) || ' ' || coalesce(rt2.subregtype,' ') as accregtypetitle, pd2.id as accid, pd2.regtypeid as accregtypeid, pd2.regdate as accregdate, pd2.regprice as accregprice, rt2.regtype as accregtype, rt2.subregtype as accsubregtype, rd2.title as acctitle, balance('MASTER-REGISTRATION',pd2.id) as accbalance, case when coalesce(balance('REGISTRATION',personaldetails.id),0)=0 then 1 else 0 end as balancestatus from personaldetails left outer join regtypes on regtypes.id=personaldetails.regtypeid left outer join regtypedates on regtypes.dateid=regtypedates.id left outer join personaldetails pd2 on personaldetails.id=pd2.masterid left outer join regtypes rt2 on rt2.id=pd2.regtypeid left outer join regtypedates rd2 on rt2.dateid=rd2.id where personaldetails.masterid is null / RESULT STATISTICS Total query runtime: 348892 ms. Data retrieval runtime: 311 ms. 763 rows retrieved. EXPLAIN QUERY Hash Left Join (cost=109.32..109.95 rows=5 width=434) Hash Cond: (outer.dateid = inner.id) - Merge Left Join (cost=108.27..108.46 rows=5 width=409) Merge Cond: (outer.regtypeid = inner.id) - Sort (cost=106.19..106.20 rows=5 width=347) Sort Key: pd2.regtypeid - Hash Left Join (cost=90.11..106.13 rows=5 width=347) Hash Cond: (outer.id = inner.masterid) - Hash Left Join (cost=45.49..45.71 rows=5 width=219) Hash Cond: (outer.dateid = inner.id) - Merge Left Join (cost=44.44..44.63 rows=5 width=194) Merge Cond: (outer.regtypeid = inner.id) - Sort (cost=42.36..42.37 rows=5 width=132) Sort Key: personaldetails.regtypeid - Seq Scan on personaldetails (cost=0.00..42.30 rows=5 width=132) Filter: (masterid IS NULL) - Sort (cost=2.08..2.16 rows=31 width=66) Sort Key: regtypes.id - Seq Scan on regtypes (cost=0.00..1.31 rows=31 width=66) - Hash (cost=1.04..1.04 rows=4 width=33) - Seq Scan on regtypedates (cost=0.00..1.04 rows=4 width=33) - Hash (cost=42.30..42.30 rows=930 width=132) - Seq Scan on personaldetails pd2 (cost=0.00..42.30 rows=930 width=132) - Sort (cost=2.08..2.16 rows=31 width=66) Sort Key: rt2.id - Seq Scan on regtypes rt2 (cost=0.00..1.31 rows=31 width=66) - Hash (cost=1.04..1.04 rows=4 width=33) - Seq Scan on regtypedates rd2 (cost=0.00..1.04 rows=4 width=33) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgres on RAID5
Arshavir Grigorian [EMAIL PROTECTED] writes: Hi, I have a RAID5 array (mdadm) with 14 disks + 1 spare. This partition has an Ext3 filesystem which is used by Postgres. People are going to suggest moving to RAID1+0. I'm unconvinced that RAID5 across 14 drivers shouldn't be able to keep up with RAID1 across 7 drives though. It would be interesting to see empirical data. One thing that does scare me is the Postgres transaction log and the ext3 journal both sharing these disks with the data. Ideally both of these things should get (mirrored) disks of their own separate from the data files. But 2-3s pauses seem disturbing. I wonder whether ext3 is issuing a cache flush on every fsync to get the journal pushed out. This is a new linux feature that's necessary with ide but shouldn't be necessary with scsi. It would be interesting to know whether postgres performs differently with fsync=off. This would even be a reasonable mode to run under for initial database loads. It shouldn't make much of a difference with hardware like this though. And you should be aware that running under this mode in production would put your data at risk. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
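The fsync experiment Greg describes does not require rebuilding anything; a sketch (as he notes, only appropriate for a throwaway initial load, never for data you cannot afford to lose):

    SHOW fsync;   -- what the running server currently uses
    -- in postgresql.conf set:  fsync = false
    -- then reload (pg_ctl reload) or restart the postmaster and re-run the pg_restore;
    -- switch back to fsync = true before the database holds anything important.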
Re: [HACKERS] [PERFORM] How to read query plan
I wrote: Since ExecProject operations within a nest of joins are going to be dealing entirely with Vars, I wonder if we couldn't speed matters up by having a short-circuit case for a projection that is only Vars. Essentially it would be a lot like execJunk.c, except able to cope with two input tuples. Using heap_deformtuple instead of retail extraction of fields would eliminate the O(N^2) penalty for wide tuples. Actually, we already had a pending patch (from Atsushi Ogawa) that eliminates that particular O(N^2) behavior in another way. After applying it, I get about a factor-of-4 reduction in the runtime for Miroslav's example. ExecEvalVar and associated routines are still a pretty good fraction of the runtime, so it might still be worth doing something like the above, but it'd probably be just a marginal win instead of a big win. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] cpu_tuple_cost
Daniel Schuchardt [EMAIL PROTECTED] writes: I have a query plan which is bad with the standard cpu_tuple_cost and good if I raise cpu_tuple_cost. Is it good practice to raise it if I want to force postgres to use indexes more often?

Reducing random_page_cost is usually the best way to get the planner to favor indexscans more.

regards, tom lane
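Tom's suggestion can be tried per session before committing it to postgresql.conf; a sketch:

    SET random_page_cost = 2;   -- default is 4; lower values make index scans look cheaper to the planner
    SHOW random_page_cost;
    -- re-run EXPLAIN ANALYZE on the query in question and compare the plans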
[PERFORM] column name is LIMIT
Guys, I am having a problem firing queries on one of the tables, which has "limit" as a column name. If I run an insert/select/update command on that table I get the error below.

ERROR: syntax error at or near "limit" at character 71

Any help to solve the problem would be really great. postgresql 7.4.5 and linux OS -- Best, Gourish Singbal
Re: [PERFORM] column name is LIMIT
On Mon, 14 Mar 2005 06:14 pm, Gourish Singbal wrote: Guys, I am having a problem firing queries on one of the tables, which has "limit" as a column name. If I run an insert/select/update command on that table I get the error below. ERROR: syntax error at or near "limit" at character 71

select "limit" from limit_table WHERE "limit" > 50 LIMIT 2;

You need to quote the field name, and make sure the case is correct.

Any help to solve the problem would be really great. postgresql 7.4.5 and linux OS

You should probably upgrade to 7.4.7.

Regards, Russell Smith
Re: [PERFORM] column name is LIMIT
Put double quotes ("") around the column name, e.g.:

insert into "limit" values (1, 2, 3);

Chris

Gourish Singbal wrote: Guys, I am having a problem firing queries on one of the tables, which has "limit" as a column name. If I run an insert/select/update command on that table I get the error below. ERROR: syntax error at or near "limit" at character 71 Any help to solve the problem would be really great. postgresql 7.4.5 and linux OS
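For completeness, the same quoting works in all three statement types mentioned; the table name follows Russell's example and the values are illustrative:

    SELECT "limit" FROM limit_table WHERE "limit" > 50;
    UPDATE limit_table SET "limit" = 10 WHERE "limit" = 5;
    INSERT INTO limit_table ("limit") VALUES (42);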