[PERFORM] Wrong plan: sequential scan instead of an index one
Hi all,
take a look at these plans:

test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike '%pi%');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=154279.01..154279.01 rows=1 width=8) (actual time=4010.094..4010.096 rows=1 loops=1)
   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8) (actual time=2.908..4001.814 rows=7801 loops=1)
         Hash Cond: ("outer".pvcp = "inner".id)
         ->  Seq Scan on t_oa_2_00_card  (cost=0.00..147670.82 rows=877682 width=12) (actual time=0.030..2904.522 rows=877682 loops=1)
         ->  Hash  (cost=2.17..2.17 rows=19 width=4) (actual time=0.093..0.093 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
                     Filter: (value ~~* '%pi%'::text)
 Total runtime: 4010.413 ms
(8 rows)

test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike 'pi');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=93540.82..93540.83 rows=1 width=8) (actual time=55.333..55.334 rows=1 loops=1)
   ->  Nested Loop  (cost=84.60..93447.44 rows=37348 width=8) (actual time=2.730..46.770 rows=7801 loops=1)
         ->  HashAggregate  (cost=2.18..2.22 rows=4 width=4) (actual time=0.089..0.092 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=4 width=4) (actual time=0.065..0.081 rows=1 loops=1)
                     Filter: (value ~~* 'pi'::text)
         ->  Bitmap Heap Scan on t_oa_2_00_card  (cost=82.42..23216.95 rows=11548 width=12) (actual time=2.633..29.566 rows=7801 loops=1)
               Recheck Cond: (t_oa_2_00_card.pvcp = "outer".id)
               ->  Bitmap Index Scan on i3_t_oa_2_00_card  (cost=0.00..82.42 rows=11548 width=0) (actual time=2.050..2.050 rows=7801 loops=1)
                     Index Cond: (t_oa_2_00_card.pvcp = "outer".id)
 Total runtime: 55.454 ms
(10 rows)

Isn't it too much to choose a sequential scan due to 19 estimated rows, when with 4 estimated rows it does a correct index scan?

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
> Hi all,
> take a look at those plans:

Try changing random_page_cost from the default 4 to 2 in postgresql.conf:

random_page_cost = 2

The default in PostgreSQL is somewhat conservative. This setting tells
PostgreSQL how fast your disks are; the lower the value, the faster random
reads are assumed to be.

Could this setting be changed to 2 as the default rather than 4?

regards
Claus
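For example, the setting can be tried for a single session before touching
postgresql.conf (a quick sketch, reusing the query from the original post):

    SET random_page_cost = 2;
    EXPLAIN ANALYZE
    SELECT COUNT(id) FROM t_oa_2_00_card
    WHERE pvcp IN (SELECT id FROM l_pvcp WHERE value ILIKE '%pi%');
    RESET random_page_cost;  -- back to the server-wide setting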
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Hi all, take a look at those plans:
>
> test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE
> pvcp in (select id from l_pvcp where value ilike '%pi%');
>
>   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8)
>       (actual time=2.908..4001.814 rows=7801 loops=1)
>         Hash Cond: ("outer".pvcp = "inner".id)
>
> Isn't it too much to choose a sequential scan due to 19 estimated rows,
> when with 4 estimated rows it does a correct index scan?

I don't think it's the matches on l_pvcp that's the problem, it's the
fact that it thinks it's getting 177404 rows matching the IN.

Now, why 19 rows from the subquery should produce such a large estimate
in the outer query I'm not sure. Any strange distribution of values on
pvcp?

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Claus Guttesen wrote:
>> Hi all,
>> take a look at those plans:
>
> Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
>
> random_page_cost = 2
>
> The default in PostgreSQL is somewhat conservative. This setting tells
> PostgreSQL how fast your disks are; the lower the value, the faster
> random reads are assumed to be.
>
> Could this setting be changed to 2 as the default rather than 4?

I have already tuned that number to 2.5; lowering it to 2 doesn't change
the plan.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Richard Huxton wrote:
> Gaetano Mendola wrote:
>> Hi all, take a look at those plans:
>>
>> test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE
>> pvcp in (select id from l_pvcp where value ilike '%pi%');
>>
>>   ->  Hash IN Join  (cost=2.22..153835.49 rows=177404 width=8)
>>       (actual time=2.908..4001.814 rows=7801 loops=1)
>>         Hash Cond: ("outer".pvcp = "inner".id)
>>
>> Isn't it too much to choose a sequential scan due to 19 estimated rows,
>> when with 4 estimated rows it does a correct index scan?
>
> I don't think it's the matches on l_pvcp that's the problem, it's the
> fact that it thinks it's getting 177404 rows matching the IN.
>
> Now, why 19 rows from the subquery should produce such a large estimate
> in the outer query I'm not sure. Any strange distribution of values on
> pvcp?

I don't know what you mean by strange; this is the distribution:

test=# select count(*) from t_oa_2_00_card;
 count
--------
 877682
(1 row)

test=# select count(*), pvcp from t_oa_2_00_card group by pvcp;
 count | pvcp
-------+------
    13 |
     2 |   94
    57 |   93
   250 |   90
  8158 |   89
  4535 |   88
  3170 |   87
 13711 |   86
  5442 |   85
  2058 |   84
    44 |   83
     1 |   82
     4 |   80
     1 |   79
 14851 |   78
 12149 |   77
   149 |   76
     9 |   75
     4 |   74
     2 |   73
     5 |   72
 28856 |   71
 12847 |   70
  8183 |   69
 11246 |   68
  9232 |   67
 14433 |   66
 13970 |   65
  3616 |   64
  2996 |   63
  7801 |   62
  3329 |   61
   949 |   60
 35168 |   59
 18752 |   58
  1719 |   57
  1031 |   56
  1585 |   55
  2125 |   54
  9007 |   53
 22060 |   52
  2800 |   51
  5629 |   50
 16970 |   49
  8254 |   48
 11448 |   47
 20253 |   46
  3637 |   45
 13876 |   44
 19002 |   43
 17940 |   42
  5022 |   41
 24478 |   40
  2374 |   39
  4885 |   38
  3779 |   37
  3532 |   36
 11783 |   35
 15843 |   34
 14546 |   33
 29171 |   32
  5048 |   31
 13411 |   30
  6746 |   29
   375 |   28
  9244 |   27
 10577 |   26
 36096 |   25
  3827 |   24
 29497 |   23
 20362 |   22
  8068 |   21
  2936 |   20
   661 |   19
  8224 |   18
  3016 |   17
  7731 |   16
  8792 |   15
  4486 |   14
     3 |   13
  6859 |   12
  4576 |   11
 13377 |   10
 14578 |    9
  6991 |    8
 52714 |    7
  6477 |    6
 11445 |    5
 24690 |    4
 10522 |    3
  2917 |    2
 34694 |    1
(92 rows)

I think that estimate is something like 877682 / 92 * 19 (about 181,000,
which is close to the planner's 177404).

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Richard Huxton wrote:
>> Now, why 19 rows from the subquery should produce such a large estimate
>> in the outer query I'm not sure. Any strange distribution of values on
>> pvcp?
>
> I don't know what you mean by strange; this is the distribution:
>
> test=# select count(*) from t_oa_2_00_card;
>  count
> --------
>  877682
> (1 row)
>
> test=# select count(*), pvcp from t_oa_2_00_card group by pvcp;
>  count | pvcp
> -------+------
> (92 rows)
>
> I think that estimate is something like 877682 / 92 * 19

So if you actually had 19 matches for '%pi%' it might be a sensible plan
then. I'm afraid I don't know of any way to improve PG's prediction on
how many matches you'll get for a substring pattern, though.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
I don't know about Postgres, but in Oracle it could be better to write:

SELECT COUNT(distinct c.id)
FROM t_oa_2_00_card c, l_pvcp l
WHERE l.value ilike '%pi%'
  and c.pvcp = l.id;

or

SELECT COUNT(c.id)
FROM t_oa_2_00_card c,
     (select distinct id from l_pvcp where value ilike '%pi%') l
WHERE c.pvcp = l.id;

depending on how many rows, and what kind of rows, are in the l_pvcp
table. Having an index on t_oa_2_00_card.pvcp can slow queries in Oracle.

Ismo

On Fri, 30 Mar 2007, Gaetano Mendola wrote:

> Richard Huxton wrote:
> > Now, why 19 rows from the subquery should produce such a large
> > estimate in the outer query I'm not sure. Any strange distribution
> > of values on pvcp?
>
> I don't know what you mean by strange; this is the distribution:
>
> [92-row distribution table snipped]
>
> I think that estimate is something like 877682 / 92 * 19
>
> Regards
> Gaetano Mendola
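In PostgreSQL, a semantically similar rewrite uses EXISTS; whether the
planner does any better with it is not guaranteed, so this is only a
sketch reusing the table names from this thread:

    SELECT COUNT(c.id)
    FROM t_oa_2_00_card c
    WHERE EXISTS (SELECT 1
                  FROM l_pvcp l
                  WHERE l.id = c.pvcp
                    AND l.value ILIKE '%pi%');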
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
On Fri, Mar 30, 2007 at 12:08:26PM +0200, Gaetano Mendola wrote:
> Claus Guttesen wrote:
> > Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
> >
> > random_page_cost = 2
>
> I have already tuned that number to 2.5; lowering it to 2 doesn't change
> the plan.

The following 19-fold overestimate is influencing the rest of the plan:

  ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
        Filter: (value ~~* '%pi%'::text)

Have you tried increasing the statistics target on l_pvcp.value?
I ran your queries against canned data in 8.2.3 and better statistics
resulted in more accurate row count estimates for this and other
parts of the plan. I don't recall if estimates for non-leading-character
matches in earlier versions can benefit from better statistics.

--
Michael Fuhr
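For a quick experiment the target can also be raised session-wide instead
of per-column (a sketch; 100 is an arbitrary trial value, and a per-column
target set with ALTER TABLE would override it):

    SET default_statistics_target = 100;
    ANALYZE l_pvcp;
    EXPLAIN ANALYZE SELECT id FROM l_pvcp WHERE value ILIKE '%pi%';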
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
> Have you tried increasing the statistics target on l_pvcp.value?
> I ran your queries against canned data in 8.2.3 and better statistics
> resulted in more accurate row count estimates for this and other
> parts of the plan. I don't recall if estimates for non-leading-character
> matches in earlier versions can benefit from better statistics.

This might work only in 8.2. I see the following in the Release Notes:

* Improve the optimizer's selectivity estimates for LIKE, ILIKE,
  and regular expression operations (Tom)

--
Michael Fuhr
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Michael Fuhr wrote:
> On Fri, Mar 30, 2007 at 12:08:26PM +0200, Gaetano Mendola wrote:
>> Claus Guttesen wrote:
>>> Try changing random_page_cost from the default 4 to 2 in postgresql.conf:
>>>
>>> random_page_cost = 2
>>
>> I have already tuned that number to 2.5; lowering it to 2 doesn't change
>> the plan.
>
> The following 19-fold overestimate is influencing the rest of the plan:
>
>   ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.066..0.081 rows=1 loops=1)
>         Filter: (value ~~* '%pi%'::text)
>
> Have you tried increasing the statistics target on l_pvcp.value?
> I ran your queries against canned data in 8.2.3 and better statistics
> resulted in more accurate row count estimates for this and other
> parts of the plan. I don't recall if estimates for non-leading-character
> matches in earlier versions can benefit from better statistics.

test=# alter table l_pvcp alter column value set statistics 1000;
ALTER TABLE
test=# analyze l_pvcp;
ANALYZE
test=# explain analyze SELECT COUNT(id) FROM t_oa_2_00_card WHERE pvcp in (select id from l_pvcp where value ilike '%pi%');
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=154321.83..154321.84 rows=1 width=8) (actual time=4948.627..4948.628 rows=1 loops=1)
   ->  Hash IN Join  (cost=2.22..153877.08 rows=177898 width=8) (actual time=2.262..4940.395 rows=7801 loops=1)
         Hash Cond: ("outer".pvcp = "inner".id)
         ->  Seq Scan on t_oa_2_00_card  (cost=0.00..147695.25 rows=880125 width=12) (actual time=0.040..3850.074 rows=877682 loops=1)
         ->  Hash  (cost=2.17..2.17 rows=19 width=4) (actual time=0.073..0.073 rows=1 loops=1)
               ->  Seq Scan on l_pvcp  (cost=0.00..2.17 rows=19 width=4) (actual time=0.052..0.067 rows=1 loops=1)
                     Filter: (value ~~* '%pi%'::text)
 Total runtime: 4948.717 ms
(8 rows)

and nothing changed.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Michael Fuhr wrote:
> On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
>> Have you tried increasing the statistics target on l_pvcp.value?
>> I ran your queries against canned data in 8.2.3 and better statistics
>> resulted in more accurate row count estimates for this and other
>> parts of the plan. I don't recall if estimates for non-leading-character
>> matches in earlier versions can benefit from better statistics.
>
> This might work only in 8.2. I see the following in the Release Notes:
>
> * Improve the optimizer's selectivity estimates for LIKE, ILIKE,
>   and regular expression operations (Tom)

I will try the same select on 8.2 (that one was 8.1) and I'll let you know.

Regards
Gaetano Mendola
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> The match of 19 for '%pi%' is estimated; the real matches are:
>
> test=# select id from l_pvcp where value ilike '%pi%';
>  id
> ----
>  62
> (1 row)
>
> test=# select id from l_pvcp where value ilike 'pi';
>  id
> ----
>  62
> (1 row)
>
> so one row in both cases; that's why I expect the same plan for both.

Ah, but it's got no way of knowing what matches you'll get for
'%anything%'. There's no easy way to get statistics for matching
substrings.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
>> You are right in that the page size constraint is lifted in that
>> directio cuts out the VM filesystem cache. However, the Solaris
>> kernel still issues io ops in terms of its logical block size (which
>> we have at the default 8K). It can issue io ops for fragments as
>> small as 1/8th of the block size, but Postgres issues its io requests
>> in terms of the block size, which means that io ops from Postgres
>> will be in 8K chunks, which is exactly what we see when we look at
>> our system io stats. In fact, if any io request is made that isn't a
>> multiple of 512 bytes (the disk sector size), the file system
>> switches back to buffered io.
>
> Oh, yes, of course! Yes, you still need to respect a multiple of 512
> bytes block size on read and write - sorry, I was tired :)
>
> Then it seems to be true - the default XLOG block size is 8K, which
> means that even for a small auto-committed transaction we write 8K?...
> Is there any reason to use such a big default block size?...
>
> Probably it would be a good idea to make it an 'initdb' parameter and
> have such a value per database server?
>
> Rgds,
> -Dimitri
>
> However, to understand the TX number mystery I think the only possible
> solution is to reproduce a small live test:
>
> (I'm sure you're aware you can mount/unmount forcedirectio dynamically?)
>
> During a stable workload do:
>
>   # mount -o remount,logging /path_to_your_filesystem
>
> and check if the I/O volume increases along with the TX numbers,
> then come back:
>
>   # mount -o remount,forcedirectio /path_to_your_filesystem
>
> and see if the I/O volume decreases along with the TX numbers...

That's an excellent idea and I'll run it by the rest of our team tomorrow.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Gaetano Mendola wrote:
> Michael Fuhr wrote:
>> On Fri, Mar 30, 2007 at 04:46:11AM -0600, Michael Fuhr wrote:
>>> Have you tried increasing the statistics target on l_pvcp.value?
>>> I ran your queries against canned data in 8.2.3 and better statistics
>>> resulted in more accurate row count estimates for this and other
>>> parts of the plan. I don't recall if estimates for
>>> non-leading-character matches in earlier versions can benefit from
>>> better statistics.
>>
>> This might work only in 8.2. I see the following in the Release Notes:
>>
>> * Improve the optimizer's selectivity estimates for LIKE, ILIKE,
>>   and regular expression operations (Tom)
>
> I will try the same select on 8.2 (that one was 8.1) and I'll let you know.

You will also need to set the statistics for the column to at least 100
to trigger the improved selectivity estimate, if memory serves. Not
enough time to check the code, but Tom could better advise.

Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
On Mar 30, 2007, at 8:14 AM, Dimitri wrote:
>> You are right in that the page size constraint is lifted in that
>> directio cuts out the VM filesystem cache. However, the Solaris
>> kernel still issues io ops in terms of its logical block size (which
>> we have at the default 8K). It can issue io ops for fragments as
>> small as 1/8th of the block size, but Postgres issues its io requests
>> in terms of the block size, which means that io ops from Postgres
>> will be in 8K chunks, which is exactly what we see when we look at
>> our system io stats. In fact, if any io request is made that isn't a
>> multiple of 512 bytes (the disk sector size), the file system
>> switches back to buffered io.
>
> Oh, yes, of course! Yes, you still need to respect a multiple of 512
> bytes block size on read and write - sorry, I was tired :)
>
> Then it seems to be true - the default XLOG block size is 8K, which
> means that even for a small auto-committed transaction we write 8K?...
> Is there any reason to use such a big default block size?...
>
> Probably it would be a good idea to make it an 'initdb' parameter and
> have such a value per database server?

I believe it's because that is a pretty normal Unix kernel block size
and you want the two to match.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
The problem is that while your goal is to commit as fast as possible,
it's a pity to waste I/O speed just to keep a common block size... Let's
say your transaction's modifications fit into 512 bytes - you'd be able
to write many more 512-byte blocks per second than 8K blocks (for the
same amount of data)... Even if we probably rewrite the same block
several times as transactions come in, it still costs traffic, and we
end up processing more slowly than the hardware could. I don't think
that's good, no? ;)

Rgds,
-Dimitri

On 3/30/07, Erik Jones <[EMAIL PROTECTED]> wrote:
> On Mar 30, 2007, at 8:14 AM, Dimitri wrote:
> [...]
>
> I believe it's because that is a pretty normal Unix kernel block size
> and you want the two to match.
>
> erik jones <[EMAIL PROTECTED]>
> software developer
> 615-296-0838
> emma(r)
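To put rough numbers on the argument (a sketch that ignores WAL record
headers, group commit, and fsync batching): 10,000 tiny commits per
second each forcing an 8K WAL block is about 80 MB/s of write traffic,
while the same commits at 512-byte sector granularity would be about
5 MB/s -- a 16x difference for the same logical data.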
Re: [PERFORM] Weird performance drop
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Vincenzo Romano
>
> Is there any "workaround"?
>
> In my opinion, the later the query planner decisions are taken, the
> more effective they can be.
> It could be an option for the function (body) to delay any query
> planner decision.

I think a possible workaround is to use a plpgsql function and the
EXECUTE statement. The docs will have more info.

Dave
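A minimal sketch of that approach (the names count_matching, some_table,
and some_col are made up for illustration; EXECUTE ... INTO needs plpgsql
on 8.1 or later): building the query as a string and running it with
EXECUTE makes plpgsql plan it at call time, with the actual parameter
value visible to the planner.

    CREATE OR REPLACE FUNCTION count_matching(pattern text) RETURNS bigint AS $$
    DECLARE
        result bigint;
    BEGIN
        -- EXECUTE plans the statement at run time instead of reusing a
        -- cached generic plan; quote_literal() handles quoting safely.
        EXECUTE 'SELECT count(*) FROM some_table WHERE some_col ILIKE '
             || quote_literal(pattern)
        INTO result;
        RETURN result;
    END;
    $$ LANGUAGE plpgsql;

    SELECT count_matching('%pi%');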
Re: [PERFORM] Wrong plan: sequential scan instead of an index one
Richard Huxton writes:
> Ah, but it's got no way of knowing what matches you'll get for
> '%anything%'. There's no easy way to get statistics for matching
> substrings.

8.2 actually tries the match on the most-common-values list, if said
list is big enough (I think the threshold is stats target = 100). Not
sure if that will help here, though.

regards, tom lane
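To see whether a column would qualify, one can inspect its per-column
target and the stored MCV list (a sketch; attstattarget is -1 when the
column falls back to default_statistics_target):

    SELECT attstattarget
    FROM pg_attribute
    WHERE attrelid = 'l_pvcp'::regclass AND attname = 'value';

    SELECT most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'l_pvcp' AND attname = 'value';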
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
On Mar 30, 2007, at 10:05 AM, Kenneth Marshall wrote:
> On Fri, Mar 30, 2007 at 04:25:16PM +0200, Dimitri wrote:
>> The problem is that while your goal is to commit as fast as possible,
>> it's a pity to waste I/O speed just to keep a common block size...
>> [...]
>
> With block sizes you are always trading off overhead versus space
> efficiency. Most OSes write only in 4k/8k blocks to the underlying
> hardware regardless of the size of the write you issue. Issuing 16
> 512-byte writes has much more overhead than one 8k write. On the light
> transaction end, there is no real benefit to a small write, and it
> will slow performance in high-throughput environments. It would be
> better to batch I/O, and I think that someone is looking into that.
>
> Ken

True, and really, considering that data is only written to disk by the
bgwriter and at checkpoints, writes are already somewhat batched. Also,
Dimitri, I feel I should backtrack a little and point out that it is
possible to have Postgres write in 512-byte blocks (at least for UFS,
which is what's in my head right now) if you set the system's logical
block size to 4K and fragment size to 512 bytes, and then set Postgres's
BLCKSZ to 512 bytes. However, as Ken has just pointed out, what you gain
in space efficiency you lose in performance, so if you're working with a
high-traffic database this wouldn't be a good idea.

erik jones <[EMAIL PROTECTED]>
software developer
615-296-0838
emma(r)
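As an aside, a running server reports its compiled-in data block size;
if memory serves, 8.1 and later expose it as a read-only setting
(a sketch):

    SHOW block_size;  -- reports the compiled-in BLCKSZ, typically 8192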
[PERFORM] scalability problem
Hi all,

When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the system
is not scalable. My machine has 8GB memory and 4 Xeon dual-core processors
(8 cores in total). The OS kernel is Linux 2.6.9. PostgreSQL is 7.3.18. I
run multiple q2 queries simultaneously. The results are:

1 process takes 0.65 seconds to finish.
2 processes take 1.07 seconds.
4 processes take 4.93 seconds.
8 processes take 16.95 seconds.

For the 4-process and 8-process cases, the queries take even more time
than if they were executed serially one after another. Because the system
has 8GB memory, which is much bigger than the DB size (SF=1), and I warmed
the cache before I ran the test, I do not think the problem was caused by
disk I/O. I think it might be caused by some contention, but I do not
know PostgreSQL well. Can anybody give me some clue to finding the
reasons?

Thanks!

Xiaoning
Re: [PERFORM] scalability problem
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Xiaoning Ding
>
> Hi all,
>
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.

Is there any way you can upgrade to 8.2? There have been a lot of
performance and scalability enhancements.
Re: [PERFORM] scalability problem
Xiaoning Ding <[EMAIL PROTECTED]> writes:
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.

If you are not running PG 8.1 or later, it's really not worth your time
to test this. Multiprocessor scalability was hardly even on the radar in
7.3 days.

regards, tom lane
Re: [PERFORM] scalability problem
Thanks guys,

I updated PG to 8.2.3. The results are much better now:

1 process:   0.94 seconds
2 processes: 1.32 seconds
4 processes: 2.03 seconds
8 processes: 2.54 seconds

Do you think they are good enough? BTW, where can I find some info on
what 8.2.3 did to improve scalability compared with pre-8.1 versions?

Xiaoning

Tom Lane wrote:
> Xiaoning Ding <[EMAIL PROTECTED]> writes:
>> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
>> system is not scalable. My machine has 8GB memory and 4 Xeon
>> dual-core processors (8 cores in total). The OS kernel is Linux
>> 2.6.9. PostgreSQL is 7.3.18.
>
> If you are not running PG 8.1 or later, it's really not worth your
> time to test this. Multiprocessor scalability was hardly even on the
> radar in 7.3 days.
>
> regards, tom lane
Re: [PERFORM] scalability problem
On Fri, 2007-03-30 at 15:25, Xiaoning Ding wrote:
> Hi all,
>
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18. I run multiple q2 queries simultaneously.
> The results are:
>
> 1 process takes 0.65 seconds to finish.
> 2 processes take 1.07 seconds.
> 4 processes take 4.93 seconds.
> 8 processes take 16.95 seconds.
>
> For the 4-process and 8-process cases, the queries take even more time
> than if they were executed serially one after another. Because the
> system has 8GB memory, which is much bigger than the DB size (SF=1),
> and I warmed the cache before I ran the test, I do not think the
> problem was caused by disk I/O.

You may be right, you may be wrong. What did top / vmstat have to say
about IO wait / disk idle time? PostgreSQL has to commit transactions to
disk. TPC-H runs mostly-read business decision queries, but it mixes in
writes as well. If you have one hard drive, it may well be that activity
is stacking up waiting on those writes.

> I think it might be caused by some contention, but I do not know
> PostgreSQL well. Can anybody give me some clue to finding the reasons?

Others have mentioned your version of PostgreSQL. 7.3 is quite old, as
it came out at the end of 2002. Seeing as 7.3 is the standard pgsql
version supported by RHEL3, and RHEL came with a 2.6.9 kernel, I'm gonna
guess your OS is about that old too. pgsql 7.3 cannot take advantage of
lots of shared memory and has some issues scaling to lots of CPUs /
processes. While RHEL won't be EOLed for a few more years (Red Hat
promises 7 years, I think) it's really not a great choice for getting
started today. RHEL5 just released and RHEL4 is very stable.

There are several things to look at to get better performance:

1: Late model PostgreSQL. Go with 8.2.3, or 8.1.8 as a minimum.
2: Late model Unix.
3: RAID controller with battery-backed cache.
4: Plenty of memory.
5: Lots of hard drives.
6: 4 to 8 CPUs.

Then, google postgresql performance tuning. There are three or four good
tuning guides that pop up right at the top; a few illustrative first-pass
settings are sketched below.
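For illustration only, the first-pass postgresql.conf changes those
guides usually suggest look something like this on an 8GB box running
8.2.x (the values are assumptions to tune from, not recommendations):

    # postgresql.conf sketch for an 8GB machine on 8.2.x
    shared_buffers = 1GB          # 8.2 accepts memory units
    work_mem = 32MB               # per-sort / per-hash memory
    effective_cache_size = 6GB    # planner hint: OS + PG cache size
    checkpoint_segments = 32      # spread out checkpoint I/O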
Re: [PERFORM] scalability problem
On Fri, 2007-03-30 at 16:38, Xiaoning Ding wrote:
> Thanks guys,
>
> I updated PG to 8.2.3. The results are much better now:
>
> 1 process:   0.94 seconds
> 2 processes: 1.32 seconds
> 4 processes: 2.03 seconds
> 8 processes: 2.54 seconds
>
> Do you think they are good enough? BTW, where can I find some info on
> what 8.2.3 did to improve scalability compared with pre-8.1 versions?

Very nice, eh? I'd say look through -hackers and -performance to see
some of it, but as usual, the source code is the reference. You'd be
surprised how well commented it is.
Re: [PERFORM] Shared buffers, db transactions committed, and write IO on Solaris
Erik,

> You're welcome! However, I believe our situation is very different
> from what you're testing, if I understand you correctly. Are you
> saying that your entire database will fit in memory? If so, then
> these are very different situations, as there is no way ours could
> ever do that. In fact, I'm not sure that forcedirectio would really
> net you any gain in that situation, as the IO service time will be
> basically nil if the filesystem cache doesn't have to page, which I
> would think is why you're seeing what you are.

Even more interesting. I guess we've been doing too much work with
benchmark workloads, which tend to be smaller databases. Thing is,
there's *always* I/O for a read/write database. If nothing else, updates
have to be synched to disk.

Anyway ... regarding the mystery transactions ... are you certain that
it's not your application? I can imagine that, if your app has a fairly
tight retry interval for database non-response, I/O sluggishness could
result in commit attempts spinning out of control.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] scalability problem
Quoth [EMAIL PROTECTED] (Xiaoning Ding):
> When I run multiple TPC-H queries (DBT3) on PostgreSQL, I found the
> system is not scalable. My machine has 8GB memory and 4 Xeon dual-core
> processors (8 cores in total). The OS kernel is Linux 2.6.9.
> PostgreSQL is 7.3.18.
>
> I think it might be caused by some contention, but I do not know
> PostgreSQL well. Can anybody give me some clue to finding the reasons?

Two primary issues:

1. You're running a horrendously ancient version of PostgreSQL. The 7.3
   series is Really Old. Newer versions have *enormous* improvements
   that are likely to be *enormously* relevant. Upgrade to 8.2.

2. There are known issues with the combination of Xeon processors and
   PAE memory addressing; that sort of hardware tends to be *way* less
   speedy than the specs would suggest.

There have been "context switching" issues on this sort of hardware that
are enormously worsened if you're running on a version of PostgreSQL
that is 4 major releases out of date.

--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com').
http://cbbrowne.com/info/x.html
I am not a number! I am a free man!
Re: [PERFORM] scalability problem
On 30.03.2007, at 19:18, Christopher Browne wrote:
> 2. There are known issues with the combination of Xeon processors and
>    PAE memory addressing; that sort of hardware tends to be *way* less
>    speedy than the specs would suggest.

That is not true, as the current series of processors (Woodcrest and the
like) are also called Xeon. You probably mean the Pentium IV era Xeons.

cug