Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
For archives, Tom commited the patch yesterday: http://archives.postgresql.org/pgsql-committers/2007-11/msg00552.php -- Guillaume ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Tom, On Nov 27, 2007 3:58 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > Please try this patch on your real app (not the dumbed-down test case) > and see what it does for you ... If I disable the cache in the application, the most visited page generates 175 SQL queries, mix of simple and more complicated queries so it's quite "real life" (it's a page which aggregates quite a lot of data - available in production here: http://www.fra.cityvox.fr/guide_lyon/AccueilVille ). The average of 5 runs for this page gives: 1.108 second for 8.2, 1.054 second for 8.3devel with your cache lookup patch. Which gives us 8.3devel+cache patch a bit faster (~5%) than 8.2 on a quite representative set of queries, even with no concurrency at all. That's very good news. Without your last patch but with the set of patches you commited these last days (basically latest CVS tip), 8.3devel is nearly as fast as 8.2 to generate this page but consistently a bit slower. For historical purposes, here are the results of my simple and dumb test case (pgbench -c 10 -n 1 with the custom queries I mentioned earlier): 8.2: 1480 tps 2007-11-01: 1200 tps 2007-11-27: 1420 tps (after your set of patches to simplify the planner for simple queries) 2007-11-27 + cache lookup patch: 2260 tps That said, I checked nearly each of the 175 queries and I still find a few queries a bit slower. I don't know if it's worth digging but as you already found a couple of problems, I prefer mentioning it. For example, after simplification of a query with a few joins which is slower in 8.3devel+patch, I have the following results (consistent over several runs of the query): ** 8.2 ** => SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; Time: 5.659 ms => EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; QUERY PLAN -- Index Scan using idx_newsplanning_numnews_codelang_dfin_ddeb_codeth_niveau on newsplanning np (cost=0.00..752.95 rows=50 width=4) (actual time=4.478..5.029 rows=4 loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text = 'THE'::text) AND (niveau = 1)) Filter: ((date_trunc('day'::text, dfin) >= date_trunc('day'::text, now())) AND (date_trunc('day'::text, ddeb) <= date_trunc('day'::text, now( Total runtime: 5.064 ms ** 8.3devel with cache lookup ** =# SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; Time: 5.932 ms EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; QUERY PLAN -- Index Scan using idx_newsplanning_numnews_codelang_dfin_ddeb_codeth_niveau on newsplanning np (cost=0.00..679.35 rows=46 width=4) (actual time=4.884..5.467 rows=4 loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text = 'THE'::text) AND (niveau = 1)) Filter: ((date_trunc('day'::text, dfin) >= date_trunc('day'::text, now())) AND (date_trunc('day'::text, ddeb) <= date_trunc('day'::text, now( Total runtime: 5.501 ms (4 rows) The schema of the table is the following: => \d newsplanning Table "public.newsplanning" Column |Type | Modifiers ---+-+ idplanning| integer | not null numnews | integer | not null ddeb | timestamp without time zone | not null dfin | timestamp without time zone | not null codeth| character varying(3)| not null niveau| integer | not null ordre | integer | codelang | character varying(3)| not null ddermodif | timestamp without time zone | logindermodif | character varying(20) | codejour | integer | not null default 1 Indexes: "pk_newsplanning" PRIMARY KEY, btree (idplanning) "idx_newsplanning_ddeb_dfin_numnews_niveau_codelang" btree (ddeb, dfin, numnews, niveau, codelang) "idx_newsplanning_numnews_codelang_dfin_dd
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > But I think there must be an action that we can take for 8.3 and that > > much runtime should not be given away easily. ISTM that we can win back > > the losses Guillaume has identified, plus gain a little more even. > > Perhaps some sanity could be restored to this discussion by pointing out > that the 2007-01-01 code *also* clocks in at 37% spent in > oper_select_candidate. IOW it's been like this for a very long time. > I'm not interested in destabilizing 8.3 with panicky last-minute patches. > > > So how about we have a cache-of-one: > > Cache-of-one has exactly the same difficulty as cache-of-many, other > than the table lookup itself, which is a solved problem (hashtable). > You still have to determine how you identify the cached value and what > events require a cache flush. Nor do I see any particular reason to > assume that a cache of only one operator would be of any use for > real-world apps, as opposed to toy examples. Seems like anytime a function like that takes 37%, there is something wrong. Are we sure there isn't a bug in there somewhere? As far as a cache, could we create a simple cache that remembered the last X lookups and cleared the cache anytime a cache invalidation message came in? -- Bruce Momjian <[EMAIL PROTECTED]>http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 18:18 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Here's where I am: > > > Basic test was to replace call to oper_select_candidate with a single > > item that was fed by a hardcoded value for varchar equality operator. > > Well, that confirms what we knew from gprof, but surely you aren't > proposing that as a usable patch. gprof might not have translated into a usable gain, but clearly it can. That's not a proposed patch, just showing my results. > > What I'm actually proposing is a patch implementing a oper_select_hook > > function pointer, which allows the user to do anything they want. > > Why in the world would that be a good idea? Short answer: it makes it go faster? You asked. ;-) Long answer: We all agree the operator cache is the best answer, yet don't wish to delay the project or make it less robust. The best answer is a plugin approach that lets users take the risk and make the gain. We can't hardcode it for everybody because that runs completely against the grain of Postgres. Including this as a plugin allows people to make their own decisions about cacheing/hardcoding. If you are the unlucky owner of a database with a heavy read workload and lots of VARCHAR keys then you're going to want this. The plugin allows writing a one-slot cache that is never flushed. If you choose to override the operators then you'd need to reconnect. It also allows some performance tuning in other cases too, so having it as a general case makes sense. The overhead of implementing it this way is very close to zero and the code path doesn't even get called in the integers-as-keys cases. I don't really like all of this, but that much gain is too much for me to ignore. Better ideas eagerly accepted, and encouraged. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Simon Riggs <[EMAIL PROTECTED]> writes: > Here's where I am: > Basic test was to replace call to oper_select_candidate with a single > item that was fed by a hardcoded value for varchar equality operator. Well, that confirms what we knew from gprof, but surely you aren't proposing that as a usable patch. > What I'm actually proposing is a patch implementing a oper_select_hook > function pointer, which allows the user to do anything they want. Why in the world would that be a good idea? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > But I think there must be an action that we can take for 8.3 and that > > much runtime should not be given away easily. ISTM that we can win back > > the losses Guillaume has identified, plus gain a little more even. > > Perhaps some sanity could be restored to this discussion by pointing out > that the 2007-01-01 code *also* clocks in at 37% spent in > oper_select_candidate. IOW it's been like this for a very long time. [I'm replying to the wrong message, I know.] Here's where I am: Basic test was to replace call to oper_select_candidate with a single item that was fed by a hardcoded value for varchar equality operator. This is the oper_cache.v1.patch enclosed; the 5 line patch. Test results were - w/o patch ~10,500 tps with pgbench_varchar.sql - with patch~15,500 tps with pgbench_varchar.sql (**big gain**) - w/o patch ~16,250 tps with pgbench_integer.sql - with patch~16,250 tps with pgbench_integer.sql Tables are standard pgbench, varchar test table created using: create table av as select aid::varchar, bid, abalance, filler from accounts; create unique index av_pkey on av (aid); The impact of calling oper_select_candidate() is big enough that it will affect any query that is read only and has 1 or 2 predicates when at least one of them is a VARCHAR_col = const query. What I'm actually proposing is a patch implementing a oper_select_hook function pointer, which allows the user to do anything they want. I'm just re-writing that as a plugin now, but the backend patch is included here for discussion. oper_select_hook.v1.patch -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com Index: src/backend/parser/parse_oper.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v retrieving revision 1.98 diff -c -r1.98 parse_oper.c *** src/backend/parser/parse_oper.c 22 Nov 2007 19:40:25 - 1.98 --- src/backend/parser/parse_oper.c 26 Nov 2007 16:28:42 - *** *** 28,33 --- 28,34 #include "utils/syscache.h" #include "utils/typcache.h" + static Oid operOidCache = InvalidOid; static Oid binary_oper_exact(List *opname, Oid arg1, Oid arg2); static FuncDetailCode oper_select_candidate(int nargs, *** *** 516,534 /* No operators found? Then fail... */ if (clist != NULL) { ! /* ! * Unspecified type for one of the arguments? then use the other ! * (XXX this is probably dead code?) ! */ ! Oid inputOids[2]; ! if (rtypeId == InvalidOid) ! rtypeId = ltypeId; ! else if (ltypeId == InvalidOid) ! ltypeId = rtypeId; ! inputOids[0] = ltypeId; ! inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, &operOid); } } --- 517,543 /* No operators found? Then fail... */ if (clist != NULL) { ! if (OidIsValid(operOidCache)) ! operOid = operOidCache; ! else ! { ! /* ! * Unspecified type for one of the arguments? then use the other ! * (XXX this is probably dead code?) ! */ ! Oid inputOids[2]; ! ! if (rtypeId == InvalidOid) ! rtypeId = ltypeId; ! else if (ltypeId == InvalidOid) ! ltypeId = rtypeId; ! inputOids[0] = ltypeId; ! inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, &operOid); ! if (ltypeId == 1043 && rtypeId == 705) ! operOidCache = operOid; ! } } } \set naccounts 10 * :scale \setrandom aid 1 :naccounts SELECT abalance FROM accounts WHERE aid = :aid; \set naccounts 10 * :scale \setrandom aid 1 :naccounts SELECT abalance FROM av WHERE aid = ':aid'; Index: src/backend/parser/parse_oper.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/parser/parse_oper.c,v retrieving revision 1.98 diff -c -r1.98 parse_oper.c *** src/backend/parser/parse_oper.c 22 Nov 2007 19:40:25 - 1.98 --- src/backend/parser/parse_oper.c 26 Nov 2007 23:08:09 - *** *** 28,33 --- 28,34 #include "utils/syscache.h" #include "utils/typcache.h" + oper_select_hook_type oper_select_hook = NULL; static Oid binary_oper_exact(List *opname, Oid arg1, Oid arg2); static FuncDetailCode oper_select_candidate(int nargs, *** *** 528,534 ltypeId = rtypeId; inputOids[0] = ltypeId; inputOids[1] = rtypeId; ! fdresult = oper_select_candidate(2, inputOids, clist, &operOid); } } --- 529,544 ltypeId = rtypeId; inputOids[0] = ltypeId; inputOids[1] = rtypeId; ! ! /* ! * Allow user defined operator selection, allowing hard-coding ! * cacheing or other mechanisms of operator selection for ! * improved performance in certain circumstances ! */ ! if (oper_select_hook) ! fdresult =
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Kevin Grittner" <[EMAIL PROTECTED]> writes: > Are we likely to see the 3% or the 7% performance degradation with > version 8.3? Probably not, since it sounds like your queries are typically not as trivial as the ones in Guillame's test case. IOW there will be some slowdown but it's likely to be in the noise for you, and also very likely made up by improvements elsewhere. The test case is in the unfortunate position of not being helped materially by *any* of the work we've done for 8.3. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
>>> On Mon, Nov 26, 2007 at 1:04 PM, in message <[EMAIL PROTECTED]>, "Guillaume Smet" <[EMAIL PROTECTED]> wrote: > On Nov 26, 2007 6:48 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> In the test case Guillame provided, every single WHERE clause happens >> to be of the form >> varchar_column = 'unknown-type literal' >> and there are no other operators used in the SELECT lists; but I can >> hardly believe that this is representative of any significant number >> of real-world applications. Even pgbench uses more than one operator. > > Sure. The application uses a lot of other operators (timestamp > comparison, cube operators, LIKE, boolean comparison...). But as the > primary keys are all varchar in this application, the operator varchar > = unknown is by far the most used. > > If we can't find a solution for upstream 8.3, I'll be happy to apply a > local patch for this customer (if I'm sure it doesn't do any harm in > their case). > > Usually, people use integer flavoured primary keys so they probably > won't be hit by this problem as strong as we are. I've been watching this with some concern, as we only use synthetic keys when there is no naturally occurring set of columns which uniquely identify a row. In our circuit court software (where we have the most users), there are 377 permanent tables, mostly with compound keys: keycolcount | count -+--- 1 |91 2 | 129 3 |88 4 |46 5 |20 6 | 3 (6 rows) We have a lot of varchar columns within those keys, although almost all are actually declared via a domain: basetype | count --+--- date |34 int | 146 numeric | 8 smallint | 365 varchar | 362 (5 rows) Complex queries usually join on at least one varchar column and one smallint column. Selection criteria usually involve comparing to a character string literal, sometimes a date literal, and almost always a smallint. Are we likely to see the 3% or the 7% performance degradation with version 8.3? Is the small patch likely to help with this usage pattern? I'm still trying to get 8.3 performance testing onto our schedule before the end (fast approaching, I know) of the beta. Is there anything in particular that would be helpful for me to look at regarding this particular issue in those tests? -Kevin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 6:48 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > In the test case Guillame provided, every single WHERE clause happens > to be of the form > varchar_column = 'unknown-type literal' > and there are no other operators used in the SELECT lists; but I can > hardly believe that this is representative of any significant number > of real-world applications. Even pgbench uses more than one operator. Sure. The application uses a lot of other operators (timestamp comparison, cube operators, LIKE, boolean comparison...). But as the primary keys are all varchar in this application, the operator varchar = unknown is by far the most used. If we can't find a solution for upstream 8.3, I'll be happy to apply a local patch for this customer (if I'm sure it doesn't do any harm in their case). Usually, people use integer flavoured primary keys so they probably won't be hit by this problem as strong as we are. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Gregory Stark <[EMAIL PROTECTED]> writes: > A 5-line patch which improves performance by 40% for any case sounds amazing, > but how fragile is that gain? The kind of thing which would be worryign is if > runing a query which uses both varchar and some other ambiguous operator > causes it to lose all its gain. Yeah, exactly. If we're going to risk anything like this at all, the cache-of-one restriction is simply not acceptable (especially given that the part of the coding it would eliminate is the simplest and easiest-to-get-right part). In the test case Guillame provided, every single WHERE clause happens to be of the form varchar_column = 'unknown-type literal' and there are no other operators used in the SELECT lists; but I can hardly believe that this is representative of any significant number of real-world applications. Even pgbench uses more than one operator. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Simon Riggs" <[EMAIL PROTECTED]> writes: > I've written up my suggestion as a 5 line patch, with measured > performance improvement of about 40% for the varchar case. A 5-line patch which improves performance by 40% for any case sounds amazing, but how fragile is that gain? The kind of thing which would be worryign is if runing a query which uses both varchar and some other ambiguous operator causes it to lose all its gain. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 5:58 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > I've written up my suggestion as a 5 line patch, with measured > performance improvement of about 40% for the varchar case. > > It isn't a great long term fix, but I don't want to delay 8.3 either > with the full fix. Can we see the patch? Thanks. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > But I think there must be an action that we can take for 8.3 and that > > much runtime should not be given away easily. ISTM that we can win back > > the losses Guillaume has identified, plus gain a little more even. > > Perhaps some sanity could be restored to this discussion by pointing out > that the 2007-01-01 code *also* clocks in at 37% spent in > oper_select_candidate. IOW it's been like this for a very long time. > I'm not interested in destabilizing 8.3 with panicky last-minute patches. I think this is worth taking action on, IMHO. I've written up my suggestion as a 5 line patch, with measured performance improvement of about 40% for the varchar case. It isn't a great long term fix, but I don't want to delay 8.3 either with the full fix. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Simon Riggs <[EMAIL PROTECTED]> writes: > But I think there must be an action that we can take for 8.3 and that > much runtime should not be given away easily. ISTM that we can win back > the losses Guillaume has identified, plus gain a little more even. Perhaps some sanity could be restored to this discussion by pointing out that the 2007-01-01 code *also* clocks in at 37% spent in oper_select_candidate. IOW it's been like this for a very long time. I'm not interested in destabilizing 8.3 with panicky last-minute patches. > So how about we have a cache-of-one: Cache-of-one has exactly the same difficulty as cache-of-many, other than the table lookup itself, which is a solved problem (hashtable). You still have to determine how you identify the cached value and what events require a cache flush. Nor do I see any particular reason to assume that a cache of only one operator would be of any use for real-world apps, as opposed to toy examples. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 11:59 AM, Peter Eisentraut <[EMAIL PROTECTED]> wrote: > I have seen many applications where prepared queries caused stale plans and > poor performance. We have in many cases achieved great performance gains by > turning off prepared queries globally, for example in the driver layer. It > had once gotten to a point where "uses prepared statements" was on par > with "never touched postgresql.conf" and "never heard of VACUUM" as the worst > performance sins. I will gladly revisit this with 8.3, but I am not very > optimistic. +1 on this point as I explained it earlier. -- Guillaume ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Am Samstag, 24. November 2007 schrieb Simon Riggs: > In many cases, 100% of queries are unprepared. I have seen many applications where prepared queries caused stale plans and poor performance. We have in many cases achieved great performance gains by turning off prepared queries globally, for example in the driver layer. It had once gotten to a point where "uses prepared statements" was on par with "never touched postgresql.conf" and "never heard of VACUUM" as the worst performance sins. I will gladly revisit this with 8.3, but I am not very optimistic. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Sun, 2007-11-25 at 19:35 -0500, Tom Lane wrote: > The cost of resolving ambiguous operators has been an issue for a long > time, of course, but it seems particularly bad in this case --- gprof > blames 37% of the runtime on oper_select_candidate(). It might be time > to think about caching the results of operator searches somehow. Too > late for 8.3 though. Wow: 37%. "varchar_column = const" is a very, very common predicate. 37% is enough to still be visible for a wide range of queries, not just the very simple. I think queries with WHERE clauses like ("int4_column = int4_const" AND "varchar_column = const") will also be noticeably affected this. So even when we have integer keys, we will still get slowed down by an checks to an additional status column. Caching is the right way around this, though as you point out, that is not an option for 8.3. But I think there must be an action that we can take for 8.3 and that much runtime should not be given away easily. ISTM that we can win back the losses Guillaume has identified, plus gain a little more even. Can we just hard-code the varchar lookup? Ugly, but it will add almost nothing to non-varchar paths and yet speed-up the varchar lookup dramatically. I guess the objection to that will be that it prevents people from overloading the = operator for varchars to change the selectivity functions etc. So how about we have a cache-of-one: we store the best varchar = operator after the first lookup, then document that if people overload this then they must reconnect. That's an acceptable pain for the few people affected and a great benefit for the most people. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> But since your columns are varchar, which doesn't have any operators of its >> own, we have to go through oper_select_candidate() > I wonder whether at some point we shouldn't just eliminate this distinction > entirely. Just make "text" and "varchar" the same type and spell it "text" > when there's no typmod length restriction and "varchar(x)" when there is. I've thought about that more than once, but I'm worried that it would eliminate one of the few heavily-used cases we have for binary-compatible operations, thereby making it even harder to find performance issues for those situations. In any case, it wouldn't do anything to fix the basic problem that ambiguous-operator resolution is expensive when there are lots of similarly-named operators. We've chipped away at that with various hacks over the years, but I don't think it's ever occurred to us (or at least to me) before to try short-circuiting the entire process through a lookaside cache. We'd probably need to flush the cache on changes in pg_operator or pg_cast, but neither of those change often ... regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Tom Lane" <[EMAIL PROTECTED]> writes: > But since your columns are varchar, which doesn't have any operators of its > own, we have to go through oper_select_candidate() I wonder whether at some point we shouldn't just eliminate this distinction entirely. Just make "text" and "varchar" the same type and spell it "text" when there's no typmod length restriction and "varchar(x)" when there is. > 1. Datatype bloat: there are 58 "=" operators in pg_operator today, > versus 54 at the beginning of the year. That's 7% more work right > there to sort through the additional operators. That's particularly scary because it means that databases which load piles of contrib modules have that much more of an effect here. Some contrib modules create a *lot* of operators. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 1:54 AM, Guillaume Smet <[EMAIL PROTECTED]> wrote: > From what you say, I understand we can't even find a workaround for > 8.3 to improve the situation while waiting for a cleaner solution in > 8.4+? To explain the reasons why I'm so worried, I should have explained that all the primary keys of this particular database are varchar... So it's really the worst case for this problem. -- Guillaume ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 26, 2007 1:35 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > The cost of resolving ambiguous operators has been an issue for a long > time, of course, but it seems particularly bad in this case --- gprof > blames 37% of the runtime on oper_select_candidate(). It might be time > to think about caching the results of operator searches somehow. Too > late for 8.3 though. >From what you say, I understand we can't even find a workaround for 8.3 to improve the situation while waiting for a cleaner solution in 8.4+? At least, I'm glad we finally found an explanation for this problem. Thanks. -- Guillaume ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > Sure, it's the same queries I posted earlier. My pgbench script is the > following: > BEGIN > select libvil from vilsitelang where codelang='FRA' and codevil='LYO' > select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l, > TL.motcleoverture_c, TL.baselinetheme from themelang TL where > TL.codeth = 'ASS' and TL.codelang = 'FRA' > SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite > WHERE codevil = 'LYO' > select libvil from vilsitelang where codelang='FRA' and codevil='LYO' > END I poked into this a bit, and it seems the extra overhead is all coming from resolving the ambiguous "=" operators. That didn't show up in my test because my query had "int4_column = int4_const" which is an exact match to a pg_operator entry. But since your columns are varchar, which doesn't have any operators of its own, we have to go through oper_select_candidate(), which is noticeably slower than before. The slowdown seems to have two causes: 1. Datatype bloat: there are 58 "=" operators in pg_operator today, versus 54 at the beginning of the year. That's 7% more work right there to sort through the additional operators. 2. Removal of pg_cast entries associated with explicit varchar coercions: when there's not a pg_cast entry for the desired coercion, find_coercion_pathway does a second catalog lookup to see if it might be an array case. That happens more often in this test case than it did at the start of the year, because I got rid of pg_cast entries that could be replaced by the generic CoerceViaIO mechanism. I'm not sure how big a hit #2 really is. Presumably the removal of the redundant entries has some distributed savings associated with it, which would partially counteract the extra lookup; but I don't have any tools that can isolate the cost of those particular SearchSysCache calls out of all the rest. In any case, #2 is specific to varchar and text while effect #1 is an issue for just about everything. The cost of resolving ambiguous operators has been an issue for a long time, of course, but it seems particularly bad in this case --- gprof blames 37% of the runtime on oper_select_candidate(). It might be time to think about caching the results of operator searches somehow. Too late for 8.3 though. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 25, 2007 7:35 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > What about the table schemas? The schema is in the dump. That's the main reason why I've sent the dump to both of you. Anyway, here it is: db=# \d themelang Table "public.themelang" Column | Type | Modifiers --++--- codeth | character varying(3) | not null codelang | character varying(3) | not null libtheme | character varying(70) | not null motsclesmetatags | character varying(500) | descriptifmeta | character varying(500) | motcleoverture_l | character varying(100) | motcleoverture_c | character varying(100) | liencourt| character varying(30) | baselinetheme| character varying(300) | ordrenewsletter | integer| Indexes: "pk_themelang" PRIMARY KEY, btree (codeth, codelang) db=# \d vilsite Table "public.vilsite" Column| Type | Modifiers -+---+ codevil | character varying(3) | not null codelang| character varying(3) | not null codepays| character varying(3) | not null regionwap | character varying(3) | codedep | character varying(3) | codetypevil | character varying(1) | not null ouverte | integer | not null codeinteret | character varying(1) | not null familleville| character varying(1) | not null codevilpostale | character varying(8) | capitale| integer | not null codemeteodirect | character varying(5) | libvilpubwoo| character varying(10) | population | integer | codeinsee | character varying(5) | petiteville | integer | not null default 0 logincommercial | character varying(20) | logincommercialprec | character varying(20) | Indexes: "pk_vilsite" PRIMARY KEY, btree (codevil) db=# \d vilsitelang Table "public.vilsitelang" Column | Type | Modifiers --+---+--- codevil | character varying(3) | not null codelang | character varying(3) | not null libvil | character varying(50) | not null trafic | integer | Indexes: "pk_vilsitelang" PRIMARY KEY, btree (codevil, codelang) Foreign-key constraints: "fk_vilsitel_vilsitela_vilsite" FOREIGN KEY (codevil) REFERENCES vilsite(codevil) ON DELETE CASCADE -- Guillaume ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > On Nov 25, 2007 6:41 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> Would you show us the test case you're using? > Sure, it's the same queries I posted earlier. What about the table schemas? > I send a link to the data to both of you in private. I doubt that the specific data is important. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 25, 2007 6:41 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > Would you show us the test case you're using? Sure, it's the same queries I posted earlier. My pgbench script is the following: BEGIN select libvil from vilsitelang where codelang='FRA' and codevil='LYO' select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l, TL.motcleoverture_c, TL.baselinetheme from themelang TL where TL.codeth = 'ASS' and TL.codelang = 'FRA' SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite WHERE codevil = 'LYO' select libvil from vilsitelang where codelang='FRA' and codevil='LYO' END I send a link to the data to both of you in private. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > Using pgbench -n -S -c 10 -t 10, I also have CVS tip as fast as > CVS from january. But using my set of queries, it's not. Would you show us the test case you're using? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > Using pgbench -n -S -c 10 -t 10, I also have CVS tip as fast as > CVS from january. But using my set of queries, it's not. Were you ever able to send your queries? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 5:50 PM, Gregory Stark <[EMAIL PROTECTED]> wrote: > It would be nice to have infrastructure similar to the buldfarm running a > standard set of benchmarks every day. It would be fascinating to see the > graphs day-by-day of performance. Hopefully we wouldn't see too many dips and > just a steady increase over time. FYI, in addition to the 4 boxes donated to PostgreSQLFr (which will be used for hosting purposes AFAIK), Continuent Inc. donated 7 servers to Open Wide for PostgreSQL community usage. They will be hosted here at Open Wide (Lyon, France) as soon as our new datacenter will be available (we're a little short at rack space ATM; my target is to have them setup at end of January). My main goal for these servers is to set up a PostgreSQL benchmark lab with automatic daily benchmarks, probably simple ones at the beginning (various pgbench settings using one, two, three clients on one server to get concurrency) but I hope we will be able to set up more representative ones little by little. I can provide simple results by myself but I'm sure Mark Wong's and others' experience will be highly valuable to get more detailed results on a daily basis. I won't have the time to setup something similar to the build farm but really want us to have results soon so I'll set up something simple to begin with and if someone has the time to build a bench farm thing, I'll be happy to test it on these boxes and contribute. By the way, we will grant access to these boxes to community people without any problem. I'll give more news on this ASAP. -- Guillaume ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 25, 2007 8:06 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > The TPS numbers bounce around by 1% or so on repeated trials, so I > wouldn't put too much faith in small differences. What it looks like > to me is that it's all about the stats collection overhead. The drop on > 01-17 corresponds to autovac and stats_row_level being turned on by > default. The improvement on 03-02 is the fix for the problem that the > stats collector process wanted to write the stats file way too often, > and the improvement on 04-30 comes from rate-limiting stats messages > from individual backends to the stats collector. Using pgbench -n -S -c 10 -t 10, I also have CVS tip as fast as CVS from january. But using my set of queries, it's not. It's hard to tell what is exactly causing this but the only difference between both is that mine is based on varchars and yours is based on integers so as Greg suggested it, maybe the culprit is the packed varlenas patch. I'll try to measure the overhead of this patch alone. -- Guillaume ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > Here are some rough results: > http://people.openwide.fr/~gsmet/postgresql/postgresql_8.3_development_cycle_1.png I repeated this experiment using the "pgbench -n -S -c 10 -t 10 bench" test case that I've been looking at. The attached graph shows reported TPS for CVS pulls from first-of-the-month dates and dates bracketing selected interesting changes. Here's the raw data being plotted: 2007-01-01 9512.306699 2007-01-16 9388.317681 2007-01-17 6756.486634 2007-02-01 6457.403152 2007-03-01 6379.643242 2007-03-02 6907.368329 2007-04-01 6989.332803 2007-04-29 6908.091429 2007-04-30 9252.289116 2007-05-01 9290.111548 2007-06-01 9199.813641 2007-07-01 9162.253476 2007-08-01 9281.821046 2007-09-01 9123.663541 2007-10-01 9322.775762 2007-11-01 9148.342301 2007-11-25 9663.446883 The TPS numbers bounce around by 1% or so on repeated trials, so I wouldn't put too much faith in small differences. What it looks like to me is that it's all about the stats collection overhead. The drop on 01-17 corresponds to autovac and stats_row_level being turned on by default. The improvement on 03-02 is the fix for the problem that the stats collector process wanted to write the stats file way too often, and the improvement on 04-30 comes from rate-limiting stats messages from individual backends to the stats collector. It might be interesting to deconstruct what else happened between 01-17 and 03-01, but I think that the other month-to-month variances are probably within the noise threshold. regards, tom lane <> ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Sat, 24 Nov 2007, Guillaume Smet wrote: Here are some rough results: http://people.openwide.fr/~gsmet/postgresql/postgresql_8.3_development_cycle_1.png So the big dips were Jan->Feb and April->May. I've still got the text of the commit log Tom assembled sitting at http://developer.postgresql.org/index.php/8.3_Changelog and http://www.westnet.com/~gsmith/content/postgresql/83revhist.txt if anyone wants a relatively easy way to see what happened then. Scanning the January results, the biggest drop, I noted that 2007-01-16 was the day autovacuum was turned on by default. If you're sure it's explicitly disabled in every test that shouldn't matter, might be something worth double-checking. The other commits that looked like a lot of code changes with potential for performance changes were all from Tom, on 2007-01-03, 2007-01-08, and 2007-01-10, and 2007-01-20. If you wanted to try and nail this down further, three dates with no commits that would narrow the source of the big regression(s) that month are 2007-01-07, 2007-01-13, 2007-01-22. 2007-01-17 would be a good additional day to confirm that autovacuum wasn't responsible. packed varlenas were commited during April There were also several changes to pgbench itself that month. Useful breakpoints in that month to subdivide might be 2007-04-08 (after varlena change), 2007-04-17, 2007-04-22. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > On Nov 24, 2007 5:16 PM, Gregory Stark <[EMAIL PROTECTED]> wrote: >> Several of the major changes in 8.3 are I/O vs CPU tradeoffs which could be >> causing a slowdown if you're measuring primarily CPU resources. I'm thinking >> of both HOT and packed varlenas. I don't know if either of these are causing >> your slowdown but it's possible. > > Here are some rough results: > http://people.openwide.fr/~gsmet/postgresql/postgresql_8.3_development_cycle_1.png > > I don't pretend that this bench is realistic but it's a first step. I > run the tests with weird numbers more than 3 times to check that they > are consistent. > > IIRC, packed varlenas were commited during April and HOT at the end of > September. Hm, so your test seems to be showing up a difference due to packed varlena? What functions or operators are you using? Could you send the schema and actual queries? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On 11/24/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > > This is a conflict which will affect Postgres in the future as well. > Generally > I/O costs win over cpu costs in databases since only relatively small > systems > are cpu-bound. Large systems are typically I/O-bound. > That really depends on hardware doesn't it? I'd say that I'm more concerned with CPU than IO on high-end hardware as a generalization, especially with the scaling issues beyond 32 CPU's. Regards, Gavin
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 5:16 PM, Gregory Stark <[EMAIL PROTECTED]> wrote: > Several of the major changes in 8.3 are I/O vs CPU tradeoffs which could be > causing a slowdown if you're measuring primarily CPU resources. I'm thinking > of both HOT and packed varlenas. I don't know if either of these are causing > your slowdown but it's possible. Here are some rough results: http://people.openwide.fr/~gsmet/postgresql/postgresql_8.3_development_cycle_1.png I don't pretend that this bench is realistic but it's a first step. I run the tests with weird numbers more than 3 times to check that they are consistent. IIRC, packed varlenas were commited during April and HOT at the end of September. -- Guillaume ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > And the most important point IMHO is that we must be aware of the > trade-offs we make. We might have some cases where the CPU trade-off > is not worth the I/O improvement (and probably the other case too). > We really need a test framework to be able to perform daily benchmarks > in various situations through the whole release cycle. Well the problem is that what benchmark you choose dictates what areas you need to concentrate on. The industy-standard benchmark so far has been TPC-C which is only really concerned with I/O. Published benchmarks typically have 2-4 processors and hundreds of drives... In the future TPC-E will load up a few more CPU resource hogs like relational integrity checks, but even there it's fundamentally going to be a disk-bound benchmark. > I currently have compiled a version per month from january to now to > perform my own tests (mostly CPU bound). If anyone wants me to perform > specific pgbench load (I know it's not perfect but it's the most > convenient tool we have currently), ping me. The box is only a Core2 > duo box with 2 GB of RAM and a SATA disk. So it's quite easy to be I/O > bound :). It would be nice to have infrastructure similar to the buldfarm running a standard set of benchmarks every day. It would be fascinating to see the graphs day-by-day of performance. Hopefully we wouldn't see too many dips and just a steady increase over time. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 5:16 PM, Gregory Stark <[EMAIL PROTECTED]> wrote: > This is a conflict which will affect Postgres in the future as well. Generally > I/O costs win over cpu costs in databases since only relatively small systems > are cpu-bound. Large systems are typically I/O-bound. It really depends on what you call a small system. In my current project, the database size is 4.6GB. So it's a small database in size which can fit in RAM. *But* it's a highly loaded database with a lot of complex queries (lots of joins, proximity queries and so on) and it's mostly CPU bound. And it's really a critical one. I'd really like to see us find a good compromise between CPU and I/O because CPU bound database aren't uncommon (especially for web usage). And they aren't less critical than I/O bound ones. And the most important point IMHO is that we must be aware of the trade-offs we make. We might have some cases where the CPU trade-off is not worth the I/O improvement (and probably the other case too). We really need a test framework to be able to perform daily benchmarks in various situations through the whole release cycle. I currently have compiled a version per month from january to now to perform my own tests (mostly CPU bound). If anyone wants me to perform specific pgbench load (I know it's not perfect but it's the most convenient tool we have currently), ping me. The box is only a Core2 duo box with 2 GB of RAM and a SATA disk. So it's quite easy to be I/O bound :). -- Guillaume ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > Tom, from my tests, the slow down goes down from 8% to 4% but it's > still there and measurable. It's pretty consistent with the fact that > you only saw a 3% slow down in your tests. > The fact that you had only 3% overhead is still bugging me though. > I'll dig a bit further to see if I can find something interesting or > if there's something wrong with my setup. Several of the major changes in 8.3 are I/O vs CPU tradeoffs which could be causing a slowdown if you're measuring primarily CPU resources. I'm thinking of both HOT and packed varlenas. I don't know if either of these are causing your slowdown but it's possible. HOT reduces the need to vacuum and reduces table and index bloat in exchange for having to do some amount of vacuuming in the critical path. Effectively it's moving some cpu work into the critical path which if you may not be measuring in 8.2 if you weren't including vacuums in your measurements. Packed varlenas reduces space usage -- especially for database which are heavily based on text/varchar and "numeric" -- but at the expense of having to expand those fields whenever they're operated on. For a RAM-resident test that may be a net loss. This is a conflict which will affect Postgres in the future as well. Generally I/O costs win over cpu costs in databases since only relatively small systems are cpu-bound. Large systems are typically I/O-bound. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 7:01 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Guillaume Smet" <[EMAIL PROTECTED]> writes: > > I thought I could also perform a test on CVS head every month from > > December 2006 to now to see if it can give us a better idea of when > > the overhead first appeared. Ping me if you're interested in it. > > If you feel like doing that, it might be interesting just on general > principles ... I'm working on it right now. I should have some results before the end of the week-end. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 11:35 AM, Simon Riggs <[EMAIL PROTECTED]> wrote: > On the plus side, there are many very savvy people out there too and all > the performance features we put in are being used in serious ways. But > we must cater for both the top end and bottom end of the application > spectrum. Totally agree with Simon. PostgreSQL is my database of choice for every application because it's fast, rock solid and highly consistent. I would rather not advice people to use MySQL because their application is too simple, doesn't use prepared statements or any other reason. Moreover, AFAIK, the use of prepared statements is not always a good solution, especially when there are big variations in statistics depending on the input. And I get this overhead with more complicated queries also, queries which won't perform well if I use the same plan for all values of the parameters. And this is not an hypothetical situation as the data of this particular database are far from being equally distributed (a lot of information for big cities, a few for small cities). I must admit I'm used to see every PostgreSQL version going faster than the previous one :). Perhaps, synchronized scans or the optimization of Florian will get the database faster after all. I can't really know at this time. But they have to get my database 4% faster to compensate the current loss. Tom, from my tests, the slow down goes down from 8% to 4% but it's still there and measurable. It's pretty consistent with the fact that you only saw a 3% slow down in your tests. The fact that you had only 3% overhead is still bugging me though. I'll dig a bit further to see if I can find something interesting or if there's something wrong with my setup. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Sat, 2007-11-24 at 00:04 -0500, Tom Lane wrote: > I didn't intend to say that select-only transactions aren't interesting; > rather that there should be some minimal effort on the application side. > The cases we are testing here involve: > > 1. One query per transaction. Even with the 8.3 improvements to reduce > overhead for select-only transactions, this isn't necessarily a good > idea. > > 2. *Extremely* trivial queries --- fetching one row from one table on > the basis of its primary key --- which make you wonder why the > programmer is using a SQL database rather than ndb or some such. > Anyone who's used SQL for any length of time knows that it's better to > push more of the application logic onto the database side, but these > queries are typical of apps that think they should do most of the work. > > 3. No use of prepared statements. Duh, especially in view of #2. There > are reasons to avoid prepared statements in the case that you're issuing > commands that have some intellectual interest for the planner, but these > are not those. > > The whole thing is the worst-case scenario for a DBMS that spends any > real effort on analyzing/planning SQL commands. Well, I've been asked to tune enough applications that fall into this category that I can say this message isn't getting across anywhere near as strongly as you might have thought. There is still good reason to use an RDBMS and Postgres in particular, but still 80-90% (by number) of access would be the types of queries you mention in (2) above, even if they represent about 20-30% of total time/resources. In many cases, 100% of queries are unprepared. So your efforts are well spent. It has also made me think about further optimizations and tuning options in this area for 8.4 On the plus side, there are many very savvy people out there too and all the performance features we put in are being used in serious ways. But we must cater for both the top end and bottom end of the application spectrum. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 12:39 AM, Gregory Stark <[EMAIL PROTECTED]> wrote: > Surely such applications would be using prepare queries though? Surely the good ones. I know you can only save people from themselves to a certain point... I just didn't want to dismiss it completely. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > On Nov 23, 2007 8:52 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> So I'm satisfied with these results, particularly in view of the >> fact that what we're measuring is certainly the stupidest, least >> efficient way to use Postgres. > > Given the emerging world of data-driven content management systems and > select-mostly web applications, I'd hesitate to say that select-only > transactions aren't worth optimizing for. Regardless, glad to hear > it's almost the same. I'll run the script again and see what it > shows. Surely such applications would be using prepare queries though? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 24, 2007 12:04 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > I didn't intend to say that select-only transactions aren't interesting; > rather that there should be some minimal effort on the application side. > The cases we are testing here involve: Agreed, thanks for clarifying. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > On Nov 23, 2007 8:52 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> So I'm satisfied with these results, particularly in view of the >> fact that what we're measuring is certainly the stupidest, least >> efficient way to use Postgres. > Given the emerging world of data-driven content management systems and > select-mostly web applications, I'd hesitate to say that select-only > transactions aren't worth optimizing for. I didn't intend to say that select-only transactions aren't interesting; rather that there should be some minimal effort on the application side. The cases we are testing here involve: 1. One query per transaction. Even with the 8.3 improvements to reduce overhead for select-only transactions, this isn't necessarily a good idea. 2. *Extremely* trivial queries --- fetching one row from one table on the basis of its primary key --- which make you wonder why the programmer is using a SQL database rather than ndb or some such. Anyone who's used SQL for any length of time knows that it's better to push more of the application logic onto the database side, but these queries are typical of apps that think they should do most of the work. 3. No use of prepared statements. Duh, especially in view of #2. There are reasons to avoid prepared statements in the case that you're issuing commands that have some intellectual interest for the planner, but these are not those. The whole thing is the worst-case scenario for a DBMS that spends any real effort on analyzing/planning SQL commands. Even given all that, I think 8.3 would look pretty good if we were throwing enough concurrent queries at it to make Florian's transaction concurrency improvements a factor. But with only 10 concurrent sessions, and pgbench's known limitations for issuing concurrent commands fast enough to keep things busy, there's no opportunity to shine in that dimension either. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 23, 2007 8:52 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > So I'm satisfied with these results, particularly in view of the > fact that what we're measuring is certainly the stupidest, least > efficient way to use Postgres. Given the emerging world of data-driven content management systems and select-mostly web applications, I'd hesitate to say that select-only transactions aren't worth optimizing for. Regardless, glad to hear it's almost the same. I'll run the script again and see what it shows. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
I wrote: > I don't think you need to --- the "read-only transaction" case built > into pgbench is probably an equivalent test. What it looks like to > me is that the EquivalenceClass mechanism has added a little bit of > overhead, which isn't actually buying much of anything in these > trivial cases. I'll look at whether it can be short-circuited. I've knocked down a few bits of low-hanging fruit. What I see with this evening's CVS HEAD is that 8.3 is roughly on par with 8.2 for the "pgbench -S -c 10" case, if you compare them with stats collection turned off. Turning stats collection on slows 8.3 by a percent or so --- but 8.2 takes a very much larger hit with stats collection on, about 25%. So I'm satisfied with these results, particularly in view of the fact that what we're measuring is certainly the stupidest, least efficient way to use Postgres. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > I thought I could also perform a test on CVS head every month from > December 2006 to now to see if it can give us a better idea of when > the overhead first appeared. Ping me if you're interested in it. If you feel like doing that, it might be interesting just on general principles ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 12:45 PM, Guillaume Smet <[EMAIL PROTECTED]> wrote: > I thought I could also perform a test on CVS head every month from > December 2006 to now to see if it can give us a better idea of when > the overhead first appeared. Ping me if you're interested in it. If I recall correctly, I seem to recall this issue coming in around between HOT and the select-only improvement fix. Though, I'm not sure whether I had tested it previously to that. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Tom, On Nov 22, 2007 10:29 AM, Simon Riggs <[EMAIL PROTECTED]> wrote: > Sounds comprehensive, thanks for double checking. > > Would it be possible to do these tests? > > Do you want me to perform additional tests or are you pretty sure of what the problem is? I thought I could also perform a test on CVS head every month from December 2006 to now to see if it can give us a better idea of when the overhead first appeared. Ping me if you're interested in it. -- Guillaume ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 5:00 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > Gregory Stark <[EMAIL PROTECTED]> writes: > > Out of curiosity have you recompiled 8.2.5 recently? That is, are they > > compiled with the same version of gcc? > > CVS tip of both branches, freshly compiled for this test. And in my case, a vanilla 8.2.5 compiled on the same box with the same compiler. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> The weird thing is that after a couple of hours of poking at it with >> oprofile and other sharp objects, I have no idea *why* it's slower. >> oprofile shows just about the same relative percentages for all the >> hot-spot functions in the backend. > Out of curiosity have you recompiled 8.2.5 recently? That is, are they > compiled with the same version of gcc? CVS tip of both branches, freshly compiled for this test. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > On Nov 22, 2007 6:44 AM, Tom Lane <[EMAIL PROTECTED]> wrote: >> Are you examining only "trivial" queries? I've been able to identify a >> couple of new planner hotspots that could explain some slowdown if the >> planning time is material compared to the execution time. If you're >> seeing a slowdown on queries that run for awhile, that would be >> something else ... > Yes, I kept only queries with no join and a couple of where > conditions. As I explained previously, I can reproduce the behavior > with a single index scan on only one table (plan posted previously). > If anyone is interested I can post the content of this table (there's > nothing confidential in it so I should have the customer permission) > and a couple of instructions to reproduce the test case. I don't think you need to --- the "read-only transaction" case built into pgbench is probably an equivalent test. What it looks like to me is that the EquivalenceClass mechanism has added a little bit of overhead, which isn't actually buying much of anything in these trivial cases. I'll look at whether it can be short-circuited. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Tom Lane" <[EMAIL PROTECTED]> writes: > The weird thing is that after a couple of hours of poking at it with > oprofile and other sharp objects, I have no idea *why* it's slower. > oprofile shows just about the same relative percentages for all the > hot-spot functions in the backend. Out of curiosity have you recompiled 8.2.5 recently? That is, are they compiled with the same version of gcc? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Thu, 2007-11-22 at 00:30 +0100, Guillaume Smet wrote: > > Is the data identical on both systems? Guillaume, Sounds comprehensive, thanks for double checking. Would it be possible to do these tests? 1. Compare SELECT 1; This will allow us to remove planner and indexscan overheads from results, though will still include protocol and tcop stuff. 2. Compare SELECT ... WHERE values are constants This will cause the clients to reuse the plan already made, so should effectively remove planner, but not indexscan overheads from the test. 3. Change the test to look at Integers columns only for the WHERE clause, so we can remove any thought it has anything to do with text data, collation etc.. >From those tests we should be able to narrow things down to planner, executor or indexscan related. Thanks, -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 6:44 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > Are you examining only "trivial" queries? I've been able to identify a > couple of new planner hotspots that could explain some slowdown if the > planning time is material compared to the execution time. If you're > seeing a slowdown on queries that run for awhile, that would be > something else ... Yes, I kept only queries with no join and a couple of where conditions. As I explained previously, I can reproduce the behavior with a single index scan on only one table (plan posted previously). If anyone is interested I can post the content of this table (there's nothing confidential in it so I should have the customer permission) and a couple of instructions to reproduce the test case. The case in which I used a few differents queries executes the following ones directly extracted from the application (all are index scans - and they use the exact same index on 8.2 and 8.3): select libvil from vilsitelang where codelang='FRA' and codevil='LYO' select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l, TL.motcleoverture_c, TL.baselinetheme from themelang TL where TL.codeth = 'ASS' and TL.codelang = 'FRA' SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite WHERE codevil = 'LYO' select libvil from vilsitelang where codelang='FRA' and codevil='LYO' So as you can see, queries can't be simpler. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 2:17 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > Another issue is that on read-only transactions there's an extra > gettimeofday() call caused by pgstat_report_tabstats, which could be a > problem on machines with slow gettimeofday(). However that shouldn't > happen if you've got track_counts turned off, so if you don't see any > difference with or without stats then it's not the problem for you. The box is a Core2 duo box so I don't think it's the case. track_counts on/off doesn't change anything to the results. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > I'm not saying my benchmark is perfectly relevant: I made it > excessively simple on purpose. I just see a general slow down which is > quite consistent accross all the tests I did (with pgbench or the > application) and I'd really like to know if it's just my case on this > particular box or something more general. Are you examining only "trivial" queries? I've been able to identify a couple of new planner hotspots that could explain some slowdown if the planning time is material compared to the execution time. If you're seeing a slowdown on queries that run for awhile, that would be something else ... regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 21, 2007 7:22 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > FWIW, the test cases I was just comparing are entirely CPU-bound --- > vmstat says there are no disk reads happening at all. Now I only got a > 3% drop, so that may not be the same effect Guillaume is seeing. But > the whole thing is a bit upsetting seeing that we thought we'd reduced > the overhead for short read-only transactions ... A month or so ago I mentioned to Bruce that we were seeing this behavior (accidentally) but hadn't had time to focus on it or determine whether it was a tuning issue. Basically we're performing the same select-only pgbench test, but with a varying scale from 1 to 1000. In almost all cases, 8.2.5 is faster than 8.3 by about 2-5 percent. The script given to us by a customer was as follows: for scale in 1 2 5 10 20 30 40 50 75 100 200 400 800 1000; do echo "" echo "SCALE $scale" dropdb pgbench createdb pgbench pgbench -p 5432 -i -s $scale pgbench psql pgbench -c 'CHECKPOINT' pgbench -p 5432 -c 8 -t 2500 -S pgbench pgbench -p 5432 -c 8 -t 2500 -S pgbench pgbench -p 5432 -c 8 -t 2500 -S pgbench done -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
I wrote: > The weird thing is that after a couple of hours of poking at it with > oprofile and other sharp objects, I have no idea *why* it's slower. > oprofile shows just about the same relative percentages for all the > hot-spot functions in the backend. However, some comparisons with gprof show that the planner is calling the hot-spot functions more than it used to, which might be enough to account for a couple percent on trivial queries like the ones being issued by pg_bench ("SELECT abalance FROM accounts WHERE aid = :aid;"). After the holiday I'll look into refactoring to try to avoid the extra calls. Another issue is that on read-only transactions there's an extra gettimeofday() call caused by pgstat_report_tabstats, which could be a problem on machines with slow gettimeofday(). However that shouldn't happen if you've got track_counts turned off, so if you don't see any difference with or without stats then it's not the problem for you. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 12:49 AM, Greg Smith <[EMAIL PROTECTED]> wrote: > Hard drives deliver a higher transfer rate at their inner > portions, typically the start of the disk from the operating system's > perspective. It could have been a good point if it was seq scans of large tables. It's only index scans on small tables which return only a couple of rows. They all fit in cache without any problem. For the last test I exposed in a previous email, the table is just 5480 rows and here are the sizes of the table and the index used: cityvox=> select pg_size_pretty(pg_relation_size('vilsitelang')); pg_size_pretty 232 kB (1 row) cityvox=> select pg_size_pretty(pg_relation_size('pk_vilsitelang')); pg_size_pretty 120 kB (1 row) > Not saying this is responsible for your results, just that benchmarking is > hard and there may be somthing other than what you think responsible for a > difference of this size. Sure. That's why I wanted other people advice :). I'm not saying my benchmark is perfectly relevant: I made it excessively simple on purpose. I just see a general slow down which is quite consistent accross all the tests I did (with pgbench or the application) and I'd really like to know if it's just my case on this particular box or something more general. Let's call it a call to share benchmark results for 8.3 :). We're not that far from the release and I didn't see a lot of benchmarks results around. I just wanted to add that I know that there is a lot of other things which may be faster with 8.3. What bothers me is that I don't think the other improvements will help that much this database in particular and this is by far the most critical database we're hosting here. -- Guillaume ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 12:59 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > It's not that autovacuum is now > on by default --- turning it off made no particular difference. Tested that also a few hours ago. No difference. > It's not that stats collection is now on by default --- ditto. Same here. My 8.2 has stats collection enabled in the same way that 8.3 does. -- Guillaume ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
Greg Smith <[EMAIL PROTECTED]> writes: > On Wed, 21 Nov 2007, Guillaume Smet wrote: >> *** 8.2 *** >> tps = 853.360277 (including connections establishing) >> >> *** 8.3 *** >> tps = 784.819087 (including connections establishing) > This is an 8% drop. I've seen a larger difference than that between two > identical installations of the same version when the database is many GB > large. Hard drives deliver a higher transfer rate at their inner > portions, typically the start of the disk from the operating system's > perspective. It's not unusual for the slow parts of the disk to be 30-40% > slower than the fast ones. FWIW, the test cases I was just comparing are entirely CPU-bound --- vmstat says there are no disk reads happening at all. Now I only got a 3% drop, so that may not be the same effect Guillaume is seeing. But the whole thing is a bit upsetting seeing that we thought we'd reduced the overhead for short read-only transactions ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
"Guillaume Smet" <[EMAIL PROTECTED]> writes: > I took a couple of very simple read only queries executed in the pages > to create a pgbench script and I have the following results: Hmm ... I can reproduce a consistent difference of about three percent between 8.2 and HEAD. Using pgbench's built-in SELECT-only transaction after "pgbench -i -s 10 bench", I get HEAD: $ time pgbench -n -S -c 10 -t 10 bench transaction type: SELECT only scaling factor: 10 number of clients: 10 number of transactions per client: 10 number of transactions actually processed: 100/100 tps = 9399.185809 (including connections establishing) tps = 9402.305058 (excluding connections establishing) real1m46.402s user0m19.889s sys 0m23.497s 8.2: $ time pgbench -n -S -c 10 -t 10 bench82 transaction type: SELECT only scaling factor: 10 number of clients: 10 number of transactions per client: 10 number of transactions actually processed: 100/100 tps = 9729.892729 (including connections establishing) tps = 9732.769774 (excluding connections establishing) real1m42.785s user0m19.250s sys 0m23.646s Vanilla build (--enable-debug but not much else), C locale, SQL_ASCII encoding, dual Xeon/EMT on Fedora Core 6. Configuration parameters are all defaults in both cases, except I had fsync off, which shouldn't matter anyway in a read-only test. The weird thing is that after a couple of hours of poking at it with oprofile and other sharp objects, I have no idea *why* it's slower. oprofile shows just about the same relative percentages for all the hot-spot functions in the backend. strace shows that there's no particular increase in kernel calls (indeed, HEAD seems to use significantly fewer semops/selects, indicating that we had some success in reducing contention). It's not that autovacuum is now on by default --- turning it off made no particular difference. It's not that stats collection is now on by default --- ditto. Slowing down the walwriter and bgwriter doesn't help either. It's not pgbench itself --- I get about the same results if I use 8.2 pgbench with HEAD or vice versa. The best theory I can come up with is that all the new stuff added to the backend (the executable is about 12% larger than in 8.2) has resulted in some generalized slowdown just because the code is larger. But most of the added code isn't getting exercised by this test, so in theory the code bloat shouldn't be hurting us either. Weird. Given that it's only a couple percent I'm not gonna panic about it, but I would like to know where the time is going ... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Wed, 21 Nov 2007, Guillaume Smet wrote: *** 8.2 *** tps = 853.360277 (including connections establishing) *** 8.3 *** tps = 784.819087 (including connections establishing) This is an 8% drop. I've seen a larger difference than that between two identical installations of the same version when the database is many GB large. Hard drives deliver a higher transfer rate at their inner portions, typically the start of the disk from the operating system's perspective. It's not unusual for the slow parts of the disk to be 30-40% slower than the fast ones. I've been known to mkfs all the database paritions before each test run just to remove this bias, so that the data was on exactly the same portion of the drive each time. Not saying this is responsible for your results, just that benchmarking is hard and there may be somthing other than what you think responsible for a difference of this size. I'd suggest running "select count(*) from x" on a couple of the big tables as one way to get a feel for whether the underlying disk is delivering at the same speed in both installations. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 22, 2007 12:07 AM, Gregory Williamson <[EMAIL PROTECTED]> wrote: > I've been running some fairly heavy read-only tests (5 minutes duration) > against 8.3beta2 and 8.2.5 and 8.1.10 and are getting slightly faster > numbers for 8.2.5 over 8.1 and 8.3beta2 looks consistently faster by a few > percent. This is heavily oriented to postGIS queries so your mileage may > vary. But so far I haven't seen any red flags or show stoppers from my > (limited) perspective. There are some changes to the config files but I > don't have details at hand. Thanks for your input. That's what I expected and that's why I'm a bit surprised... -- Guillaume ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Nov 21, 2007 10:09 PM, Simon Riggs <[EMAIL PROTECTED]> wrote: > Please do tests of at least 2 minutes duration. A 1.25 second test isn't > enough. I already run far longer tests. It's not a matter of running a test for long, it's just that each unique query is a little bit slower. Moreover it's not a pgbench artifact, I have the same results inside the real application (it's a PHP app). > Please confirm you have VACUUM ANALYZED each db beforehand. Yes of course. The dump was just loaded in both databases. Both databases are identical (the 8.3 db is smaller in size on the disk as expected). They are both analyzed. FYI, I also have the very same database running on a 8.1.x branch (just loaded and analyzed) and the results are more like the 8.2 ones than like the 8.3 ones. > Have you checked that the EXPLAIN ANALYZEs are essentially identical > also? I did the test before. Every plan of every query involved in the test is identical. I removed from the test the one where a different index was chosen (8.2 chooses a larger index and 8.3 chooses the good one - Tom fixed something about that not so long ago and it works fine for us too). > Is the data identical on both systems? Freshly loaded from a dump. > How do the postgresql.conf files differ? They don't differ at all, except for the new parameters introduced in 8.3 (I let them the default). > Also, do a run with SELECT 1 FROM table where col = constant; so we can > assess differences in path without cache or data being relevant. I don't think the cache is relevant as they are all index scans and queries don't return a lot of rows. The indexes fit in RAM and I run each pgbench test several times. And to answer a question Joshua asked on IRC, pgbench is the same in both tests. I use the system wide one (8.1.9 from the RH package). To be sure, here are more information: ** 8.2 ** cityvox=# show shared_buffers; shared_buffers 128MB (1 row) cityvox=# show work_mem; work_mem -- 32MB (1 row) cityvox=# show lc_collate; lc_collate - fr_FR.UTF-8 (1 row) cityvox=# select version(); version PostgreSQL 8.2.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52) (1 row) ** 8.3 ** cityvox=# ANALYZE; ANALYZE cityvox=# show shared_buffers; shared_buffers 128MB (1 row) cityvox=# show work_mem; work_mem -- 32MB (1 row) cityvox=# show lc_collate; lc_collate - fr_FR.UTF-8 (1 row) cityvox=# select version(); version --- PostgreSQL 8.3beta2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52) (1 row) Then I run the test longer (I run it with 1000 transactions before to have the data in cache): ** 8.2 ** [ip-dyn151 postgresql]# pgbench -h localhost -p 5482 -U cityvox -f bench.cityvox.simple.sql -t 10 -c 1 cityvox starting vacuum...end. transaction type: Custom query scaling factor: 1 number of clients: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 853.100511 (including connections establishing) tps = 853.124776 (excluding connections establishing) ** 8.3 ** [ip-dyn151 postgresql]# pgbench -h localhost -p 5483 -U cityvox -f bench.cityvox.simple.sql -t 10 -c 1 cityvox starting vacuum...end. transaction type: Custom query scaling factor: 1 number of clients: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 791.244011 (including connections establishing) tps = 791.268316 (excluding connections establishing) Then let's simplify the test a bit with only one query: [EMAIL PROTECTED] postgresql]# cat bench.cityvox.really.simple.sql select libvil from vilsitelang where codelang='FRA' and codevil='LYO' ** 8.2 ** cityvox=> EXPLAIN ANALYZE select libvil from vilsitelang where codelang='FRA' and codevil='LYO'; QUERY PLAN - Index Scan using pk_vilsitelang on vilsitelang (cost=0.00..4.27 rows=1 width=12) (actual time=0.028..0.031 rows=1 loops=1) Index Cond: (((codevil)::text = 'LYO'::text) AND ((codelang)::text = 'FRA'::text)) Total runtime: 0.071 ms (3 rows) [ip-dyn151 postgresql]# pgbench -h localhost -p 5482 -U cityvox -f bench.cityvox.really.simple.sql -t 10 -c 1 cityvox starting vacuum...end. transaction type: Custom query scaling factor: 1 number of clients: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 3468.220041 (including connections establishing) tps = 3
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
FWIW, > > Please do tests of at least 2 minutes duration. A 1.25 second test isn't > enough. Please confirm you have VACUUM ANALYZED each db beforehand. > > Have you checked that the EXPLAIN ANALYZEs are essentially identical > also? Is the data identical on both systems? > I've been running some fairly heavy read-only tests (5 minutes duration) against 8.3beta2 and 8.2.5 and 8.1.10 and are getting slightly faster numbers for 8.2.5 over 8.1 and 8.3beta2 looks consistently faster by a few percent. This is heavily oriented to postGIS queries so your mileage may vary. But so far I haven't seen any red flags or show stoppers from my (limited) perspective. There are some changes to the config files but I don't have details at hand. Initial tests are always faster; we usually throw them out and run for real numbers starting with 3rd tests to make sure we don't jump at cache issues. For the most part we only care about performance with as much of the database in cache as we can so those initial tests aren;t of much use. (Sorry for the poor posting -- challenged mail client) HTH, Greg Williamson Senior DBA GlobeXplorer LLC, a DigitalGlobe company Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. (My corporate masters made me say this.)
Re: [HACKERS] 8.3devel slower than 8.2 under read-only load
On Wed, 2007-11-21 at 20:04 +0100, Guillaume Smet wrote: > number of clients: 1 > number of transactions per client: 1000 > number of transactions actually processed: 1000/1000 > tps = 784.819087 (including connections establishing) > tps = 786.884214 (excluding connections establishing) > > All the queries are index scans (SELECT a couple of fields FROM a > table WHERE one or two conditions). I checked the plans on both 8.2 > and 8.3 and they are identical. I made several runs and numbers are > consistent. Please do tests of at least 2 minutes duration. A 1.25 second test isn't enough. Please confirm you have VACUUM ANALYZED each db beforehand. Have you checked that the EXPLAIN ANALYZEs are essentially identical also? Is the data identical on both systems? How do the postgresql.conf files differ? Please find out any differences you can, so we can rule things out. Also, do a run with SELECT 1 FROM table where col = constant; so we can assess differences in path without cache or data being relevant. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate