Re: [PERFORM] GIST optimization to limit calls to operator on sub nodes
On 30/06/2014 16:04, Tom Lane wrote:

Pujol Mathieu mathieu.pu...@realfusio.com writes:

On 29/06/2014 22:30, Tom Lane wrote: I don't actually understand what's being requested here that the NotConsistent case doesn't already cover.

The NotConsistent case is correctly covered: the sub-nodes are not tested, because I know that no child could pass the consistency test. The MaybeConsistent case is also correctly covered: all sub-nodes are tested, because I don't know which of them will pass. My problem is with FullyConsistent: when I test a node, I can know that all its child nodes and leaves will pass the test, so I want to notify the GIST framework that it can skip the consistency test on those nodes, just as we can notify it when testing a leaf that it can skip rechecking the row. Maybe I am missing something in the API to do that. In my tests, the recheck flag works only for leaves.

Hm ... that doesn't seem like a case that'd come up often enough to be worth complicating the APIs for, unless maybe you are expecting a lot of exact-duplicate index entries. If you are, you might find that GIN is a better fit for your problem than GIST --- it's designed to be efficient for lots-of-duplicates. Another view of this is that if you can make exact satisfaction checks at upper-page entries, you're probably storing too much information in the index entries (and thereby bloating the index). The typical tradeoff in GIST indexes is something like storing bounding boxes for geometric objects --- which is necessarily lossy, but it results in small indexes that are fast to search. It's particularly important for upper-page entries to be small, so that fanout is high and you have a better chance of keeping all the upper pages in cache. If you've got a compelling example where this actually makes sense, I'd be curious to hear the details.
regards, tom lane

Hi,

I have a table containing several million rows, and each row contains a unique integer as identifier. My goal is to select all rows whose identifier is contained in at least one range of a list.

    CREATE TABLE t (id int4 UNIQUE, ...)
    SELECT * FROM t WHERE id @@ ARRAY[...]::range[]

I use a custom operator to check whether an integer is contained in an array of ranges (a custom structure defined by my plugin). I build my own GIST index to speed up those requests, so each node of my GIST is represented by a range (like a BVH or an octree of 3D boxes). I have no duplicated keys in my index. When I run my tests I am able to quickly discard entire portions of the index, which leads to great performance improvements. During GIST traversal, when I test the consistency of a node (isRangeOverlap(range, range[])), the test returns Fully, Partially or No. So when the test returns Fully, I know for sure that all sub-nodes will also return Fully. In practice my operator is not free in execution time, so if I could propagate that information to the sub-nodes it would save a lot of computation time. This optimization could also be beneficial for the cube extension.

I don't think it would complicate the API; we could use the existing recheck flag. Today this value is used only for leaf nodes. Maybe it could be propagated by the GIST API from a node to its sub-nodes, so that if a node sets it to false, it will be false for its sub-nodes, allowing the client to use it or not. The API could thus remain unchanged for existing plugins, and would need only a small amount of memory to propagate this boolean. I already achieve great performance improvements with my GIST; my request is to optimize the use cases that select several rows, to limit the overhead of my consistency operator.

regards, mathieu pujol

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
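The propagation Mathieu describes can be illustrated with a toy tree search - a hypothetical Python sketch, not the actual GiST C API (Node, consistency and search are made-up names): once a node tests Fully consistent, its whole subtree is emitted without calling the expensive operator again.

```python
# Toy illustration (not the GiST C API): each node covers an integer
# range; leaves carry row ids.  Once a node tests "Fully" consistent,
# the expensive per-node test is skipped for its entire subtree.

class Node:
    def __init__(self, lo, hi, children=None, rows=None):
        self.lo, self.hi = lo, hi
        self.children = children or []
        self.rows = rows or []

calls = 0  # count invocations of the (notionally expensive) operator

def consistency(node, query_ranges):
    """Return 'Fully', 'Partially' or 'No' for node vs. the query ranges."""
    global calls
    calls += 1
    if any(q_lo <= node.lo and node.hi <= q_hi for q_lo, q_hi in query_ranges):
        return 'Fully'
    if any(node.lo <= q_hi and q_lo <= node.hi for q_lo, q_hi in query_ranges):
        return 'Partially'
    return 'No'

def search(node, query_ranges, parent_fully=False):
    # If the parent was fully consistent, skip the operator entirely.
    state = 'Fully' if parent_fully else consistency(node, query_ranges)
    if state == 'No':
        return []
    if state == 'Fully':
        out = list(node.rows)
    else:
        out = [r for r in node.rows
               if any(lo <= r <= hi for lo, hi in query_ranges)]
    for child in node.children:
        out += search(child, query_ranges, parent_fully=(state == 'Fully'))
    return out

leaf_a = Node(0, 10, rows=[1, 5])
leaf_b = Node(11, 20, rows=[15])
root = Node(0, 100, children=[Node(0, 20, children=[leaf_a, leaf_b]),
                              Node(21, 100, rows=[50])])
result = search(root, [(0, 20)])
print(result, "operator calls:", calls)
```

In this toy run the inner node (0, 20) tests Fully, so its two leaves never invoke the operator: 3 calls instead of 5. That saving per subtree is exactly what propagating the recheck flag downward would buy.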
Re: [PERFORM] 60 core performance with 9.3
On 27/06/14 21:19, Andres Freund wrote:

On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar; 'perf top' shows kernel find_busiest_group and kernel _raw_spin_lock as the top time users.

Those don't tell that much by themselves; could you do a hierarchical profile, i.e. perf record -ga? That'll at least give the callers for kernel-level stuff. For more information compile postgres with -fno-omit-frame-pointer.

Unfortunately this did not help - I had lots of unknown symbols from postgres in the profile - I'm guessing the Ubuntu postgresql-9.3 package needs either the -dev package or to be rebuilt with the enable-profile option (debug and no-omit-frame-pointer seem to be there already).

However, further investigation did uncover *very* interesting things. Firstly, I had previously said that read-only performance looked ok... this was wrong, purely based on comparison to Robert's blog post. Rebooting the 60-core box with 32 cores enabled showed that we got *better* scaling performance in the read-only case, illustrating that we were hitting a serious regression with more cores. At this point data is needed:

Test: pgbench
Options: scale 500, read only
OS: Ubuntu 14.04
Pg: 9.3.4
Pg Options:
  max_connections = 200
  shared_buffers = 10GB
  maintenance_work_mem = 1GB
  effective_io_concurrency = 10
  wal_buffers = 32MB
  checkpoint_segments = 192
  checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+-----------------
      6 |   70400          |  71028
     12 |   98918          | 129140
     24 |  230345          | 240631
     48 |  324042          | 409510
     96 |  346929          | 120464
    192 |  312621          |  92663

So we have anti-scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry-picking the last 5 commits into the 9.4 branch, building a package from that and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+--------------------------
      6 |   70189
     12 |  128894
     24 |  233542
     48 |  422754
     96 |  590796
    192 |  630672

Wow - that is more like it!
Andres, that is some nice work - we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src, as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now.

Regards, Mark

[1] from git://git.postgresql.org/git/users/andresfreund/postgres.git, commits:
4b82477dcaf81ad7b0c102f4b66e479a5eb9504a
10d72b97f108b6002210ea97a414076a62302d4e
67ffebe50111743975d54782a3a94b15ac4e755f
fe686ed18fe132021ee5e557c67cc4d7c50a1ada
f2378dc2fa5b73c688f696704976980bab90c611
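As a quick sanity check on the regression, here is plain arithmetic on the posted tps figures (nothing beyond the numbers in the thread): stock 9.3 on 60 cores collapses to under a quarter of its peak throughput at 192 clients, while the rwlock branch is still climbing at that point.

```python
# Arithmetic on the posted pgbench numbers: compare peak tps with
# tps at the highest client count (192) for the two 60-core runs.
tps_60_93 = {6: 71028, 12: 129140, 24: 240631,
             48: 409510, 96: 120464, 192: 92663}
tps_60_rwlock = {6: 70189, 12: 128894, 24: 233542,
                 48: 422754, 96: 590796, 192: 630672}

def peak_and_final(tps):
    """Return (peak tps over all client counts, tps at the max client count)."""
    return max(tps.values()), tps[max(tps)]

peak, final = peak_and_final(tps_60_93)
print(f"9.3 stock:  peak {peak}, at 192 clients {final} ({final / peak:.0%} of peak)")
peak, final = peak_and_final(tps_60_rwlock)
print(f"9.4 rwlock: peak {peak}, at 192 clients {final} ({final / peak:.0%} of peak)")
```

The stock run peaks at 409510 tps (48 clients) and falls to 92663 tps at 192 clients - about 23% of peak - whereas the rwlock run's maximum is its 192-client figure.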
Re: [PERFORM] 60 core performance with 9.3
On 01/07/14 21:48, Mark Kirkwood wrote: [1] from git://git.postgresql.org/git/users/andresfreund/postgres.git, commits: 4b82477dcaf81ad7b0c102f4b66e479a5eb9504a 10d72b97f108b6002210ea97a414076a62302d4e 67ffebe50111743975d54782a3a94b15ac4e755f fe686ed18fe132021ee5e557c67cc4d7c50a1ada f2378dc2fa5b73c688f696704976980bab90c611

Hmmm, that should read "last 5 commits in 'rwlock-contention'", and I had pasted the commit nos from my tree, not Andres's - sorry. Here are the right ones:

472c87400377a7dc418d8b77e47ba08f5c89b1bb
e1e549a8e42b753cc7ac60e914a3939584cb1c56
65c2174469d2e0e7c2894202dc63b8fa6f8d2a7f
959aa6e0084d1264e5b228e5a055d66e5173db7d
a5c3ddaef0ee679cf5e8e10d59e0a1fe9f0f1893
Re: [PERFORM] 60 core performance with 9.3
On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote:

On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar; 'perf top' shows kernel find_busiest_group and kernel _raw_spin_lock as the top time users.

Those don't tell that much by themselves; could you do a hierarchical profile, i.e. perf record -ga? That'll at least give the callers for kernel-level stuff. For more information compile postgres with -fno-omit-frame-pointer.

Unfortunately this did not help - I had lots of unknown symbols from postgres in the profile - I'm guessing the Ubuntu postgresql-9.3 package needs either the -dev package or to be rebuilt with the enable-profile option (debug and no-omit-frame-pointer seem to be there already).

You need to install the -dbg package. My bet is you'll see s_lock high in the profile, called mainly from the procarray and buffer mapping lwlocks.

Test: pgbench
Options: scale 500, read only
OS: Ubuntu 14.04
Pg: 9.3.4
Pg Options: max_connections = 200

Just as an experiment I'd suggest increasing max_connections by one and two and quickly retesting - there are some cacheline alignment issues that aren't fixed yet that happen to vanish with some max_connections settings.

shared_buffers = 10GB
maintenance_work_mem = 1GB
effective_io_concurrency = 10
wal_buffers = 32MB
checkpoint_segments = 192
checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+-----------------
      6 |   70400          |  71028
     12 |   98918          | 129140
     24 |  230345          | 240631
     48 |  324042          | 409510
     96 |  346929          | 120464
    192 |  312621          |  92663

So we have anti-scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry-picking the last 5 commits into the 9.4 branch, building a package from that and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+--------------------------
      6 |   70189
     12 |  128894
     24 |  233542
     48 |  422754
     96 |  590796
    192 |  630672

Wow - that is more like it!
Andres, that is some nice work - we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src, as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now.

Cool. That's what I like (and expect) to see :). I don't think unpatched 9.4 will show significantly different results than 9.3, but it'd be good to validate that. If you do so, could you post the results in the -hackers thread I just CCed you on? That'll help the work to get into 9.5.

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [PERFORM] Volatility - docs vs behaviour?
On Mon, Jun 30, 2014 at 9:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Craig Ringer cr...@2ndquadrant.com writes: I was unaware that the planner made any attempt to catch users' errors in marking the strictness of functions. I thought it pretty much trusted the user not to lie about the mutability of functions invoked indirectly. I'm not really sure where in the inlining code to look to figure that out.

It's in optimizer/util/clauses.c:

    /*
     * Additional validity checks on the expression.  It mustn't return a set,
     * and it mustn't be more volatile than the surrounding function (this is
     * to avoid breaking hacks that involve pretending a function is immutable
     * when it really ain't).  If the surrounding function is declared strict,
     * then the expression must contain only strict constructs and must use
     * all of the function parameters (this is overkill, but an exact analysis
     * is hard).
     */
    if (expression_returns_set(newexpr))
        goto fail;
    if (funcform->provolatile == PROVOLATILE_IMMUTABLE &&
        contain_mutable_functions(newexpr))
        goto fail;
    else if (funcform->provolatile == PROVOLATILE_STABLE &&
             contain_volatile_functions(newexpr))
        goto fail;

As the comment says, this wasn't really coded with an eye towards catching user error. Rather, there are known use-cases where people intentionally use SQL wrapper functions to lie about the mutability of some underlying function; inlining would expose the truth of the matter and thus defeat such hacks. Now I'd be the first to agree that this isn't a terribly high-performance way of doing that, but the point here was to not change the behavior that existed before SQL inlining did.

Some points:

*) There are several cases that look superficially immutable to the user but are really stable. Mostly this comes up with date/time functions because of the database dependency. The issue I have with the status quo is that the server punishes you for mis-decorating the function (it gets treated as volatile).
*) Some formulations of functions like to_char() are immutable depending on their arguments (julian day, for example), so if the user wraps this for purposes of indexing and then uses that same function for querying, the operation is non-inlineable.

*) Unless you really know your stuff, server inlining is completely abstracted from you, except in terms of performance.

Adding up the above, the way things work today is kind of a pain - in many ways I feel that if I mark a function IMMUTABLE, the server should not overrule me. If that argument doesn't hold water, then the server should at least tell you when a function is inlineable - perhaps via \df+.

merlin
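For what it's worth, the decision in the quoted clauses.c snippet can be paraphrased as a small decision table - a hypothetical Python paraphrase for illustration only (inlining_allowed and the constants are made-up names, not PostgreSQL API):

```python
# Paraphrase (for illustration) of the inlining validity check quoted
# from optimizer/util/clauses.c: the wrapper's declared volatility must
# be at least as volatile as what its body actually contains.
IMMUTABLE, STABLE, VOLATILE = 0, 1, 2  # in order of increasing volatility

def inlining_allowed(declared, body_contains_mutable, body_contains_volatile,
                     returns_set=False):
    """Return True if the SQL wrapper body may be inlined.

    body_contains_mutable means the body contains anything stable or
    volatile (mirroring contain_mutable_functions); body_contains_volatile
    mirrors contain_volatile_functions.
    """
    if returns_set:
        return False
    if declared == IMMUTABLE and body_contains_mutable:
        return False  # wrapper claims immutable, body is stable/volatile
    if declared == STABLE and body_contains_volatile:
        return False  # wrapper claims stable, body is volatile
    return True
```

This makes Merlin's complaint concrete: a wrapper declared IMMUTABLE around a merely-stable body hits the first failure branch, so it never inlines and ends up behaving worse than an honestly-declared STABLE wrapper would.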
Re: [PERFORM] Postgres Replaying WAL slowly
On Jun 30, 2014, at 4:57 PM, Jeff Frost j...@pgexperts.com wrote:

On Jun 30, 2014, at 4:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble. Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion?

Unfortunately not, because I went for a poor man's:

    SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock'

run in cron every minute. That said, I'd bet it was mostly ON COMMIT DROP temp tables. The unfortunate thing is I wouldn't know how to correlate that spike with the corresponding slowdown, because the replica is about 5.5 hrs lagged at the moment. Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.

And indeed it did catch up overnight, and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with ON COMMIT DROP.
Re: [PERFORM] Postgres Replaying WAL slowly
Jeff Frost j...@pgexperts.com writes: On Jun 30, 2014, at 4:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion?

And indeed it did catch up overnight, and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with ON COMMIT DROP.

OK, so we have a pretty clear idea of where the problem is now. It seems like there are three, not mutually exclusive, ways we might address this:

1. Local revisions inside StandbyReleaseLocks to make it perform better in the presence of many locks. This would only be likely to improve matters much if there's a fixable O(N^2) algorithmic issue; but there might well be one.

2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on the grounds that no standby should be touching them. I'm not entirely sure that that argument is bulletproof though; in particular, even though a standby couldn't access the table's data, it's possible that it would be interested in seeing consistent catalog entries.

3. Avoid WAL-logging AccessExclusiveLocks associated with new-in-transaction tables, temp or not, on the grounds that no standby could even see such tables until they're committed. We could go a bit further and not take out any locks on a new-in-transaction table in the first place, on the grounds that other transactions on the master can't see 'em either.

It sounded like Andres had taken a preliminary look at #1 and found a possible avenue for improvement, which I'd encourage him to pursue. For both #2 and the conservative version of #3, the main implementation problem would be whether the lock WAL-logging code has cheap access to the necessary information. I suspect it doesn't. The radical version of #3 might be pretty easy to do, at least to the extent of removing locks taken out during CREATE TABLE.
I suspect there are some assertions or other consistency checks that would get unhappy if we manipulate relations without locks, though, so those would have to be taught about the exception. Also, we sometimes forget new-in-transaction status during relcache flush events; it's not clear if that would be a problem for this. I don't plan to work on this myself, but perhaps someone with more motivation will want to run with these ideas.

regards, tom lane
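Tom's option #1 hinges on whether StandbyReleaseLocks has a fixable O(N^2) pattern. A toy Python model (made-up code, not the real implementation) shows the kind of difference at stake: releasing each transaction's locks by scanning a flat list costs O(N) per release and O(N^2) overall, while keying locks by xid in a hash table makes each release a constant-time lookup.

```python
# Toy model (not the real StandbyReleaseLocks): release N transactions'
# locks either by rescanning a flat list each time (O(N^2) total work)
# or by popping a per-xid entry from a hash table (O(N) total).
from collections import defaultdict

def release_from_list(locks, xid):
    """Scan the whole list to drop one transaction's locks."""
    return [(x, rel) for (x, rel) in locks if x != xid]

def release_from_table(table, xid):
    """Drop one transaction's locks by direct per-xid lookup."""
    table.pop(xid, None)

# Build N locks, one per transaction, in both representations.
N = 1000
locks = [(xid, 'rel%d' % xid) for xid in range(N)]
table = defaultdict(list)
for xid, rel in locks:
    table[xid].append(rel)

# Releasing every xid from the list touches ~N^2/2 entries in total;
# the hash table does N constant-time pops for the same effect.
for xid in range(N):
    locks = release_from_list(locks, xid)
    release_from_table(table, xid)
```

If the real code does something like the list variant per released transaction, a spike of tens of thousands of ON COMMIT DROP locks would produce exactly the kind of replay stall reported in this thread.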
Re: [PERFORM] Guidelines on best indexing strategy for varying searches on 20+ columns
Thanks for the answers.

On 30/06/2014 at 20.04, Jeff Janes jeff.ja...@gmail.com wrote:

On Wed, Jun 25, 2014 at 1:48 AM, Niels Kristian Schjødt nielskrist...@autouncle.com wrote: Hi, I'm running a search engine for cars, backed by a PostgreSQL 9.3 installation, and I'm unsure about the best approach/strategy for index optimization for the frontend search.

The problem: the table containing the cars holds around 1.5 million rows. People searching for cars use different criteria: some search by brand/model, some by year, some by mileage, some by price, some by special equipment, etc. - and often they combine a whole bunch of criteria. Of course some, like brand/model and price, are used more frequently than others. In total we offer: 9 category criteria like brand/model or body type, plus 5 numeric criteria like price or mileage, plus 12 boolean criteria like equipment. Lastly, people can order the results by different columns (year, price, mileage and a score we create about the cars). By default we order by our own generated score.

What I've done so far: I have analyzed the usage of the criteria "lightly", and created a few indexes (10). Among those are e.g. indexes on price and mileage, and a combined index on brand/model. Since we are only interested in showing results for cars which are actually for sale, the indexes are made as partial indexes on a sales-state column.

I'd probably partition the data on whether it is for sale, and then search only the for-sale partition.

Hmm okay, I already made all the indexes partial based on the for-sale state. If the queries always query for-sale, and all indexes are partial based on that, will it then still help performance / make sense to partition the tables?

Questions: 1. How would you go about analyzing and determining what columns should be indexed, and how?

I'd start out with intuition about which columns are likely to be used most often, and in a selective way.
And follow up by logging slow queries so they can be dissected at leisure.

2. What is the best strategy when optimizing indexes for searches happening on 20+ columns, where the use and the combinations vary a lot? (To just index everything, to index some of the columns, to do combined indexes, to only do single-column indexes, etc.)

There is no magic index. Based on your description, you are going to be seq-scanning your table a lot. Focus on making it as small as possible: partition it horizontally so that the not-for-sale entries are hived off to an historical table, and vertically so that large columns rarely used in the WHERE clause are in a separate table. (Ideally you would tell PostgreSQL to aggressively TOAST those columns, but there is no knob with which to do that.)

3. I expect that it does not make sense to index all columns?

You mean individually, or jointly? Either way, probably not.

4. I expect it does not make sense to index boolean columns?

In some cases it can, for example if the data distribution is very lopsided and the value on the smaller side is frequently specified.

5. Is it better to do a combined index on 5 frequently used columns rather than having individual indexes on each of them?

How often are the columns specified together? If they are completely independent, it probably makes little sense to index them together.

6. Would it be a good idea to have all indexes sorted by my default sorting?

You don't get to choose. A btree index is sorted by the columns specified in the index, according to the operators specified (or defaulted). Unless you mean that you want to add the default sort column as the lead column in each index; that actually might make sense.

7. Do you have some experiences with other approaches that could greatly improve performance (e.g. forcing indexes to stay in memory etc.)?
If your queries are as unstructured as you imply, I'd forget about indexes for the most part, as you will have a hard time finding ones that work. Concentrate on making seq scans as fast as possible. If most of your queries end in something like ORDER BY price LIMIT 10, then concentrate on index scans over price. You will probably want to include heuristics in your UI such that if people configure queries to download half your database, you disallow that. You will probably find that 90% of the workload comes from people who are just playing around with your website and don't actually intend to do business with you.

Cheers, Jeff
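Jeff's answer to question 4 is really a selectivity calculation. A back-of-the-envelope sketch (the 5% threshold and the row counts are made-up illustration values; only the 1.5M table size comes from the thread):

```python
# Back-of-the-envelope selectivity check (illustrative numbers): an
# index on a boolean column only pays off when the queried value is
# rare, i.e. the index narrows the scan to a small slice of the table.
def index_worthwhile(total_rows, matching_rows, threshold=0.05):
    """Crude heuristic: an index helps if it selects under ~5% of rows."""
    return matching_rows / total_rows < threshold

total = 1_500_000                         # table size from the thread
print(index_worthwhile(total, 750_000))   # 50/50 boolean: seq scan wins
print(index_worthwhile(total, 10_000))    # rare flag: an index can win
```

The same arithmetic explains why the partial indexes on the for-sale state already help: they pre-apply the selective predicate, so each index only ever covers the small slice that queries actually touch.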