Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Wed, Aug 10, 2016 at 10:04 AM, Dilip Kumar wrote:
> This seems better; after checking other places I found that for an
> invalid type we use ERRCODE_UNDEFINED_OBJECT and for invalid
> functions we use ERRCODE_UNDEFINED_FUNCTION. So I have done it the
> same way.
>
> Updated patch attached.

I found some more places where we can get a similar error and updated them in my v3 patch. I don't claim that it fixes all errors of this kind, but it at least fixes the error reported by sqlsmith, plus some additional ones I found in the same area.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

cache_lookup_failure_v3.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python adding support for multi-dimensional arrays
Hi

2016-08-03 13:54 GMT+02:00 Alexey Grishchenko:
> On Wed, Aug 3, 2016 at 12:49 PM, Alexey Grishchenko <agrishche...@pivotal.io> wrote:
>> Hi
>>
>> The current implementation of PL/Python does not allow the use of
>> multi-dimensional arrays, for both input and output parameters. This forces
>> end users to introduce workarounds like casting arrays to text before
>> passing them to the functions and parsing them after, which is an
>> error-prone approach.
>>
>> This patch adds support for multi-dimensional arrays as both input and
>> output parameters for PL/Python functions. The number of dimensions
>> supported is limited by the Postgres MAXDIM macro, by default equal to
>> 6. Both input and output multi-dimensional arrays must have fixed
>> dimension sizes, i.e. 2-d arrays should represent an MxN matrix, 3-d arrays
>> an MxNxK cube, etc.
>>
>> This patch does not support multi-dimensional arrays of composite types,
>> as composite types in Python might be represented as iterators and there is
>> no obvious way to find out where the nested array stops and the composite
>> type structure starts. For example, if we have a composite type of (int, text),
>> we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'], [4,'d'] ] ]", and
>> it is hard to tell that the first two levels are lists while the third
>> one represents the structure. Things get even more complex when you
>> have arrays as members of a composite type. This is why I think this
>> limitation is reasonable.
>>
>> Given the function:
>>
>> CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[] AS $$
>> plpy.info(x, type(x))
>> return x
>> $$ LANGUAGE plpythonu;
>>
>> Before the patch:
>>
>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>> ERROR: cannot convert multidimensional array to Python list
>> DETAIL: PL/Python only supports one-dimensional arrays.
>> CONTEXT: PL/Python function "test_type_conversion_array_int4"
>>
>> After the patch:
>>
>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>> INFO: ([[1, 2, 3], [4, 5, 6]], )
>>  test_type_conversion_array_int4
>> ---------------------------------
>>  {{1,2,3},{4,5,6}}
>> (1 row)
>>
>> --
>> Best regards,
>> Alexey Grishchenko
>
> Also, this patch incorporates the fix for
> https://www.postgresql.org/message-id/CAH38_tkwA5qgLV8zPN1OpPzhtkNKQb30n3xq-2NR9jUfv3qwHA%40mail.gmail.com,
> as they touch the same piece of code - array manipulation in PL/Python.

I am sending a review of this patch:

1. The implemented functionality is clearly a benefit - passing multi-dimensional arrays, and considerably faster passing of bigger arrays
2. I was able to use this patch cleanly without any errors or warnings
3. There are no errors or warnings
4. All tests passed - I tested Python 2.7 and Python 3.5
5. The code is well commented and clean
6. For this new functionality documentation is not necessary
7. I would like to see more regression tests in both directions (Python <-> Postgres) for more than two dimensions

My only objection is that there are not enough regression tests - after fixing this, the patch will be ready for committers. Good work, Alexey.

Thank you

Regards
Pavel

> --
> Best regards,
> Alexey Grishchenko
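For illustration, the "fixed dimension sizes" rule described above can be sketched in plain Python (this is an invented model of the constraint, not the patch's C implementation; the function name and error messages are made up for the example):

```python
def array_dims(value, maxdim=6):
    """Return the dimension sizes of a nested list, enforcing the rules the
    patch describes: every nesting level must be uniformly sized (a 2-d
    array is an MxN matrix), and at most `maxdim` (Postgres MAXDIM)
    levels are allowed."""
    dims = []
    while isinstance(value, list):
        if len(dims) == maxdim:
            raise ValueError("number of dimensions exceeds the maximum (%d)" % maxdim)
        dims.append(len(value))
        first = value[0] if value else None
        for item in value:
            # A level may not mix lists and scalars, and sibling sublists
            # must all have the same length, or the array is "ragged".
            if isinstance(first, list) != isinstance(item, list):
                raise ValueError("mixed list and non-list elements")
            if isinstance(item, list) and len(item) != len(first):
                raise ValueError("dimensions do not match")
        value = first
    return dims
```

Under this rule `[[1, 2, 3], [4, 5, 6]]` is a valid 2x3 array, while `[[1, 2], [3]]` is rejected, which matches the MxN requirement stated in the patch description.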
Re: [HACKERS] Small issues in syncrep.c
On Tue, Aug 9, 2016 at 5:34 PM, Julien Rouhaud wrote:
> Since 14e8803f1, it's not necessary to acquire the SyncRepLock to see
> up-to-date data. But it looks like this commit didn't update all the
> comments around MyProc->syncRepState, which still mention retrieving the
> value with and without the lock. Also, there's an, I think, now unneeded
> test that retrieves syncRepState again.
>
> Patch attached to fix both small issues, present since 9.5.

You could directly check MyProc->syncRepState and remove the local syncRepState variable. Could you add it to the next commitfest? I don't think this will get into 9.6, as this is an optimization.

--
Michael
Re: [HACKERS] Wait events monitoring future development
From: pgsql-hackers-ow...@postgresql.org
> Let's put this in perspective: there are tons of companies that spend thousands
> of dollars per month extra by running un-tuned systems in cloud environments.
> I almost called that "waste", but in reality it should be a simple business
> question: is it worth more to the company to spend resources on reducing
> the AWS bill or on rolling out new features? It's something that can be
> estimated and a rational business decision made.
>
> Where things become completely *irrational* is when a developer reads
> something like "plpgsql blocks with an EXCEPTION handler are more expensive"
> and they freak out and spend a bunch of time trying to avoid them, without
> even the faintest idea of what that overhead actually is. More important,
> they haven't the faintest idea of what that overhead costs the company,
> vs what it costs the company for them to spend an extra hour trying to
> avoid the EXCEPTION (and probably introducing code that's far more
> bug-prone in the process).
>
> So in reality, the only people likely to notice even something as large
> as a 10% hit are those that were already close to maxing out their hardware
> anyway.
>
> The downside to leaving stuff like this off by default is users won't
> remember it's there when they need it. At best, that means they spend more
> time debugging something than they need to. At worst, it means they suffer
> a production outage for longer than they need to, and that can easily exceed
> many months' or years' worth of the extra cost from the monitoring overhead.

I rather like this way of positive thinking. It will be better to think of event monitoring as a positive feature for (daily) proactive improvement, not only as a debugging feature, which carries a negative image. For example, pgAdmin4 could display the 10 most time-consuming events and their solutions. Say the DBA initially places the database and WAL on the same volume. As the system grows and the write workload increases, the DBA can get a suggestion from pgAdmin4 to prepare for the growth by placing WAL on another volume to reduce WALWriteLock wait events. This is not debugging, but proactive monitoring.

> > As another idea, we can stand on middle ground. Interestingly, MySQL
> > also enables its event monitoring (Performance Schema) by default, but
> > not all events are collected. I guess frequently encountered events are
> > not collected by default to minimize the overhead.
>
> That's what we currently do with several track_* and log_*_stats GUCs,
> several of which I forgot even existed until just now. Since there's question
> over the actual overhead, maybe that's a prudent approach for now, but I
> think we should be striving to enable these things ASAP.

Agreed. And as Bruce said, it may be better to be able to disable collection of some events that have a visible impact on performance.

Regards
Takayuki Tsunakawa
Re: [HACKERS] multivariate statistics (v19)
On Wed, Aug 3, 2016 at 10:58 AM, Tomas Vondra wrote:
> 1) enriching the query tree with multivariate statistics info
>
> Right now all the stuff related to multivariate statistics estimation
> happens in clausesel.c - matching conditions to statistics, selection of
> statistics to use (if there are multiple usable stats), etc. So pretty much
> all this info is internal to clausesel.c and does not get outside.

This does not seem bad to me at first sight, but...

> I'm starting to think that some of the steps (matching quals to stats,
> selection of stats) should happen in a "preprocess" step before the actual
> estimation, storing the information (which stats to use, etc.) in a new type
> of node in the query tree - something like RestrictInfo.
>
> I believe this needs to happen sometime after deconstruct_jointree(), as
> that builds the RestrictInfo nodes, and looking at planmain.c, right after
> extract_restriction_or_clauses seems about right. I haven't tried, though.
>
> This would move all the "statistics selection" logic out of clausesel.c,
> separating it from the "actual estimation" and simplifying the code.
>
> But more importantly, I think we'll need to show some of the data in EXPLAIN
> output. With per-column statistics it's fairly straightforward to determine
> which statistics are used and how. But with multivariate stats things are
> often more complicated - there may be multiple candidate statistics (e.g.
> histograms covering different subsets of the conditions), it's possible to
> apply them in different orders, etc.
>
> But EXPLAIN can't show the info if it's ephemeral and available only within
> clausesel.c (and thrown away after the estimation).

This gives a good reason not to do that in clausesel.c; it would be really cool to be able to get some information regarding the stats used with a simple EXPLAIN.

> 2) combining multiple statistics
>
> I think the ability to combine multivariate statistics (covering different
> subsets of conditions) is important and useful, but I'm starting to think
> that the current implementation may not be the correct one (which is why I
> haven't written the SGML docs about this part of the patch series yet).
>
> Assume there's a table "t" with 3 columns (a, b, c), and that we're
> estimating the query:
>
>    SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3
>
> but we only have two statistics, (a,b) and (b,c). The current patch does
> about this:
>
>    P(a=1,b=2,c=3) = P(a=1,b=2) * P(c=3|b=2)
>
> i.e. it estimates the first two conditions using (a,b), and then estimates
> (c=3) using (b,c) with "b=2" as a condition. Now, this is very efficient,
> but it only works as long as the query contains conditions "connecting" the
> two statistics. So if we remove the "b=2" condition from the query, this
> stops working.

This is trying to make the algorithm smarter than the user, which is something I'd think we could live without. In this case statistics on (a,c) or (a,b,c) are missing. And what if the user does not want to make use of stats for (a,c) because he only defined (a,b) and (b,c)?

Patch 0001: there have been comments about that before, and you have put the checks on RestrictInfo in a couple of variables of pull_varnos_walker, so nothing to say from here.

Patch 0002:

+   CREATE STATISTICS will create a new multivariate
+   statistics on the table. The statistics will be created in the in the
+   current database. The statistics will be owned by the user issuing
+   the command.

s/in the in the/in the/.

+   Create table t1 with two functionally dependent columns, i.e.
+   knowledge of a value in the first column is sufficient for detemining the
+   value in the other column. Then functional dependencies are built on those
+   columns:

s/detemining/determining/

+   If a schema name is given (for example, CREATE STATISTICS
+   myschema.mystat ...) then the statistics is created in the specified
+   schema. Otherwise it is created in the current schema. The name of
+   the table must be distinct from the name of any other statistics in the
+   same schema.

I would just assume that a statistics object is located in the schema of the relation it depends on. So it may be better to do just this:
- Register the OID of the table a statistics depends on, but not the schema.
- Give up on the query extensions related to the schema.
- Allow the same statistics name to be used for multiple tables.
- Just fail if a statistics name is being reused on the same table again. It may be better to complain about that even if the column list is different.
- Register the dependency between the statistics and the table.

+ALTER STATISTICS name OWNER TO { new_owner | CURRENT_USER | SESSION_USER }

On the same line, is OWNER TO really necessary? I could have assumed that if a user is able to query the set of columns related to a statistics, he should have access to it.

=# create statistics aa_a_b3 on aam (a, b) with (dependencies);
ERROR:
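As a side note for reviewers, the chaining rule quoted above, P(a=1,b=2,c=3) = P(a=1,b=2) * P(c=3|b=2), can be modeled in a few lines of Python over toy data (the data and names here are invented; this only illustrates the arithmetic of combining two two-column statistics, not the patch's histogram machinery):

```python
from collections import Counter
from fractions import Fraction

# Toy table t(a, b, c); b functionally determines c in this sample,
# so the chained estimate happens to be exact here.
rows = [(1, 2, 3)] * 4 + [(0, 2, 3)] * 2 + [(1, 5, 7)] * 2 + [(0, 5, 7)] * 2
n = len(rows)

# Two "statistics", each covering a pair of columns, like (a,b) and (b,c).
ab = Counter((a, b) for a, b, c in rows)
bc = Counter((b, c) for a, b, c in rows)
b_only = Counter(b for a, b, c in rows)

def estimate(a, b, c):
    # P(a,b,c) ~= P(a,b) * P(c|b): estimate (a,b) from the first
    # statistic, then c conditioned on b from the second.
    p_ab = Fraction(ab[(a, b)], n)
    p_c_given_b = Fraction(bc[(b, c)], b_only[b]) if b_only[b] else Fraction(0)
    return p_ab * p_c_given_b

est = estimate(1, 2, 3)
true_p = Fraction(sum(1 for r in rows if r == (1, 2, 3)), n)
```

Dropping the b=2 condition from the query leaves nothing to "connect" the two statistics, which is exactly the limitation Tomas describes.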
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Tue, Aug 9, 2016 at 6:49 PM, Amit Kapila wrote:
> Okay, then how about ERRCODE_UNDEFINED_OBJECT? It seems to be used in
> almost similar cases.

This seems better; after checking other places I found that for an invalid type we use ERRCODE_UNDEFINED_OBJECT and for invalid functions we use ERRCODE_UNDEFINED_FUNCTION. So I have done it the same way.

Updated patch attached.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

cache_lookup_failure_v2.patch
Description: Binary data
Re: [HACKERS] dsm_unpin_segment
On Wed, Aug 10, 2016 at 2:43 PM, Jim Nasby wrote:
> On 8/9/16 6:14 PM, Thomas Munro wrote:
>> They can't be static, they need to be in shared memory, because we also
>> want to protect against two *different* backends pinning it.
>
> Right, this would strictly protect from it happening within a single
> backend. Perhaps it's pointless for pin/unpin, but it seems like it would
> be a good thing to have for attach/detach.

Double attach already gets you:

    elog(ERROR, "can't attach the same segment more than once");

Double detach is conceptually like double free, and in fact actually leads to a double pfree of the backend-local dsm_segment object. It doesn't seem like the kind of thing it's easy or reasonable to try to defend against, since you have no space left in which to store the state you need to detect double-frees after you've done it once!

--
Thomas Munro
http://www.enterprisedb.com
Re: [HACKERS] Heap WARM Tuples - Design Draft
On Tue, Aug 9, 2016 at 12:06 AM, Claudio Freire wrote:
> On Mon, Aug 8, 2016 at 2:52 PM, Pavan Deolasee wrote:
>>
>> Some heuristics and limits on the amount of work done to detect duplicate
>> index entries will help too.
>
> I think I prefer a more thorough approach.
>
> Increment/decrement may get very complicated with custom opclasses,
> for instance. A column-change bitmap won't know how to handle
> functional indexes, etc.
>
> What I intend to try is to modify btree to allow efficient search of a
> key-ctid pair, by adding the ctid to the sort order.

Yes, that's a good medium-term plan. And this is kind of independent of the core idea. So I'll go ahead and write a patch that implements the core idea and some basic optimizations. We can then try different approaches such as tracking changed columns, tracking increment/decrement, and teaching btree to skip duplicate (key, CTID) entries.

Thanks,
Pavan

--
Pavan Deolasee
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] dsm_unpin_segment
On 8/9/16 6:14 PM, Thomas Munro wrote:
>> Could a couple of static variables be used to ensure multiple pin/unpin
>> and attach/detach calls throw an assert() (or become a no-op if asserts
>> are disabled)? It would be nice if we could protect users from this.
>
> They can't be static, they need to be in shared memory, because we also
> want to protect against two *different* backends pinning it.

Right, this would strictly protect from it happening within a single backend. Perhaps it's pointless for pin/unpin, but it seems like it would be a good thing to have for attach/detach.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)  mobile: 512-569-9461
Re: [HACKERS] Heap WARM Tuples - Design Draft
On 8/9/16 6:44 PM, Claudio Freire wrote:
> Since we can look up all occurrences of k1=a index=0 and k2=a index=0,
> and in fact we probably did so already as part of the update logic

That's a change from what currently happens, right? The reason I think that's important is that dropping the assumption that we can't safely re-find index entries from the heap opens up other optimizations, ones that should be significantly simpler to implement. The most obvious example is getting rid of full index scans in vacuum. While that won't help with write amplification, it would reduce the cost of vacuum enormously. Orders of magnitude wouldn't surprise me in the least.

If that's indeed a prerequisite to WARM, it would be great to get that groundwork laid early so others could work on the other optimizations it would enable.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Heap WARM Tuples - Design Draft
On 8/9/16 6:46 PM, Claudio Freire wrote:
> On Tue, Aug 9, 2016 at 8:44 PM, Claudio Freire wrote:
>> index 0 1 2 3 4
>> k1 .. a a a
>> k2 .. b a a
>> i1 ^
>> i2 ^ ^
>> lp u u r3
>> hot *
>
> Sorry, that should read:
>
> index 0 1 2 3 4
> k1 .. a a a
> k2 .. b a a
> i1 ^
> i2 ^
> lp u u r3
> hot *

FWIW, your ASCII art is unfortunately getting bungled somewhere, I think :/.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] handling unconvertible error messages
Hello,

(I've recovered the lost Cc recipients so far)

At Mon, 8 Aug 2016 12:52:11 +0300, Victor Wagner wrote in <20160808125211.1361c...@fafnir.local.vm>:
> On Mon, 08 Aug 2016 18:28:57 +0900 (Tokyo Standard Time)
> Kyotaro HORIGUCHI wrote:
>>
>> I don't see charset compatibility to be easily detectable,
>
> In the worst case we can hardcode an explicit compatibility table.

We could have lists of languages compatible with certain language-bound encodings, for example LATIN1 (ISO/IEC 8859-1), according to Wikipedia (https://en.wikipedia.org/wiki/ISO/IEC_8859-1). Based on that list, we might have the following compatibility list of locales, maybe without region:

{{"UTF8", "LATIN1"}, "af", "sq", "eu", "da", "en", "fo", "en"}... and so on.

The biggest problem with this is that at least *I* cannot confirm the validity of the list - both the completeness of LATIN1's coverage of every language on the list, and the omission of any language that could be covered. Nonetheless, we could use such lists if we accept the possible imperfection, which would eventually result in either the original error (conversion failure) or an excess fallback for possibly convertible languages - but the latter would be unacceptable for table data.

> There is a limited set of languages which have translated error messages,
> and a limited (albeit wide) set of encodings supported by PostgreSQL. So

Yes, we could have a negative list of combinations already known to be incompatible:

{{"UTF8", "LATIN1"}, "ru", .. er..what else?}

ISO 639-1 has about 190 languages, and most of them are apparently incompatible with the LATIN1 encoding. It doesn't seem good to me to have a haphazardly made negative list.

> it is possible to define a complete list of encodings compatible with
> each translation, and fall back to untranslated messages if the client
> encoding is not in this list.
>
>> because locale (or character set) is not a matter of PostgreSQL
>> (except for some encodings bound to one particular character
>> set)... So the conversion-fallback might be the only available
>> solution.
>
> Conversion fallback may be a solution for data. For NLS messages I think
> it is better to fall back to English (untranslated) messages than to use
> transliteration or something alike.

I suppose that 'fallback' means "have a try, then use English if that fails", so I think it is suitable for messages rather than for data, and it doesn't need any a priori information about compatibility. It seems to me that PostgreSQL refuses to ignore or conceal conversion errors and return broken or unwanted byte sequences for data. Things are different for error messages: it is preferable for them to be somehow readable rather than totally abandoned.

> I think that for now we can assume that the best effort is already done
> for the data, and think how to improve the situation with messages.

Is there any source from which to learn the compatibility of any combination of language vs. encoding? We would need grounds for such a list.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
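To make the proposed message fallback concrete, here is a rough Python model of the idea (the function and data are hypothetical; the server-side implementation would of course use Postgres encoding conversion routines, not Python codecs): try to convert the translated message to the client encoding, and fall back to the untranslated English message if the conversion fails.

```python
def localized_or_english(messages, lang, client_encoding):
    """Return the message for `lang` if it survives conversion to the
    client encoding; otherwise fall back to the English original rather
    than sending broken bytes. `messages` maps language code -> text."""
    translated = messages.get(lang, messages["en"])
    try:
        translated.encode(client_encoding)  # "have a try"
        return translated
    except (UnicodeEncodeError, LookupError):
        return messages["en"]               # "use English if failed"

msgs = {"en": "relation does not exist",
        "ru": "отношение не существует"}
```

Note that this needs no a priori language/encoding compatibility table at all: the conversion attempt itself is the compatibility test, which is the attraction of the fallback approach for messages.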
Re: [HACKERS] Heap WARM Tuples - Design Draft
On Tue, Aug 9, 2016 at 06:19:57PM -0500, Jim Nasby wrote:
> On 8/8/16 3:19 PM, Bruce Momjian wrote:
>>> What will help, and something I haven't yet applied any thought to, is
>>> when we can turn WARM chains back to HOT by removing stale index entries.
>>
>> I can't see how we can ever do that, because we have multiple indexes
>> pointing to the chain, and keys that might be duplicated if we switched
>> to HOT. Seems only VACUUM can fix that.
>
> Are these changes still predicated on being able to re-find all index
> entries by key value? If so, that makes incremental vacuums practical,
> perhaps eliminating a lot of these issues.

No, the increment/decrement approach doesn't require btree lookups.

>>>>> We can't use the bits LP_REDIRECT lp_len because we need to create WARM
>>>>> chains before pruning, and I don't think walking the pre-pruned chain is
>>>>> worth it. (As I understand HOT, LP_REDIRECT is only created during
>>>>> pruning.)
>>>>
>>>> That's correct. But lp_len provides us some place to stash information
>>>> from heap tuples when they are pruned.
>>>
>>> Right. However, I see storing information at prune time as only useful
>>> if you are willing to scan the chain, and frankly, I have given up on
>>> chain scanning (with column comparisons) as being too expensive for
>>> its limited value.
>
> What if some of this work happened asynchronously? I'm thinking something
> that runs through shared_buffers in front of bgwriter.

Well, we prune when we need the space --- that is the big value of page pruning: it is fast and can be done when you need it. VACUUM is all about background cleanup.

--
Bruce Momjian   http://momjian.us
EnterpriseDB    http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+              Ancient Roman grave inscription        +
Re: [HACKERS] Notice lock waits
On 8/5/16 12:00 PM, Jeff Janes wrote:
> So I created a new GUC, notice_lock_waits, which acts like log_lock_waits
> but sends the message as NOTICE so it will show up on interactive
> connections like psql.

I would strongly prefer that this accept a log level instead of being hard-coded to NOTICE. The reason is that I find the NOTICE chatter from many DDL commands to be completely worthless (looking at you, %TYPE), so I normally set client_min_messages to WARNING in DDL scripts.

I can work on that patch; would it essentially be a matter of changing the boolean notice_lock_waits to an int lock_wait_level?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Heap WARM Tuples - Design Draft
On Tue, Aug 9, 2016 at 8:44 PM, Claudio Freire wrote:
> index 0 1 2 3 4
> k1 .. a a a
> k2 .. b a a
> i1 ^
> i2 ^ ^
> lp u u r3
> hot *

Sorry, that should read:

index 0 1 2 3 4
k1 .. a a a
k2 .. b a a
i1 ^
i2 ^
lp u u r3
hot *
Re: [HACKERS] Heap WARM Tuples - Design Draft
On Tue, Aug 9, 2016 at 8:19 PM, Jim Nasby wrote:
> On 8/8/16 3:19 PM, Bruce Momjian wrote:
>>> What will help, and something I haven't yet applied any thought to, is
>>> when we can turn WARM chains back to HOT by removing stale index entries.
>>
>> I can't see how we can ever do that, because we have multiple indexes
>> pointing to the chain, and keys that might be duplicated if we switched
>> to HOT. Seems only VACUUM can fix that.
>
> Are these changes still predicated on being able to re-find all index
> entries by key value? If so, that makes incremental vacuums practical,
> perhaps eliminating a lot of these issues.
>
>>>>> We can't use the bits LP_REDIRECT lp_len because we need to create WARM
>>>>> chains before pruning, and I don't think walking the pre-pruned chain is
>>>>> worth it. (As I understand HOT, LP_REDIRECT is only created during
>>>>> pruning.)
>>>>
>>>> That's correct. But lp_len provides us some place to stash information
>>>> from heap tuples when they are pruned.
>>>
>>> Right. However, I see storing information at prune time as only useful
>>> if you are willing to scan the chain, and frankly, I have given up on
>>> chain scanning (with column comparisons) as being too expensive for
>>> its limited value.
>
> What if some of this work happened asynchronously? I'm thinking something
> that runs through shared_buffers in front of bgwriter.

If one can find key-ctid pairs efficiently in the index, this can be done during WARM pruning (i.e. during updates, when we're already doing the index lookups anyway). Suppose you have the following chain:

index 0 1 2 3 4
k1 a a a a a
k2 a a b a a
i1 ^
i2 ^ ^
hot * *

If versions 0-2 die, pruning can free 1 (it's HOT), and leave redirects in 0 and 2:

index 0 1 2 3 4
k1 a. a a a
k2 a. b a a
i1 ^
i2 ^ ^
lp r2 u r3
hot *

Since we can look up all occurrences of k1=a index=0 and k2=a index=0, and in fact we probably did so already as part of the update logic, we can remove the first redirect by pointing the indexes to 2:

index 0 1 2 3 4
k1 .. a a a
k2 .. b a a
i1 ^
i2 ^ ^
lp u u r3
hot *

So WARM pruning would have to happen just prior to adding a WARM tuple, to be able to reuse all the index lookup work to do the index pruning along with the writing. The indexam interface will get considerably more complex in order to do this, however. So perhaps it would be OK to do 2 independent lookups instead? I'm undecided yet. But one way or another, pruning can happen early and incrementally.
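To illustrate the bookkeeping described above, here is a toy Python model of the redirect handling (purely illustrative: the slot layout, field names, and prune/retarget helpers are invented and do not correspond to the on-disk heap page format):

```python
# Each line-pointer slot holds a tuple dict, a ("redirect", n) marker,
# or None (reclaimable). Slot 0 is the dead chain root that index
# entries still point at; slot 1 is a dead HOT member; slot 2 is live.
slots = [
    {"live": False, "hot": False, "next": 1},
    {"live": False, "hot": True,  "next": 2},
    {"live": True,  "hot": True,  "next": None},
]
index_entries = {("k1", "a"): 0, ("k2", "a"): 0}

def prune(slots):
    # A dead HOT member can be freed outright; a dead root that indexes
    # may still reference becomes a redirect to its first live successor.
    for i, s in enumerate(list(slots)):
        if isinstance(s, dict) and not s["live"]:
            target = s["next"]
            while isinstance(slots[target], dict) and not slots[target]["live"]:
                target = slots[target]["next"]
            slots[i] = None if s["hot"] else ("redirect", target)

def retarget(index_entries, slots):
    # With an efficient key->ctid lookup, index entries can be pointed
    # past redirects, letting the redirect slot be reclaimed later.
    for key, slot in list(index_entries.items()):
        while isinstance(slots[slot], tuple):
            slot = slots[slot][1]
        index_entries[key] = slot

prune(slots)
retarget(index_entries, slots)
```

After pruning, slot 1 is freed and slot 0 carries a redirect to slot 2; after retargeting, both index entries point directly at the live tuple, which is the state the last table in the message depicts.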
Re: [HACKERS] Wait events monitoring future development
On 8/8/16 11:07 PM, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> If you want to know why people are against enabling this monitoring by
>> default, above is the reason. What percentage of people do you think would
>> be willing to take a 10% performance penalty for monitoring like this? I
>> would bet very few, but the argument above doesn't seem to address the
>> fact that it is a small percentage.
>>
>> In fact, the argument above goes even farther, saying that we should
>> enable it all the time because people will be unwilling to enable it on
>> their own. I have to question the value of the information if users are
>> not willing to enable it. And the solution proposed is to force the 10%
>> default overhead on everyone, whether they are currently doing debugging,
>> whether they will ever do this level of debugging, because people will be
>> too scared to enable it. (Yes, I think Oracle took this approach.)

Let's put this in perspective: there are tons of companies that spend thousands of dollars per month extra by running un-tuned systems in cloud environments. I almost called that "waste", but in reality it should be a simple business question: is it worth more to the company to spend resources on reducing the AWS bill or on rolling out new features? It's something that can be estimated and a rational business decision made.

Where things become completely *irrational* is when a developer reads something like "plpgsql blocks with an EXCEPTION handler are more expensive" and they freak out and spend a bunch of time trying to avoid them, without even the faintest idea of what that overhead actually is. More important, they haven't the faintest idea of what that overhead costs the company, vs what it costs the company for them to spend an extra hour trying to avoid the EXCEPTION (and probably introducing code that's far more bug-prone in the process).

So in reality, the only people likely to notice even something as large as a 10% hit are those that were already close to maxing out their hardware anyway.

The downside to leaving stuff like this off by default is users won't remember it's there when they need it. At best, that means they spend more time debugging something than they need to. At worst, it means they suffer a production outage for longer than they need to, and that can easily exceed many months' or years' worth of the extra cost from the monitoring overhead.

>> We can talk about this feature all we want, but if we are not willing to
>> be realistic in how much performance penalty the _average_ user is willing
>> to lose to have this monitoring, I fear we will make little progress on
>> this feature.
>
> OK, 10% was an overstatement. Anyway, as Amit said, we can discuss the
> default value based on the performance evaluation before release.
>
> As another idea, we can stand on middle ground. Interestingly, MySQL
> also enables its event monitoring (Performance Schema) by default, but
> not all events are collected. I guess frequently encountered events are
> not collected by default to minimize the overhead.

That's what we currently do with several track_* and log_*_stats GUCs, several of which I forgot even existed until just now. Since there's question over the actual overhead, maybe that's a prudent approach for now, but I think we should be striving to enable these things ASAP.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] dsm_unpin_segment
On Wed, Aug 10, 2016 at 10:38 AM, Jim Nasby wrote: > On 8/9/16 1:01 PM, Robert Haas wrote: >> >> However, I don't see the need for a full-blown request >> counter here; we've had this API for several releases now and to my >> knowledge nobody has complained about the fact that you aren't >> supposed to call dsm_pin_segment() multiple times for the same >> segment. > > > Could a couple of static variables be used to ensure multiple pin/unpin and > attach/detach calls throw an assert() (or become a no-op if asserts are > disabled)? It would be nice if we could protect users from this. They can't be static; they need to be in shared memory, because we also want to protect against two *different* backends pinning it. The v2 patch protects users from this, by adding a 'pinned' flag to the per-segment control object that is visible everywhere, and then doing: + if (dsm_control->item[seg->control_slot].pinned) + elog(ERROR, "cannot pin a segment that is already pinned"); ... and: + if (!dsm_control->item[i].pinned) + elog(ERROR, "cannot unpin a segment that is not pinned"); Those checks could arguably be assertions rather than errors, but I don't think that pin/unpin operations will ever be high frequency enough for it to be worth avoiding those instructions in production builds. The whole reason for pinning segments is likely to be able to reuse them repeatedly, so nailing it down is likely something you'd want to do immediately after creating it. -- Thomas Munro http://www.enterprisedb.com
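The flag-vs-counter semantics being debated here are easy to model outside the backend. Below is a minimal Python sketch (hypothetical — the real checks are the elog() calls in dsm.c quoted above, operating on the shared control segment, not backend-local state):

```python
class DsmSegment:
    """Toy model of the one-bit pin state the v2 patch adds per control slot."""

    def __init__(self):
        self.pinned = False

    def pin(self):
        # Mirrors: elog(ERROR, "cannot pin a segment that is already pinned")
        if self.pinned:
            raise RuntimeError("cannot pin a segment that is already pinned")
        self.pinned = True

    def unpin(self):
        # Mirrors: elog(ERROR, "cannot unpin a segment that is not pinned")
        if not self.pinned:
            raise RuntimeError("cannot unpin a segment that is not pinned")
        self.pinned = False
```

Because the flag lives in the per-segment control object rather than in per-backend static variables, a second backend double-pinning the same segment is caught just as a single backend would be.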
Re: [HACKERS] Heap WARM Tuples - Design Draft
On 8/8/16 3:19 PM, Bruce Momjian wrote: What will help, and something to which I haven't yet applied any thought, is when we > can turn WARM chains back to HOT by removing stale index entries. I can't see how we can ever do that because we have multiple indexes pointing to the chain, and keys that might be duplicated if we switched to HOT. Seems only VACUUM can fix that. Are these changes still predicated on being able to re-find all index entries by key value? If so, that makes incremental vacuums practical, perhaps eliminating a lot of these issues. > > We can't use the bits LP_REDIRECT lp_len because we need to create WARM > > chains before pruning, and I don't think walking the pre-pruned chain is > > worth it. (As I understand HOT, LP_REDIRECT is only created during > > pruning.) > > That's correct. But lp_len provides us some place to stash information from > heap tuples when they are pruned. Right. However, I see storing information at prune time as only useful if you are willing to scan the chain, and frankly, I have given up on chain scanning (with column comparisons) as being too expensive for its limited value. What if some of this work happened asynchronously? I'm thinking something that runs through shared_buffers in front of bgwriter. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] dsm_unpin_segment
On 8/9/16 1:01 PM, Robert Haas wrote: However, I don't see the need for a full-blown request counter here; we've had this API for several releases now and to my knowledge nobody has complained about the fact that you aren't supposed to call dsm_pin_segment() multiple times for the same segment. Could a couple of static variables be used to ensure multiple pin/unpin and attach/detach calls throw an assert() (or become a no-op if asserts are disabled)? It would be nice if we could protect users from this. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Slowness of extended protocol
Robert Haas: > but for some reason you can't use prepared statements, for example because > the queries are dynamically generated and . That case is analogous to -M > extended, not -M prepared. And -M extended is well-known to be SLOWER > I do not buy the "dynamically generated queries defeat server-prepared usage" argument. It is just not true (see below). Do you mean "in language X, where X != Java, it is impossible to implement a query cache"? That is just ridiculous. At the end of the day, there will be a finite number of hot queries that are important. Here's the relevant pgjdbc commit: https://github.com/pgjdbc/pgjdbc/pull/319 It works completely transparently to the application, and it does use server-prepared statements even though the application builds "brand new" SQL text every time. It is not something theoretical; it is something that is already implemented and battle-tested. The application does build SQL texts based on the names of tables and columns that are shown in the browser, and pgjdbc uses a query cache (to map user SQL to a backend statement name), thus it benefits from server-prepared statements automatically. Not a single line of change was required at the application side. Am I missing something? I cannot imagine a real-life case where an application throws 10'000+ UNIQUE SQL texts per second at the database. Cases like "where id=1", "where id=2", "where id=3" do not count, as they should be written with bind variables, thus they represent a single SQL text like "where id=$1". Robert>you have to keep sending a different query text every time Do you agree that the major part would be some hot queries, while the rest would be much less frequently used ones (e.g. one-time queries)? In OLTP applications the number of queries is high, and almost all the queries are reused. Server-prepared statements to the rescue here. The "protocol optimization" would not be noticeable. 
In DWH applications the queries might be unique; however, the number of queries is much smaller, so the "protocol optimization" would be invisible, as the query plan/execution time would be much higher than the gain from the "protocol optimization". Vladimir
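The text-keyed cache pgjdbc relies on can be sketched in a few lines. This is a hypothetical Python model, not pgjdbc's actual implementation; `prepare` and `deallocate` stand in for whatever driver primitives actually issue Parse and Close for a named statement:

```python
from collections import OrderedDict

class StatementCache:
    """Map SQL text -> server-side prepared statement name, LRU-evicted.
    The application keeps sending plain SQL strings; reuse is detected
    purely by text equality, so 'dynamically generated' but recurring
    queries still become server-prepared."""

    def __init__(self, prepare, deallocate, capacity=256):
        self.prepare = prepare          # e.g. sends Parse for a named statement
        self.deallocate = deallocate    # e.g. sends Close for an evicted statement
        self.capacity = capacity
        self.cache = OrderedDict()      # sql text -> statement name
        self.counter = 0

    def statement_for(self, sql):
        name = self.cache.get(sql)
        if name is not None:
            self.cache.move_to_end(sql)  # mark as recently used
            return name
        if len(self.cache) >= self.capacity:
            _, old = self.cache.popitem(last=False)  # evict least recently used
            self.deallocate(old)
        self.counter += 1
        name = "S_%d" % self.counter
        self.prepare(name, sql)
        self.cache[sql] = name
        return name
```

Because the key is the full SQL text, an application that rebuilds the "same" SQL string on every request still hits the cache, which is exactly the transparency claimed above; only genuinely unique texts (the one-time queries) pay the parse cost.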
Re: [HACKERS] per-statement-level INSTEAD OF triggers
Robert Haas writes: > On Mon, Aug 8, 2016 at 4:40 AM, Yugo Nagata wrote: >> I'm asking out of curiosity: does anyone know why we don't have >> per-statement-level INSTEAD OF triggers? I looked into the >> standard SQL (20xx draft), but I can't find a restriction >> such that INSTEAD OF triggers must be row-level. Are there >> any technical difficulties, or other reasons for the current >> implementation? > I think one problem is that the trigger wouldn't have any way of > knowing the specifics of what the user was trying to do. It would > just know the type of operation (INSERT, UPDATE, or DELETE). I guess > that could be useful to someone, but not all that useful. It might be more useful after we get the infrastructure that Kevin's been working on to allow collecting all the updates into a tuplestore that could be passed to a statement-level trigger. Right now I tend to agree that there's little point. regards, tom lane
Re: [HACKERS] per-statement-level INSTEAD OF triggers
On Mon, Aug 8, 2016 at 4:40 AM, Yugo Nagata wrote: > I'm asking out of curiosity: does anyone know why we don't have > per-statement-level INSTEAD OF triggers? I looked into the > standard SQL (20xx draft), but I can't find a restriction > such that INSTEAD OF triggers must be row-level. Are there > any technical difficulties, or other reasons for the current > implementation? I think one problem is that the trigger wouldn't have any way of knowing the specifics of what the user was trying to do. It would just know the type of operation (INSERT, UPDATE, or DELETE). I guess that could be useful to someone, but not all that useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Logical Replication WIP
Petr Jelinek wrote: > On 09/08/16 10:13, Craig Ringer wrote: > >The only argument I can see for not using bgworkers is for the > >supervisor worker. It's a singleton that launches the per-database > >workers, and arguably is a job that the postmaster could do better. The > >current design there stems from its origins as an extension. Maybe > >worker management could be simplified a bit as a result. I'd really > >rather not invent yet another new and mostly duplicate category of > >custom workers to achieve that though. > > It is simplified compared to pglogical (there are only 2 worker types, not 3). > I don't think it's the job of the postmaster to scan catalogs, however, so it can't > really start workers for logical replication. I actually modeled it more > after autovacuum (using bgworkers though) than the original extension. Yeah, it's a very bad idea to put the postmaster on this task. We should definitely stay away from that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Logical Replication WIP
Petr Jelinek wrote: > On 09/08/16 12:16, Craig Ringer wrote: > >Right. I think that's probably the direction we should be going > >eventually. Personally I don't think such a change should block the > >logical replication work from proceeding with bgworkers, though. > > Yeah, that's why I added a local max GUC that just handles the logical worker > limit within max_worker_processes. I didn't want to also write a generic > framework for managing the max workers using tags or something as part of > this; it's big enough as it is, and we can always move the limit to a more > generic place once we have it. Parallel query does exactly that: the workers are allocated from the bgworkers array, and if you want more, it's on you to increase that limit (it doesn't even have a GUC for a maximum). As far as logical replication and parallel query are concerned, that's fine. We can improve this later, if it proves to be a problem. I think there are far more pressing matters to review. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans
On Mon, Aug 8, 2016 at 12:22 AM, Etsuro Fujita wrote: >>> I noticed that currently the core doesn't show any information on the >>> target >>> relations involved in a foreign/custom join in EXPLAIN, by itself. >> I think that's a feature, not a bug. > I agree with you. I'd leave that for 10.0. I don't want to change it at all, neither in 10 nor any later version. >> I disagree with that. Currently, when we say that something is a join >> (Merge Join, Hash Join, Nested Loop) we mean that the executor is >> performing a join, but that's not the case here. The executor is >> performing a scan. The remote side, we suppose, is performing a join >> for us, but we are not performing a join: we are performing a scan. >> So I think the fact that it shows up in the plan as "Foreign Scan" is >> exactly right. We are scanning some foreign thing, and that thing may >> internally be doing other stuff, like joins and aggregates, but all >> we're doing is scanning it. > > Hmm. One thing I'm concerned about would be the case where direct > modification is implemented by using GetForeignUpperPaths, not > PlanDirectModify. In that case, the way things are now, we would have > "Foreign Scan" followed by an INSERT/UPDATE/DELETE query, but that seems odd > to me. I don't think there's really any problem there. But if there is, let's solve it when someone submits that patch, not now. > One thing we need to do to leave that as is would be to fix a bug that I > pointed out upthread. Let me explain that again. The EXPLAIN command > selects relation aliases to be used in printing a query so that each > selected alias is unique, but postgres_fdw wouldn't consider the uniqueness. 
> Here is an example: > > postgres=# explain verbose select * from (select t1.a, t2.a from ft1 t1, ft2 > t2 where t1.a = t2.a union select t1.a, t2.a from ft1 t1, ft2 t2 where t1.a > = t2.a) as t(t1a, t2a); > QUERY PLAN > > Unique (cost=204.12..204.13 rows=2 width=8) >Output: t1.a, t2.a >-> Sort (cost=204.12..204.12 rows=2 width=8) > Output: t1.a, t2.a > Sort Key: t1.a, t2.a > -> Append (cost=100.00..204.11 rows=2 width=8) >-> Foreign Scan (cost=100.00..102.04 rows=1 width=8) > Output: t1.a, t2.a > Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2) > Remote SQL: SELECT r1.a, r2.a FROM (public.t1 r1 INNER > JOIN public.t2 r2 ON (((r1.a = r2.a >-> Foreign Scan (cost=100.00..102.04 rows=1 width=8) > Output: t1_1.a, t2_1.a > Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2) > Remote SQL: SELECT r1.a, r2.a FROM (public.t1 r1 INNER > JOIN public.t2 r2 ON (((r1.a = r2.a > (14 rows) > > The relation aliases in the Relations line in the second Foreign Scan, t1 > and t2 for ft1 and ft2, are not unique; they should be t1_1 and t2_1 > (compare the aliases in the Relations line with the ones in the Output line > directly above that, created by core). The reason for that is that > postgres_fdw has created the Relations info by using rte->eref->aliasname as > relation aliases as-is at path-creation time. Probably it would be a little > bit too early for postgres_fdw to do that. Couldn't postgres_fdw create > that info after query planning, for example, during ExplainForeignScan? Yes, it seems what we are doing now is not consistent with what happens for plain tables; that should probably be fixed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Slowness of extended protocol
On Tue, Aug 9, 2016 at 4:50 AM, Vladimir Sitnikov wrote: > I've tried pgbench -M prepared, and it is way faster than pgbench -M simple. That's true, but it's also testing something completely different from what Shay is concerned about. -M prepared creates a prepared statement once, and then executes it many times. That is, as you say, faster. But what Shay is concerned about is the case where you are using the extended query protocol to send out-of-line parameters but for some reason you can't use prepared statements, for example because the queries are dynamically generated and you have to keep sending a different query text every time. That case is analogous to -M extended, not -M prepared. And -M extended is well-known to be SLOWER than -M simple. Here's a quick test on my laptop demonstrating this: [rhaas ~]$ pgbench -M simple -S -T 10 starting vacuum...end. transaction type: scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 10 s number of transactions actually processed: 119244 latency average: 0.084 ms tps = 11919.440321 (including connections establishing) tps = 11923.229139 (excluding connections establishing) [rhaas ~]$ pgbench -M prepared -S -T 10 starting vacuum...end. transaction type: scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 10 s number of transactions actually processed: 192100 latency average: 0.052 ms tps = 19210.341944 (including connections establishing) tps = 19214.820999 (excluding connections establishing) [rhaas ~]$ pgbench -M extended -S -T 10 starting vacuum...end. transaction type: scaling factor: 1 query mode: extended number of clients: 1 number of threads: 1 duration: 10 s number of transactions actually processed: 104275 latency average: 0.096 ms tps = 10425.510813 (including connections establishing) tps = 10427.725239 (excluding connections establishing) Shay is not worried about the people who are getting 19.2k TPS instead of 11.9k TPS. 
Those people are already happy. He is worried about the people who are getting 10.4k TPS instead of 11.9k TPS. People who can't use prepared statements because their query text varies - and I have personally written multiple web applications that have exactly that profile - suffer a big penalty if they choose to use the extended query protocol to pass parameters. Here, it's about a 13% performance loss. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
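The figures quoted above pin down the size of both effects; a quick check of the arithmetic (using the "including connections establishing" tps values) confirms the rounded "about 13%":

```python
# tps values copied from the pgbench runs quoted above
simple, extended, prepared = 11919.440321, 10425.510813, 19210.341944

loss_vs_simple = (simple - extended) / simple    # extended-protocol penalty
gain_vs_simple = (prepared - simple) / simple    # prepared-statement speedup

print(f"extended is {loss_vs_simple:.1%} slower than simple")  # ~12.5%
print(f"prepared is {gain_vs_simple:.1%} faster than simple")
```

A single-run laptop benchmark, of course, so the exact percentages will wobble between runs; the point is only the ordering: prepared > simple > extended.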
[HACKERS] phrase search TS_phrase_execute code readability patch
Hackers, The attached attempts to make comprehension of the code in "TS_phrase_execute" a bit easier. I posted similar comments on the "typo patch" thread of July 2nd/5th, but my comments there reflected my misunderstanding of the distance operator being exact as opposed to the expected less-than-or-equal. I had to make one assumption for the attached - that "curitem->qoperator.distance" is always >= 0. In the presence of that assumption the need for the OR goes away. I don't follow why LposStart is needed so I removed it... Not compiled or in any way tested...but it's not major surgery either - mostly code comments to make following the logic easier. David J. diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c index ad5a254..215f58f 100644 --- a/src/backend/utils/adt/tsvector_op.c +++ b/src/backend/utils/adt/tsvector_op.c @@ -1474,14 +1474,14 @@ TS_phrase_execute(QueryItem *curitem, */ Rpos = Rdata.pos; - LposStart = Ldata.pos; while (Rpos < Rdata.pos + Rdata.npos) { /* -* We need to check all possible distances, so reset Lpos to -* guaranteed not yet satisfied position. +* For each physically positioned right-side operand iterate over each +* instance of the left-side operand to locate one at exactly the specified +* distance. As soon as a match is found move onto the next right-operand +* and continue searching starting with the next left-operand. */ - Lpos = LposStart; while (Lpos < Ldata.pos + Ldata.npos) { if (WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) == @@ -1490,16 +1490,16 @@ TS_phrase_execute(QueryItem *curitem, /* MATCH! */ if (data) { - /* Store position for upper phrase operator */ + /* Store the position of the right-operand */ *pos_iter = WEP_GETPOS(*Rpos); pos_iter++; /* -* Set left start position to next, because current -* one could not satisfy distance for any other right -* position +* Move onto the next right-operand, and also the next +* left-operand as the current one cannot be a match +* for any other. 
*/ - LposStart = Lpos + 1; + Lpos++; break; } else @@ -1512,18 +1512,23 @@ TS_phrase_execute(QueryItem *curitem, } } - else if (WEP_GETPOS(*Rpos) <= WEP_GETPOS(*Lpos) || -WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) < + else if (WEP_GETPOS(*Rpos) - WEP_GETPOS(*Lpos) < curitem->qoperator.distance) { /* -* Go to the next Rpos, because Lpos is ahead or on less -* distance than required by current operator +* We are now within the required distance so +* continue onto the next right-operand and retry +* the current left-operand to see if the added +* distance causes it to match. */ break; } + /* +* The current left-operand is too far away so +* try the next one. +*/ Lpos++; }
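For readers trying to follow the loop, the exact-distance matching that TS_phrase_execute performs over two sorted position lists can be modeled as a two-pointer walk (a hypothetical Python sketch of the algorithm, not the C code; it assumes both position lists are sorted ascending, as tsvector positions are):

```python
def phrase_match(lefts, rights, distance):
    """Return the right-operand positions that sit exactly `distance`
    tokens after some left-operand position (the <N> phrase operator)."""
    matches = []
    li = 0  # persists across right positions: each left matches at most one right
    for r in rights:
        while li < len(lefts):
            gap = r - lefts[li]
            if gap == distance:
                matches.append(r)
                li += 1  # this left cannot match any later (larger) right exactly
                break
            if gap < distance:
                break    # left is too close; a later right may still match it
            li += 1      # left is too far behind; it can never match again
    return matches

# "exactly 2 apart": lefts at 1 and 4, rights at 3 and 6 -> both rights match
print(phrase_match([1, 4], [3, 6], 2))  # [3, 6]
```

This mirrors the structure of the revised loop: on a match both pointers advance, a too-small gap moves to the next right position, and a too-large gap discards the left position for good.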
Re: [HACKERS] Slowness of extended protocol
On Tue, Aug 9, 2016 at 3:42 PM, Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > Shay>But here's the more important general point. We're driver > developers, not application developers. I don't really know what > performance is "just fine" for each of my users, and what is not worth > optimizing further. Users may follow best practices, or they may not for > various reasons. > > Of course we cannot benchmark all the existing applications, however we > should at least try to use "close to production" benchmarks. > > Let me recap: even "select 1" shows a clear advantage of reusing > server-prepared statements. > My machine shows the following results for "select 1 pgbench": > simple: 38K ops/sec (~26us/op) > extended: 32K ops/sec (~31us/op) > prepared: 47K ops/sec (~21us/op) > > Note: reusing server-prepared statements shaves 10us (out of 31us), while > a "brand new ParseBindExecDeallocate" message would not be able to perform > better than 26us/op (that is 5us worse than the prepared one). So it makes > much more sense investing time in "server-prepared statement reuse" at the > client side and "improving Bind/Exec performance" at the backend side. > > For more complex queries the gap (prepared vs simple) would be much bigger, > since parse/validate/plan for a complex query is a much harder operation than > the one for "select 1" > You seem to be misunderstanding the fundamental point here. Nobody is saying that prepared statements aren't a huge performance booster - they are. I recommend them to anyone who asks. But you're saying "let's not optimize anything else", whereas there are many programs out there *not* using prepared statements for various reasons (e.g. pgbouncer, or simply an existing codebase). If your opinion is that nothing should be done for these users, fine - nobody's forcing you to do anything. I simply don't see why *not* optimize the very widely-used single-statement execution path. > Note: I do not mean "look, prepared always win". 
I mean: "if your > environment does not allow reuse of prepared statements for some reason, > you lose a huge amount of time on re-parsing the queries, and it is worth > fixing that obvious issue first". > Maybe it's worth fixing it, maybe not - that's going to depend on the application. Some applications may be huge/legacy and hard to change, others may depend on something like pgbouncer which doesn't allow it. Other drivers out there probably don't persist prepared statements across close/open, making prepared statements useless for short-lived scenarios. Does the Python driver persist prepared statements? Does the Go driver do so? If not, the single-execution flow is very relevant for optimization. > Shay>I don't see how reusing SQL text affects security in any way. > > Reusing SQL text makes an application more secure, as "build SQL on the fly" > is prone to SQL injection security issues. > So reusing SQL text makes an application more secure and it enables > server-prepared statements that improve performance considerably. It is a > win-win. > We've all understood that server-prepared statements are good for performance. But they aren't more or less vulnerable to SQL injection - developers can just as well concatenate user-provided strings into a prepared SQL text. [deleting more comments trying to convince that prepared statements are great for performance, which they are] Shay>Again, in a world where prepared statements aren't persisted across > connections (e.g. pgbouncer) > > pgbouncer does not properly support named statements, and that is > pgbouncer's issue. > > Here's the issue for the pgbouncer project: https://github.com/pgbouncer/pgbouncer/issues/126#issuecomment-200900171 > The response from the pgbouncer team is "all the protocol bits are there, it > is just the implementation in pgbouncer that is missing". > > By the way: I do not use pgbouncer, thus there's not much interest for me > to invest time in fixing pgbouncer's issues. > Um, OK... 
So you're not at all bothered by the fact that the (probably) most popular PostgreSQL connection pool is incompatible with your argument? I'm trying to think about actual users and the actual software they use, so pgbouncer is very relevant. Shay>Any scenario where you open a relatively short-lived connection and > execute something once is problematic - imagine a simple web service which > needs to insert a single record into a table. > > I would assume the application does not use random string for a table name > (and columns/aliases), thus it would result in typical SQL text reuse, thus > it should trigger "server-side statement prepare" logic. In other way, that > kind of application does not need the "new ParseBindExecDeallocate message > we are talking about". > I wouldn't assume anything. Maybe the application does want to change column names (which can't be parameterized). Maybe it needs the extended protocol simply because it wants to do binary encoding, which isn't possible with the simple protocol
Re: [HACKERS] 9.6 phrase search distance specification
On Tue, Aug 9, 2016 at 12:59 PM, Ryan Pedela wrote: > > > Thanks, > > Ryan Pedela > Datalanche CEO, founder > www.datalanche.com > > On Tue, Aug 9, 2016 at 11:58 AM, Tom Lane wrote: > >> Bruce Momjian writes: >> > Does anyone know why the phrase distance "<3>" was changed from "at most >> > three tokens away" to "exactly three tokens away"? >> >> So that it would correctly support phraseto_tsquery's use of the operator >> to represent omitted words (stopwords) in a phrase. >> >> I think there's probably some use in also providing an operator that does >> "at most this many tokens away", but Oleg/Teodor were evidently less >> excited, because they didn't take the time to do it. >> >> The thread where this change was discussed is >> >> https://www.postgresql.org/message-id/flat/c19fcfec308e6ccd952cdde9e648b505%40mail.gmail.com >> >> see particularly >> >> https://www.postgresql.org/message-id/11252.1465422251%40sss.pgh.pa.us > > > I would say that it is worth it to have a "phrase slop" operator (Apache > Lucene terminology). Proximity search is extremely useful for improving > relevance and phrase slop is one of the tools to achieve that. > > Sorry for the position of my signature Ryan
Re: [HACKERS] 9.6 phrase search distance specification
Thanks, Ryan Pedela Datalanche CEO, founder www.datalanche.com On Tue, Aug 9, 2016 at 11:58 AM, Tom Lane wrote: > Bruce Momjian writes: > > Does anyone know why the phrase distance "<3>" was changed from "at most > > three tokens away" to "exactly three tokens away"? > > So that it would correctly support phraseto_tsquery's use of the operator > to represent omitted words (stopwords) in a phrase. > > I think there's probably some use in also providing an operator that does > "at most this many tokens away", but Oleg/Teodor were evidently less > excited, because they didn't take the time to do it. > > The thread where this change was discussed is > > https://www.postgresql.org/message-id/flat/c19fcfec308e6ccd952cdde9e648b505%40mail.gmail.com > > see particularly > > https://www.postgresql.org/message-id/11252.1465422251%40sss.pgh.pa.us I would say that it is worth it to have a "phrase slop" operator (Apache Lucene terminology). Proximity search is extremely useful for improving relevance and phrase slop is one of the tools to achieve that.
Re: [HACKERS] 9.6 phrase search distance specification
On Tue, Aug 9, 2016 at 01:58:25PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Does anyone know why the phrase distance "<3>" was changed from "at most > > three tokens away" to "exactly three tokens away"? > > So that it would correctly support phraseto_tsquery's use of the operator > to represent omitted words (stopwords) in a phrase. > > I think there's probably some use in also providing an operator that does > "at most this many tokens away", but Oleg/Teodor were evidently less > excited, because they didn't take the time to do it. > > The thread where this change was discussed is > > https://www.postgresql.org/message-id/flat/c19fcfec308e6ccd952cdde9e648b505%40mail.gmail.com > > see particularly > > https://www.postgresql.org/message-id/11252.1465422251%40sss.pgh.pa.us Ah, I know it was discussed somewhere. Thanks, the phraseto_tsquery tie-in was what I forgot. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] dsm_unpin_segment
On Mon, Aug 8, 2016 at 8:53 PM, Tom Lane wrote: > Thomas Munro writes: >> Yeah, I was considering unbalanced pin/unpin requests to be a >> programming error. To be more defensive about that, how about I add a >> boolean 'pinned' to dsm_control_item, and elog(ERROR, ...) if it's not >> in the expected state when you try to pin or unpin? > > Well, what you have there is a one-bit-wide pin request counter. > I do not see why that's better than an actual counter, but if that's > what you want to do, fine. > > The larger picture here is that Robert is exhibiting a touching but > unfounded faith that extensions using this feature will contain zero bugs. That's an overstatement of my position. I think it is quite likely that extensions using this feature will have bugs, because essentially all code has bugs, but whether they are likely to have the specific bug of unpinning a segment that is already unpinned is not quite so clear. That's not to say I object to Thomas's v2 patch, which will catch that mistake if it happens. Defensive programming never killed anybody, as far as I know. However, I don't see the need for a full-blown request counter here; we've had this API for several releases now and to my knowledge nobody has complained about the fact that you aren't supposed to call dsm_pin_segment() multiple times for the same segment. Therefore, I think the evidence supports the contention that it's not broken and doesn't need to be fixed. If we do decide it needs to be fixed, I think that's material for a separate patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 9.6 phrase search distance specification
Bruce Momjian writes: > Does anyone know why the phrase distance "<3>" was changed from "at most > three tokens away" to "exactly three tokens away"? So that it would correctly support phraseto_tsquery's use of the operator to represent omitted words (stopwords) in a phrase. I think there's probably some use in also providing an operator that does "at most this many tokens away", but Oleg/Teodor were evidently less excited, because they didn't take the time to do it. The thread where this change was discussed is https://www.postgresql.org/message-id/flat/c19fcfec308e6ccd952cdde9e648b505%40mail.gmail.com see particularly https://www.postgresql.org/message-id/11252.1465422251%40sss.pgh.pa.us regards, tom lane
Re: [HACKERS] patch: xmltable - proof concept
2016-08-09 19:30 GMT+02:00 Alvaro Herrera: > Pavel Stehule wrote: > > > postgres=# SELECT xmltable.* > > postgres-#FROM (SELECT data FROM xmldata) x, > > postgres-# LATERAL xmltable('/ROWS/ROW' > > postgres(# PASSING data > > postgres(# COLUMNS id int PATH '@id', > > postgres(# country_name text PATH > > 'COUNTRY_NAME', > > postgres(# country_id text PATH > > 'COUNTRY_ID', > > postgres(# region_id int PATH 'REGION_ID', > > postgres(# size float PATH 'SIZE', > > postgres(# unit text PATH 'SIZE/@unit', > > postgres(# premier_name text PATH > > 'PREMIER_NAME' DEFAULT 'not specified'); > ┌┬──┬┬───┬──┬──┬───┐ > │ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name │ > ╞╪══╪╪═══╪══╪══╪═══╡ > │ 1 │ Australia│ AU │ 3 │¤ │ ¤│ not specified │ > │ 2 │ China│ CN │ 3 │¤ │ ¤│ not specified │ > │ 3 │ HongKong │ HK │ 3 │¤ │ ¤│ not specified │ > │ 4 │ India│ IN │ 3 │¤ │ ¤│ not specified │ > │ 5 │ Japan│ JP │ 3 │¤ │ ¤│ Sinzo Abe │ > │ 6 │ Singapore│ SG │ 3 │ 791 │ km │ not specified │ > └┴──┴┴───┴──┴──┴───┘ > (6 rows) > > Nice work! > Thank you Pavel > > -- > Álvaro Herrera http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
[HACKERS] 9.6 phrase search distance specification
Does anyone know why the phrase distance "<3>" was changed from "at most three tokens away" to "exactly three tokens away"? I looked at the thread at:

https://www.postgresql.org/message-id/flat/33828354.WrrSMviC7Y%40abook

and didn't see the answer. I assume if you are looking for "<3>" you would want "<2>" matches and "<1>" matches as well.

-- Bruce Momjian  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
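For readers wondering what the two readings mean operationally, here is a minimal Python sketch of the difference (illustrative only: the helper names are made up, and tsvector positions are modeled as plain integer lists):

```python
# Illustrative only -- these helpers are mine, not PostgreSQL code.
# Token positions play the role of tsvector positions.

def phrase_match_exact(positions_a, positions_b, n):
    """9.6 semantics of "a <n> b": some b occurs exactly n positions after an a."""
    return any((pa + n) in positions_b for pa in positions_a)

def phrase_match_at_most(positions_a, positions_b, n):
    """The "at most n away" reading: equivalent to OR-ing <1> .. <n>."""
    return any(phrase_match_exact(positions_a, positions_b, k)
               for k in range(1, n + 1))

# "quick" at position 1, "fox" at position 4: the distance is exactly 3
assert phrase_match_exact([1], [4], 3)
assert not phrase_match_exact([1], [4], 2)   # <2> does not match under 9.6 rules
assert phrase_match_at_most([1], [4], 3)     # but an "at most 3" operator would
```

The exact-distance form is what lets phraseto_tsquery stand in for a known number of omitted stopwords; the at-most form is the separate operator discussed but not implemented.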
Re: [HACKERS] patch: xmltable - proof concept
Pavel Stehule wrote:
> postgres=# SELECT xmltable.*
>   FROM (SELECT data FROM xmldata) x,
>        LATERAL xmltable('/ROWS/ROW'
>                         PASSING data
>                         COLUMNS id int PATH '@id',
>                                 country_name text PATH 'COUNTRY_NAME',
>                                 country_id text PATH 'COUNTRY_ID',
>                                 region_id int PATH 'REGION_ID',
>                                 size float PATH 'SIZE',
>                                 unit text PATH 'SIZE/@unit',
>                                 premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');
> ┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
> │ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
> ╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
> │  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
> │  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
> │  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
> │  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
> │  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
> │  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
> └────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
> (6 rows)

Nice work!

-- Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] patch: xmltable - proof concept
Hi

2016-08-07 11:15 GMT+02:00 Pavel Stehule:
> Hi
>
> I am sending an initial implementation of the xmltable function.
>
> The code is not clean yet, but it does almost all of the expected work. The
> usage is simple. It is fast - 16K entries in 400ms.
>
> I invite any help with documentation and testing.
>
> A fully ANSI SQL or Oracle-compatible implementation is not possible due to
> limits of libxml2, but for typical usage it should work well. It doesn't
> need any new reserved keyword, so there should not be hard barriers to
> accepting it (when this work is complete).
>
> Example:
>
> postgres=# SELECT * FROM xmldata;
> <ROWS>
>   <ROW id="1">
>     <COUNTRY_ID>AU</COUNTRY_ID>
>     <COUNTRY_NAME>Australia</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID>
>   </ROW>
>   <ROW id="2">
>     <COUNTRY_ID>CN</COUNTRY_ID>
>     <COUNTRY_NAME>China</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID>
>   </ROW>
>   <ROW id="3">
>     <COUNTRY_ID>HK</COUNTRY_ID>
>     <COUNTRY_NAME>HongKong</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID>
>   </ROW>
>   <ROW id="4">
>     <COUNTRY_ID>IN</COUNTRY_ID>
>     <COUNTRY_NAME>India</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID>
>   </ROW>
>   <ROW id="5">
>     <COUNTRY_ID>JP</COUNTRY_ID>
>     <COUNTRY_NAME>Japan</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID><PREMIER_NAME>Sinzo Abe</PREMIER_NAME>
>   </ROW>
>   <ROW id="6">
>     <COUNTRY_ID>SG</COUNTRY_ID>
>     <COUNTRY_NAME>Singapore</COUNTRY_NAME>
>     <REGION_ID>3</REGION_ID><SIZE unit="km">791</SIZE>
>   </ROW>
> </ROWS>
> (1 row)
>
> postgres=# SELECT xmltable.*
>   FROM (SELECT data FROM xmldata) x,
>        LATERAL xmltable('/ROWS/ROW'
>                         PASSING data
>                         COLUMNS id int PATH '@id',
>                                 country_name text PATH 'COUNTRY_NAME',
>                                 country_id text PATH 'COUNTRY_ID',
>                                 region_id int PATH 'REGION_ID',
>                                 size float PATH 'SIZE',
>                                 unit text PATH 'SIZE/@unit',
>                                 premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');
> ┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
> │ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
> ╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
> │  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
> │  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
> │  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
> │  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
> │  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
> │  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
> └────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
> (6 rows)
>
> Regards
>

I am sending an updated version - the code is not
better, but full functionality is implemented:

* xmlnamespaces,
* default xmlnamespace,
* ordinality column,
* NOT NULL constraint,
* mode without explicitly defined columns.

Lots of bugs were fixed. It is ready for some playing; tests, comments, notes, and comparisons with other databases are welcome. Some behavior is constrained by libxml2's capabilities, so only XPath is supported.

Regards

Pavel

> Pavel

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 69bf65d..a20c576 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -58,7 +58,6 @@
 #include "utils/typcache.h"
 #include "utils/xml.h"
-
 /* static function decls */
 static Datum ExecEvalArrayRef(ArrayRefExprState *astate, ExprContext
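To make the row-generating behavior concrete, here is a tiny Python model of what such a function does: one output row per node matched by the row XPath, one value per column path, with DEFAULT filling in missing nodes. This is a sketch of the observable behavior, not the patch's implementation, and `mini_xmltable` is a made-up name:

```python
import xml.etree.ElementTree as ET

def mini_xmltable(doc, row_path, columns):
    """columns maps an output name to (path, default); '@x' reads attribute x."""
    root = ET.fromstring(doc)
    out = []
    for node in root.findall(row_path):      # one result row per matched node
        row = {}
        for name, (path, default) in columns.items():
            if path.startswith('@'):
                val = node.get(path[1:])     # attribute path, e.g. '@id'
            else:
                el = node.find(path)         # element path, e.g. 'COUNTRY_NAME'
                val = el.text if el is not None else None
            row[name] = val if val is not None else default
        out.append(row)
    return out

doc = """<ROWS>
  <ROW id="1"><COUNTRY_ID>AU</COUNTRY_ID><COUNTRY_NAME>Australia</COUNTRY_NAME></ROW>
  <ROW id="5"><COUNTRY_ID>JP</COUNTRY_ID><COUNTRY_NAME>Japan</COUNTRY_NAME>
    <PREMIER_NAME>Sinzo Abe</PREMIER_NAME></ROW>
</ROWS>"""

rows = mini_xmltable(doc, 'ROW',
                     {'id': ('@id', None),
                      'country_name': ('COUNTRY_NAME', None),
                      'premier_name': ('PREMIER_NAME', 'not specified')})
assert rows[0]['premier_name'] == 'not specified'   # DEFAULT applied
assert rows[1]['premier_name'] == 'Sinzo Abe'
```

The real feature evaluates the row and column paths with libxml2's XPath engine and does SQL type coercion per column, but the shape of the result is the same.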
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here is a new version of the patch; it now includes recommendations from Anastasia Lubennikova.

I've investigated the anomalous index size decrease, and the most probable explanation appeared to be true. The cube extension, like some others, uses Guttman's polynomial-time split algorithm. It is known to generate "needle-like" MBBs (MBRs) for sorted data due to imbalanced splits (splitting 100 tuples as 98 to 2). Due to the previous throw-to-the-end behavior of GiST this imbalance was further amplified (most inserts were going to the bigger part after a split), so GiST inserts were extremely slow for sorted data. There is no need to do exact sorting to trigger this behavior.

This behavior can be fixed by implementing the small-m restriction in the split (AFAIR this is described in the original R-tree paper from '84), which prescribes splitting into parts no smaller than m, where m is around 20% of a page's capacity (in number of tuples). After application of this patch, performance for sorted data is roughly the same as performance for randomized data.

If data is randomized, this effect of the Guttman poly-time split is not in effect; the test from the start of the thread will show no statistically confident improvement from the patch. To prove a performance increase in the randomized case I've composed a modified test: https://gist.github.com/x4m/856b2e1a5db745f8265c76b9c195f2e1

This test with five runs shows the following:
Before patch: insert time AVG 78 seconds, STD 9.5
After patch: insert time AVG 68 seconds, STD 7.7

This is a marginal but statistically significant performance improvement. For sorted data, performance is improved by a factor of 3.

Best regards, Andrey Borodin, Octonica & Ural Federal University.

PageIndexTupleOverwrite v7.patch
Description: Binary data
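The small-m restriction described above is easy to model. A toy Python sketch (not the patch's actual C code; `enforce_min_fill` and the 20% constant are illustrative):

```python
def enforce_min_fill(left, right, capacity):
    """Rebalance a proposed split so each side holds at least m tuples,
    where m is roughly 20% of the page's tuple capacity (the small-m
    restriction from the original R-tree paper)."""
    m = max(1, capacity // 5)
    while len(left) < m and len(right) > m:
        left.append(right.pop())
    while len(right) < m and len(left) > m:
        right.append(left.pop())
    return left, right

# A degenerate 98-to-2 Guttman split of 100 tuples is pulled back to 20/80:
left, right = enforce_min_fill(list(range(2)), list(range(2, 100)), 100)
assert len(left) == 20 and len(right) == 80
```

With both sides guaranteed a minimum fill, sorted-input inserts no longer pile into one ever-growing page, which is the effect the benchmark numbers above reflect.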
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Alvaro" == Alvaro Herrera writes:

>> One idea is utils/adt/misc.c. Or we could make a new file under
>> utils/adt/ though I'm not very sure what to name it. amaccess.c?
>> catutils.c? If there's only ever likely to be one or two functions
>> of this ilk, maybe a new file is overkill and we should just use
>> misc.c.

Alvaro> I like the idea of a new file; I have a hunch that it will
Alvaro> grow, given that we're expanding in this area, and perhaps we
Alvaro> can find some existing stuff to relocate there in the future.
Alvaro> I don't think a small file is a problem, anyway.

Alvaro> How about amfuncs.c? Maybe it can live in catalog/ instead of
Alvaro> utils/adt?

Well, the existing patch used access/index/amapi.c for the AM capability functions. There may be some merit in keeping everything together - I asked because it didn't seem at first glance that the index column property function belonged there, but on second thought there's some overlap, in that in the future, if indoptions ever acquires any AM-specific flags, it may be necessary for pg_index_column_has_property to call into an AM-specific function.

So, here are some options:

1. Put everything in access/index/amapi.c
2. Move either all of access/index/amapi.c, or just the SQL-callable part of it (amvalidate), to utils/adt/amfuncs.c and put new stuff in there
3. Put pg_index[am]_has_capability in access/index/amapi.c and pg_index_column_has_property in utils/adt/misc.c

-- Andrew (irc:RhodiumToad)
Re: [HACKERS] Set log_line_prefix and application name in test drivers
Peter Eisentraut writes:
> Here is a small patch that sets log_line_prefix and application name in
> pg_regress and the TAP tests, to make analyzing the server log output
> easier.

How would this interact with the buildfarm's existing policies on setting log_line_prefix?

regards, tom lane
[HACKERS] Set log_line_prefix and application name in test drivers
Here is a small patch that sets log_line_prefix and application name in pg_regress and the TAP tests, to make analyzing the server log output easier. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From 0595672e9272a53162eda315c15835c26fdb6d10 Mon Sep 17 00:00:00 2001 From: Peter EisentrautDate: Tue, 9 Aug 2016 11:56:49 -0400 Subject: [PATCH] Set log_line_prefix and application name in test drivers Before pg_regress runs psql, set the application name to the test name. Similarly, set the application name to the test file name in the TAP tests. Also, set a default log_line_prefix that show the application name, as well as the PID and a time stamp. That way, the server log output can be correlated to the test input files, making debugging a bit easier. --- src/test/perl/PostgresNode.pm | 1 + src/test/perl/TestLib.pm | 2 ++ src/test/regress/pg_regress.c | 1 + src/test/regress/pg_regress_main.c | 7 +++ 4 files changed, 11 insertions(+) diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm index e629373..3f2ca91 100644 --- a/src/test/perl/PostgresNode.pm +++ b/src/test/perl/PostgresNode.pm @@ -402,6 +402,7 @@ sub init open my $conf, ">>$pgdata/postgresql.conf"; print $conf "\n# Added by PostgresNode.pm\n"; print $conf "fsync = off\n"; + print $conf "log_line_prefix = '%t [%p]: [%l] %qapp=%a '\n"; print $conf "log_statement = all\n"; print $conf "port = $port\n"; diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm index 649fd82..27fcc78 100644 --- a/src/test/perl/TestLib.pm +++ b/src/test/perl/TestLib.pm @@ -60,6 +60,8 @@ BEGIN delete $ENV{PGPORT}; delete $ENV{PGHOST}; + $ENV{PGAPPNAME} = $0; + # Must be set early $windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys'; } diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 574f5b8..d5a8e16 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ 
-2247,6 +2247,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc fputs("\n# Configuration added by pg_regress\n\n", pg_conf); fputs("log_autovacuum_min_duration = 0\n", pg_conf); fputs("log_checkpoints = on\n", pg_conf); + fputs("log_line_prefix = '%t [%p]: [%l] %qapp=%a '\n", pg_conf); fputs("log_lock_waits = on\n", pg_conf); fputs("log_temp_files = 128kB\n", pg_conf); fputs("max_prepared_transactions = 2\n", pg_conf); diff --git a/src/test/regress/pg_regress_main.c b/src/test/regress/pg_regress_main.c index d9591c0..2733635 100644 --- a/src/test/regress/pg_regress_main.c +++ b/src/test/regress/pg_regress_main.c @@ -34,6 +34,7 @@ psql_start_test(const char *testname, char expectfile[MAXPGPATH]; char psql_cmd[MAXPGPATH * 3]; size_t offset = 0; + char *appnameenv; /* * Look for files in the output dir first, consistent with a vpath search. @@ -63,6 +64,9 @@ psql_start_test(const char *testname, offset += snprintf(psql_cmd + offset, sizeof(psql_cmd) - offset, "%s ", launcher); + appnameenv = psprintf("PGAPPNAME=pg_regress/%s", testname); + putenv(appnameenv); + snprintf(psql_cmd + offset, sizeof(psql_cmd) - offset, "\"%s%spsql\" -X -a -q -d \"%s\" < \"%s\" > \"%s\" 2>&1", bindir ? bindir : "", @@ -80,6 +84,9 @@ psql_start_test(const char *testname, exit(2); } + unsetenv("PGAPPNAME"); + free(appnameenv); + return pid; } -- 2.9.2 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Tom Lane wrote:
> Andrew Gierth writes:
> > Where'd be a good place to put that function? ruleutils? catalog/index.c ?
> > (ruleutils is way too big already)
>
> Agreed. catalog/index.c is not a place that implements SQL-visible
> functions, so I don't like that either.
>
> One idea is utils/adt/misc.c. Or we could make a new file under
> utils/adt/ though I'm not very sure what to name it. amaccess.c?
> catutils.c? If there's only ever likely to be one or two functions
> of this ilk, maybe a new file is overkill and we should just use misc.c.

I like the idea of a new file; I have a hunch that it will grow, given that we're expanding in this area, and perhaps we can find some existing stuff to relocate there in the future. I don't think a small file is a problem, anyway.

How about amfuncs.c? Maybe it can live in catalog/ instead of utils/adt?

-- Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Slowness of extended protocol
Shay>But here's the more important general point. We're driver developers, not application developers. I don't really know what performance is "just fine" for each of my users, and what is not worth optimizing further. Users may follow best practices, or they may not for various reasons.

Of course we cannot benchmark all the existing applications; however, we should at least try to use "close to production" benchmarks.

Let me recap: even "select 1" shows a clear advantage to reusing server-prepared statements. My machine shows the following results for a "select 1" pgbench:

simple: 38K ops/sec (~26us/op)
extended: 32K ops/sec (~31us/op)
prepared: 47K ops/sec (~21us/op)

Note: reusing server-prepared statements shaves 10us (out of 31us), while a brand-new "ParseBindExecDeallocate" message would not be able to perform better than 26us/op (that is 5us worse than the prepared one). So it makes much more sense to invest time in "server-prepared statement reuse" on the client side and "improving Bind/Exec performance" on the backend side. For more complex queries the gap (prepared vs simple) would be much bigger, since parse/validate/plan for a complex query is a much harder operation than for "select 1".

Note: I do not mean "look, prepared always wins". I mean: "if your environment does not allow reuse of prepared statements for some reason, you lose a huge amount of time on re-parsing the queries, and it is worth fixing that obvious issue first".

Shay>I don't see how reusing SQL text affects security in any way.

Reusing SQL text makes an application more secure, as "build SQL on the fly" is prone to SQL injection issues. So reusing SQL text makes the application more secure, and it enables server-prepared statements that improve performance considerably. It is a win-win.
Shay>a new feature in the Npgsql dev branch which allows prepared statements to be persisted across open/close on pooled connections

Do you have some benchmark results for "reusing server-prepared statements across open/close on pooled connections"? I would expect that feature to be a great win.

Once again, I'd like to focus on real-life cases, not artificial ones. For example, the major part of my performance fixes to pgjdbc were made as part of improving my Java application, which was suffering from performance issues when talking to PostgreSQL. For instance, there were queries that took 20ms to plan and 0.2ms to execute (the query was like "where id=?" but the SQL text was more involved). As transparent server-side statement preparation was implemented on the pgjdbc side, it shaved those 20ms by eliminating Parse messages on the hot path.

In other words, it was not just "let's optimize pgjdbc". It was driven by the need to optimize the client application, and the profiling results were pointing to pgjdbc issues.

Shay>Again, in a world where prepared statements aren't persisted across connections (e.g. pgbouncer)

pgbouncer does not properly support named statements, and that is pgbouncer's issue. Here's the issue for the pgbouncer project: https://github.com/pgbouncer/pgbouncer/issues/126#issuecomment-200900171

The response from the pgbouncer team is "all the protocol bits are there, it is just the implementation in pgbouncer that is missing". By the way: I do not use pgbouncer, thus there's not much interest for me to invest time in fixing pgbouncer's issues.

Shay>Any scenario where you open a relatively short-lived connection and execute something once is problematic - imagine a simple web service which needs to insert a single record into a table.

I would assume the application does not use a random string for a table name (and columns/aliases), thus it would result in typical SQL text reuse, thus it should trigger the "server-side statement prepare" logic.
Put another way, that kind of application does not need the new "ParseBindExecDeallocate" message we are talking about. In other words, if an application is using "select name from objects where id=$1" kinds of queries, the driver should be using the extended protocol (Bind/Exec) behind the scenes if it aims to get high performance.

Vladimir
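The reuse logic argued for here - Parse once per distinct SQL text, then only Bind/Execute on later calls - can be shown with a toy driver-side cache. The names are illustrative; this is a sketch of the idea, not pgjdbc's or Npgsql's actual implementation:

```python
class FakeConn:
    """Stand-in connection that records which protocol messages would be sent."""
    def __init__(self):
        self.messages = []
    def parse(self, name, sql):
        self.messages.append(("Parse", name))
    def bind_execute(self, name, params):
        self.messages.append(("Bind/Execute", name))

class StatementCache:
    """Per-connection cache keyed by SQL text, as described above."""
    def __init__(self):
        self._names = {}
    def execute(self, conn, sql, params):
        name = self._names.get(sql)
        if name is None:                      # first use: Parse under a name
            name = "S_%d" % (len(self._names) + 1)
            conn.parse(name, sql)
            self._names[sql] = name
        conn.bind_execute(name, params)       # hot path: no re-parse

cache, conn = StatementCache(), FakeConn()
for oid in (1, 2, 3):
    cache.execute(conn, "select name from objects where id=$1", [oid])

kinds = [kind for kind, _ in conn.messages]
assert kinds.count("Parse") == 1              # parsed once
assert kinds.count("Bind/Execute") == 3       # executed three times
```

Applications that reuse SQL text get this benefit transparently; applications that build SQL on the fly defeat the cache, which is the point made above about both performance and injection safety.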
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Andrew Gierth writes:
> Where'd be a good place to put that function? ruleutils? catalog/index.c ?
> (ruleutils is way too big already)

Agreed. catalog/index.c is not a place that implements SQL-visible functions, so I don't like that either.

One idea is utils/adt/misc.c. Or we could make a new file under utils/adt/ though I'm not very sure what to name it. amaccess.c? catutils.c? If there's only ever likely to be one or two functions of this ilk, maybe a new file is overkill and we should just use misc.c.

regards, tom lane
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
> "Kevin" == Kevin Grittner writes:

>>> Building on the has-property approach Andrew suggested, I wonder if
>>> we need something like pg_index_column_has_property(indexoid, colno,
>>> propertyname) with properties like "sortable", "desc", "nulls first".
>>
>> This seems simple enough, on the surface. Why not run with this design?

Kevin> Andrew's patch, plus this, covers everything I can think of.

Where'd be a good place to put that function? ruleutils? catalog/index.c ? (ruleutils is way too big already)

-- Andrew (irc:RhodiumToad)
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Mon, Aug 8, 2016 at 6:00 PM, Dilip Kumar wrote:
> On Mon, Aug 8, 2016 at 5:23 PM, Amit Kapila wrote:
>>
>> Did you consider using ERRCODE_UNDEFINED_COLUMN with error messages
>> like "type %u does not exist" or "type id %u does not exist"? Errorcode
>> ERRCODE_UNDEFINED_COLUMN seems to be used for syscache lookup failure
>> cases at certain places in the code.
>
> But I think OBJECT will be more appropriate than COLUMN at this place.

Okay, then how about ERRCODE_UNDEFINED_OBJECT? It seems to be used in almost similar cases.

> However, I can change the error message to "type id %u does not exist" if
> that seems better?

I think that is better.

-- With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Slowness of extended protocol
On Tue, Aug 9, 2016 at 8:50 AM, Vladimir Sitnikov <sitnikov.vladi...@gmail.com> wrote:

> Shay>There are many scenarios where connections are very short-lived
> (think about webapps where a pooled connection is allocated per-request and
> reset in between)
>
> Why is the connection reset in between in the first place?
> In pgjdbc we do not reset the per-connection statement cache, thus we easily
> reuse named statements for pooled connections.

A DISCARD ALL is sent when the connection is returned to the pool, to prevent state leakage etc. You make a valid comment though - there's already a new feature in the Npgsql dev branch which allows prepared statements to be persisted across open/close on pooled connections; it's interesting to learn that it is standard behavior in pgjdbc. At least up to now, the logic was not to implicitly keep holding server-side resources across a pooled connection close, because these may become a drain on the server etc.

More importantly, unless I'm mistaken pgbouncer also sends DISCARD ALL to clean the connection state, as will other pooling solutions. That unfortunately means that you can't always depend on prepared statements persisting after closing the connection.

> Shay>There are also many scenarios where you're not necessarily going to
> send the same query multiple times in a single connection lifespan, so
> preparing is again out of the question.
>
> Can you list at least one scenario of that kind, so we can code it into
> pgbench (or alike) and validate "simple vs prepared" performance?

Again, in a world where prepared statements aren't persisted across connections (e.g. pgbouncer), this scenario is extremely common. Any scenario where you open a relatively short-lived connection and execute something once is problematic - imagine a simple web service which needs to insert a single record into a table.

> Shay>And more generally, there's no reason for a basic, non-prepared
> execution to be slower than it can be.
> > That's too generic. If the performance for "end-to-end cases" is just > fine, then it is not worth optimizing further. Typical application best > practice is to reuse SQL text (for both security and performance point of > views), so in typical applications I've seen, query text was reused, thus > it naturally was handled by server-prepared logic. > I don't see how reusing SQL text affects security in any way. But here's the more important general point. We're driver developers, not application developers. I don't really know what performance is "just fine" for each of my users, and what is not worth optimizing further. Users may follow best practices, or they may not for various reasons. They may be porting code over from SqlServer, where prepare is almost never used (because SqlServer caches plans implicitly), or they may simply not be very good programmers. The API for executing non-prepared statements is there, we support it and PostgreSQL supports it - it just happens to not be very fast. Of course we can say "screw everyone not preparing statements", but that doesn't seem like a very good way to treat users. Especially since the fix isn't that hard. Let me highlight another direction: current execution of server-prepared > statement requires some copying of "parse tree" (or whatever). I bet it > would be much better investing in removal of that copying rather than > investing into "make one-time queries faster" thing. If we could make > "Exec" processing faster, it would immediately improve tons of applications. > I don't understand what exactly you're proposing here, but if you have some unrelated optimization that would speed up prepared statements, by all means that's great. It's just unrelated to this thread. > Shay>Of course we can choose a different query to benchmark instead of > SELECT 1 - feel free to propose one (or several). > > I've tried pgbench -M prepared, and it is way faster than pgbench -M > simple. 
> Once again: all cases I have in mind would benefit from reusing
> server-prepared statements. In other words, after some warmup the
> application would use just Bind-Execute-Sync kinds of messages, and it
> would completely avoid Parse/Describe ones.

Of course that's the ideal scenario. It's just not the *only* scenario for all users - they may either not have prepared statements persisting across open/close as detailed above, or their code may simply not be preparing statements at the moment. Why not help them out for free?

> Shay>FYI in Npgsql specifically describe isn't used to get any knowledge
> about parameters - users must populate the correct parameters or query
> execution fails.
>
> I think the main reason to describe for pgjdbc is to get result oids.
> pgjdbc is not "full binary", thus it has to be careful which fields it
> requests in binary format. That indeed slows down "unknown queries", but
> as the query gets reused, pgjdbc switches to server-prepared execution,
> and eliminates Parse/Describe overheads.
Re: [HACKERS] Wait events monitoring future development
On Tue, Aug 9, 2016 at 04:17:28AM +, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
> > I used to think that this kind of feature should be enabled by default,
> > because when I was working at my previous company, I had only a few
> > features to understand what was happening inside PostgreSQL by observing
> > production databases. I needed those features enabled in the production
> > databases when I was called.
> >
> > However, now I have another opinion. When we release the next major
> > release, say 10.0, with the wait monitoring, many people will start their
> > benchmark tests with a configuration with *the default values*, and if
> > they see some performance decrease, for example around 10%, they will be
> > talking about it as the performance decrease in PostgreSQL 10.0. It means
> > PostgreSQL will be facing a difficult reputation.
> >
> > So, I agree that the features should be disabled by default for a while.
>
> I understand your feeling well. This is a difficult decision. Let's hope
> for trivial overhead.

I think the goal is that some internal tracking can be enabled by default, and some internal or external tool can be turned on and off to get more fine-grained statistics about the event durations.

-- Bruce Momjian  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: [HACKERS] Logical Replication WIP
On 09/08/16 12:16, Craig Ringer wrote:
> On 9 August 2016 at 17:28, Masahiko Sawada wrote:
> > > Sure, you can go deeper down the rabbit hole here and say that we need
> > > to add bgworker "categories" with reserved pools of worker slots for
> > > each category. But do we really need that?
> >
> > If we change these processes to bgworkers, we can categorize them into
> > two: auxiliary processes (checkpointer and wal sender etc.) and other
> > worker processes. And max_worker_processes controls the latter.
>
> Right. I think that's probably the direction we should be going
> eventually.
>
> Personally I don't think such a change should block the logical
> replication work from proceeding with bgworkers, though. It's been
> delayed a long time, a lot of people want it, and I think we need to
> focus on meeting the core requirements, not getting too sidetracked on
> minor points. Of course, everyone's idea of what's core and what's a
> minor sidetrack differs ;)

Yeah, that's why I added a local max GUC that just handles the logical worker limit within max_worker_processes. I didn't want to also write a generic framework for managing the max workers using tags or something as part of this; it's big enough as it is, and we can always move the limit to a more generic place once we have it.

-- Petr Jelinek  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Logical Replication WIP
On 9 August 2016 at 17:28, Masahiko Sawada wrote:
> > Sure, you can go deeper down the rabbit hole here and say that we need to
> > add bgworker "categories" with reserved pools of worker slots for each
> > category. But do we really need that?
>
> If we change these processes to bgworker, we can categorize them into
> two, auxiliary process (checkpointer and wal sender etc) and other
> worker process.
> And max_worker_processes controls the latter.

Right. I think that's probably the direction we should be going eventually.

Personally I don't think such a change should block the logical replication work from proceeding with bgworkers, though. It's been delayed a long time, a lot of people want it, and I think we need to focus on meeting the core requirements, not getting too sidetracked on minor points. Of course, everyone's idea of what's core and what's a minor sidetrack differs ;)

-- Craig Ringer  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Logical Replication WIP
On Tue, Aug 9, 2016 at 5:13 PM, Craig Ringer wrote:
> On 9 August 2016 at 15:59, Masahiko Sawada wrote:
> >
> > The logical replication launcher process and the apply process are
> > implemented as a bgworker. Isn't it better to have them as an auxiliary
> > process like checkpointer, wal writer?
>
> I don't think so. The checkpointer, walwriter, autovacuum, etc predate
> bgworkers. I strongly suspect that if they were to be implemented now
> they'd use bgworkers.
>
> Now, perhaps we want a new bgworker "kind" for system workers or some
> other minor tweaks. But basically I think bgworkers are exactly what we
> should be using here.

I understood. Thanks!

> > IMO the number of logical replication connections should not be
> > limited by max_worker_processes.
>
> Well, they *are* worker processes... but I take your point, that that
> setting has been "number of bgworkers the user can run" and it might not
> be expected that logical replication would use the same space.
>
> max_worker_processes isn't just a limit, it controls how many shmem
> slots we allocate.
>
> I guess we could have a separate max_logical_workers or something, but
> I'm inclined to think that adds complexity without really making things
> any nicer. We'd just add them together to decide how many shmem slots to
> allocate and we'd have to keep track of how many slots were used by
> which types of backend. Or create a near-duplicate of the bgworker
> facility for logical rep.
>
> Sure, you can go deeper down the rabbit hole here and say that we need
> to add bgworker "categories" with reserved pools of worker slots for
> each category. But do we really need that?

If we change these processes to bgworker, we can categorize them into two: auxiliary processes (checkpointer and wal sender etc.) and other worker processes. And max_worker_processes controls the latter.

> max_connections includes everything, both system and user backends. It's
> not like we don't do this elsewhere. It's at worst a mild wart.
> The only argument I can see for not using bgworkers is for the supervisor
> worker. It's a singleton that launches the per-database workers, and
> arguably is a job that the postmaster could do better. The current design
> there stems from its origins as an extension. Maybe worker management
> could be simplified a bit as a result. I'd really rather not invent yet
> another new and mostly duplicate category of custom workers to achieve
> that though.
>
> Regards,

-- Masahiko Sawada
Re: [HACKERS] Logical Replication WIP
On 09/08/16 10:13, Craig Ringer wrote: On 9 August 2016 at 15:59, Masahiko Sawada wrote: The logical replication launcher process and the apply process are implemented as a bgworker. Isn't it better to have them as an auxiliary process like checkpointer, wal writer? I don't think so. The checkpointer, walwriter, autovacuum, etc predate bgworkers. I strongly suspect that if they were to be implemented now they'd use bgworkers. Now, perhaps we want a new bgworker "kind" for system workers or some other minor tweaks. But basically I think bgworkers are exactly what we should be using here. Agreed. IMO the number of logical replication connections should not be limited by max_worker_processes. Well, they *are* worker processes... but I take your point, that that setting has been "number of bgworkers the user can run" and it might not be expected that logical replication would use the same space. Again agreed; I think we should ultimately go towards what PeterE suggested in https://www.postgresql.org/message-id/a2fffd92-6e59-a4eb-dd85-c5865ebca...@2ndquadrant.com The only argument I can see for not using bgworkers is for the supervisor worker. It's a singleton that launches the per-database workers, and arguably is a job that the postmaster could do better. The current design there stems from its origins as an extension. Maybe worker management could be simplified a bit as a result. I'd really rather not invent yet another new and mostly duplicate category of custom workers to achieve that though. It is simplified compared to pglogical (there are only 2 worker types, not 3). I don't think it's the job of the postmaster to scan catalogs, however, so it can't really start workers for logical replication. I actually modeled it more after autovacuum (using bgworkers, though) than the original extension.
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Logical Replication WIP
On 09/08/16 09:59, Masahiko Sawada wrote: On 2016-08-05 17:00:13 +0200, Petr Jelinek wrote: as promised here is WIP version of logical replication patch. Thank you for working on this! Thanks for looking! I've applied these patches to current HEAD, but got the following error. libpqwalreceiver.c:48: error: redefinition of typedef ‘WalReceiverConnHandle’ ../../../../src/include/replication/walreceiver.h:137: note: previous declaration of ‘WalReceiverConnHandle’ was here make[2]: *** [libpqwalreceiver.o] Error 1 make[1]: *** [install-backend/replication/libpqwalreceiver-recurse] Error 2 make: *** [install-src-recurse] Error 2 After fixing this issue with the attached patch, I used logical replication a little. Some random comments and questions. Interesting, my compiler doesn't have a problem with this. Will investigate. The logical replication launcher process and the apply process are implemented as a bgworker. Isn't it better to have them as an auxiliary process like checkpointer, wal writer? IMO the number of logical replication connections should not be limited by max_worker_processes. What Craig said reflects my rationale for doing this pretty well. We need to set the publication up with at least the CREATE PUBLICATION and ALTER PUBLICATION commands. Can we make it possible for CREATE PUBLICATION to define tables as well? For example, CREATE PUBLICATION mypub [ TABLE table_name, ...] [WITH options] Agreed, that just didn't make it into the first cut to -hackers. We've also been thinking of having a special ALL TABLES parameter there that would encompass the whole db. -- This patch cannot drop the subscription. =# drop subscription sub; ERROR: unrecognized object class: 6102 Yeah, that's because of patch 0006; I didn't finish all the dependency tracking for the pg_subscription_rel catalog that it adds (which is why I called it PoC). I expect to have this working in the next version (there is still quite a bit of polish work needed in general).
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Slowness of extended protocol
Shay> There are many scenarios where connections are very short-lived (think about webapps where a pooled connection is allocated per-request and reset in between) Why is the connection reset in between in the first place? In pgjdbc we do not reset the per-connection statement cache, thus we easily reuse named statements for pooled connections. Shay> and the extra roundtrip that preparing entails is too much. When a server-prepared statement gets reused, neither Parse nor Describe is used. Shay> There are also many scenarios where you're not necessarily going to send the same query multiple times in a single connection lifespan, so preparing is again out of the question. Can you list at least one scenario of that kind, so we can code it into pgbench (or the like) and validate "simple vs prepared" performance? Shay> And more generally, there's no reason for a basic, non-prepared execution to be slower than it can be. That's too generic. If the performance for "end-to-end cases" is just fine, then it is not worth optimizing further. Typical application best practice is to reuse SQL text (from both security and performance points of view), so in the typical applications I've seen, query text was reused and thus naturally handled by the server-prepared logic. Let me highlight another direction: the current execution of a server-prepared statement requires some copying of the "parse tree" (or whatever). I bet it would be much better to invest in removing that copying rather than in a "make one-time queries faster" effort. If we could make "Exec" processing faster, it would immediately improve tons of applications. Shay> Of course we can choose a different query to benchmark instead of SELECT 1 - feel free to propose one (or several). I've tried pgbench -M prepared, and it is way faster than pgbench -M simple. Once again: all cases I have in mind would benefit from reusing server-prepared statements.
In other words, after some warmup the application would use just Bind-Execute-Sync kinds of messages, and it would completely avoid Parse/Describe ones. If a statement is indeed a "one-time" statement, then I do not care much how long it takes to execute. Shay> FYI in Npgsql specifically Describe isn't used to get any knowledge about parameters - users must populate the correct parameters or query execution fails. I think the main reason to Describe for pgjdbc is to get result OIDs. pgjdbc is not "full binary", thus it has to be careful about which fields it requests in binary format. That indeed slows down "unknown queries", but as a query gets reused, pgjdbc switches to server-prepared execution and eliminates the Parse/Describe overheads completely. Vladimir
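To make the flow Vladimir describes concrete — the full extended-protocol sequence on first execution, then just Bind-Execute-Sync once the named statement is warm — here is a small illustrative sketch (hypothetical code, not pgjdbc or Npgsql) of a client-side statement cache deciding which frontend messages to send:

```python
# Illustrative sketch of a driver-side statement cache. The message
# names follow the PostgreSQL frontend/backend protocol; the cache
# class itself is invented for this example.

class StatementCache:
    def __init__(self):
        self._prepared = set()  # SQL texts already named-prepared server-side

    def messages_for(self, sql):
        """Return the frontend messages needed to execute `sql` once."""
        if sql in self._prepared:
            # Warmed up: the named statement already exists on the server,
            # so the driver skips Parse and Describe entirely.
            return ["Bind", "Execute", "Sync"]
        self._prepared.add(sql)
        # First execution: the full extended-protocol sequence.
        return ["Parse", "Describe", "Bind", "Execute", "Sync"]

cache = StatementCache()
first = cache.messages_for("SELECT * FROM t WHERE id = $1")
again = cache.messages_for("SELECT * FROM t WHERE id = $1")
```

All five messages can still be pipelined into one TCP packet, so the reuse win here is server-side work (parsing/planning), not round trips.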
[HACKERS] Small issues in syncrep.c
Hello, Since 14e8803f1, it's not necessary to acquire the SyncRepLock to see up-to-date data. But it looks like this commit didn't update all the comments around MyProc->syncRepState, which still mention retrieving the value both with and without the lock. Also, there is, I think, a now-unneeded test that retrieves syncRepState again. Patch attached to fix both small issues, present since 9.5. -- Julien Rouhaud http://dalibo.com - http://dalibo.org diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c index 67249d8..c3e11b8 100644 --- a/src/backend/replication/syncrep.c +++ b/src/backend/replication/syncrep.c @@ -195,17 +195,12 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit) ResetLatch(MyLatch); /* -* Try checking the state without the lock first. There's no -* guarantee that we'll read the most up-to-date value, so if it looks -* like we're still waiting, recheck while holding the lock. But if -* it looks like we're done, we must really be done, because once -* walsender changes the state to SYNC_REP_WAIT_COMPLETE, it will -* never update it again, so we can't be seeing a stale value in that -* case. +* Acquiring the lock is not needed, the latch ensures proper barriers. +* If it looks like we're done, we must really be done, because once +* walsender changes the state to SYNC_REP_WAIT_COMPLETE, it will never +* update it again, so we can't be seeing a stale value in that case. */ syncRepState = MyProc->syncRepState; - if (syncRepState == SYNC_REP_WAITING) - syncRepState = MyProc->syncRepState; if (syncRepState == SYNC_REP_WAIT_COMPLETE) break;
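The property the patch relies on — SYNC_REP_WAIT_COMPLETE is a terminal state that the walsender never writes over again, so a single unlocked read that observes it can be trusted — can be modeled with a toy state machine (illustrative Python, not the C above; the real code additionally depends on latch memory barriers):

```python
# Toy model of the syncRepState handshake. Key property: the
# WAITING -> WAIT_COMPLETE transition is one-way, so a stale read can
# only make the backend wait one extra loop iteration, never miss
# completion permanently.

SYNC_REP_WAITING = 0
SYNC_REP_WAIT_COMPLETE = 1

class SyncRepSlot:
    def __init__(self):
        self.state = SYNC_REP_WAITING

    def walsender_complete(self):
        # Terminal transition: set once by the walsender, never reversed.
        assert self.state == SYNC_REP_WAITING
        self.state = SYNC_REP_WAIT_COMPLETE

    def backend_poll(self):
        # One unlocked read per wakeup is enough; no re-read under a
        # lock is required (the test the patch removes).
        return self.state == SYNC_REP_WAIT_COMPLETE

slot = SyncRepSlot()
before = slot.backend_poll()   # still waiting
slot.walsender_complete()
after = slot.backend_poll()    # done, observed without any lock
```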
Re: [HACKERS] Logical Replication WIP
On Tue, Aug 9, 2016 at 5:13 PM, Craig Ringer wrote: > On 9 August 2016 at 15:59, Masahiko Sawada wrote: >> The logical replication launcher process and the apply process are >> implemented as a bgworker. Isn't it better to have them as an auxiliary >> process like checkpointer, wal writer? > > I don't think so. The checkpointer, walwriter, autovacuum, etc predate > bgworkers. I strongly suspect that if they were to be implemented now they'd > use bgworkers. +1. We could always get them now under the umbrella of the bgworker infrastructure if this cleans up some code duplication. -- Michael
Re: [HACKERS] Logical Replication WIP
On 9 August 2016 at 15:59, Masahiko Sawada wrote: > > The logical replication launcher process and the apply process are > implemented as a bgworker. Isn't it better to have them as an auxiliary > process like checkpointer, wal writer? > I don't think so. The checkpointer, walwriter, autovacuum, etc predate bgworkers. I strongly suspect that if they were to be implemented now they'd use bgworkers. Now, perhaps we want a new bgworker "kind" for system workers or some other minor tweaks. But basically I think bgworkers are exactly what we should be using here. > IMO the number of logical replication connections should not be > limited by max_worker_processes. > Well, they *are* worker processes... but I take your point, that that setting has been "number of bgworkers the user can run" and it might not be expected that logical replication would use the same space. max_worker_processes isn't just a limit, it controls how many shmem slots we allocate. I guess we could have a separate max_logical_workers or something, but I'm inclined to think that adds complexity without really making things any nicer. We'd just add them together to decide how many shmem slots to allocate and we'd have to keep track of how many slots were used by which types of backend. Or create a near-duplicate of the bgworker facility for logical rep. Sure, you can go deeper down the rabbit hole here and say that we need to add bgworker "categories" with reserved pools of worker slots for each category. But do we really need that? max_connections includes everything, both system and user backends. It's not like we don't do this elsewhere. It's at worst a mild wart. The only argument I can see for not using bgworkers is for the supervisor worker. It's a singleton that launches the per-database workers, and arguably is a job that the postmaster could do better. The current design there stems from its origins as an extension. Maybe worker management could be simplified a bit as a result.
I'd really rather not invent yet another new and mostly duplicate category of custom workers to achieve that though. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
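The bookkeeping Craig sketches — sum two settings to size one shared slot pool, then track how many slots each category of backend is using — might look like this (illustrative Python; `max_logical_workers` is the hypothetical GUC from the discussion, not an existing setting):

```python
# Illustrative sketch of per-category bgworker slot accounting. Both
# the class and the max_logical_workers setting are hypothetical; the
# real implementation would live in shared memory in C.

class WorkerSlotPool:
    def __init__(self, max_worker_processes, max_logical_workers):
        self.limits = {
            "user": max_worker_processes,
            "logical_rep": max_logical_workers,
        }
        # Shmem-style slots are allocated once, sized from the sum of
        # both settings, as Craig describes.
        self.total_slots = max_worker_processes + max_logical_workers
        self.used = {"user": 0, "logical_rep": 0}

    def start_worker(self, category):
        # Enforce the per-category share rather than just the total.
        if self.used[category] >= self.limits[category]:
            return False
        self.used[category] += 1
        return True

pool = WorkerSlotPool(max_worker_processes=2, max_logical_workers=1)
ok1 = pool.start_worker("logical_rep")
ok2 = pool.start_worker("logical_rep")  # rejected: logical share used up
```

This is exactly the "rabbit hole" trade-off in the thread: the extra category tracking is what a separate setting costs, versus letting logical-rep workers share the plain max_worker_processes pool.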
Re: [HACKERS] Slowness of extended protocol
On Mon, Aug 8, 2016 at 6:44 PM, Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > > The problem with "empty statement name" is statements with empty name can >> be reused (for instance, for batch insert executions), so the server side >> has to do a defensive copy (it cannot predict how many times this unnamed >> statement will be used). >> > That seems right. Also, part of the point here is to reduce the number of protocol messages >> needed in order to send a parameterized query - not to have to do >> Parse/Describe/Bind/Execute/Sync - since part of the slowdown comes from >> that (although I'm not sure how much). Your proposal keeps the 5 messages. >> > > As my benchmarks show, notable overhead is due to "defensive copying of > the execution plan". So I would measure first, and only then would claim > where the overhead is. > > Some more profiling is required to tell which part is a main time consumer. > Tom also pointed to the caching as the main culprit, although there seems to be some message-related overhead as well. It seems that things like process title changes may be easily fixable - do we really need a process title change on every message (as opposed to, say, Execute messages only)? This profiling and optimization effort can happen in parallel to the discussion on what to do with the execution plan caching. > Technically speaking, I would prefer to have a more real-life looking > example instead of SELECT 1. > Do you have something in mind? > For instance, for more complex queries, "Parse/Plan" could cost much more > than we shave by adding "a special non-cached statement name" or by > reducing "5 messages into 1". > > There's a side problem: describe message requires full roundtrip since > there are cases when client needs to know how to encode parameters. > Additional roundtrip hurts much worse than just an additional message that > is pipelined (e.g. sent in the same TCP packet).
> This is true, but there doesn't seem to be anything we can do about it - if your usage relies on describe to get information on parameters (as opposed to results), you're stuck with an extra roundtrip no matter what. So it seems you have to use the extended protocol anyway. FYI in Npgsql specifically describe isn't used to get any knowledge about parameters - users must populate the correct parameters or query execution fails. > Note: it is quite easy to invent a name that is not yet used in the wild, >>> so it is safe. >>> >> >> That's problematic, how do you know what's being used in the wild and >> what isn't? The protocol has a specification, it's very problematic to get >> up one day and to change it retroactively. But again, the empty statement >> seems to already be there for that. >> > > Empty statement has different semantics, and it is wildly used. > For instance, pgjdbc uses unnamed statements a lot. > On the other hand, statement name of "!pq@#!@#42" is rather safe to use > as a special case. > Note: statement names are not typically created by humans (statement name > is not a SQL), and very little PG clients do support named statements. > IMHO this simply isn't the kind of thing one does in a serious protocol of a widely-used product, others seem to agree on this. > Sir, any new SQL keyword is what you call a "retroactively defining special semantics". > It's obvious that very little current clients do use named server-prepared statements. > Statement names are not something that is provided by the end-user in a web page, so it is not a rocket science to come up with a > statement name that is both short and "never ever used in the wild". The difference is that before the new SQL keyword is added, trying to use it results in an error. What you're proposing is taking something that already works in one way and changing its behavior. 
> Shay, can you come up with a real-life use case when those "I claim the statement will be used only once" statements would indeed improve performance? > Or, to put it in another way: "do you have a real-life case when simple protocol is faster than extended protocol with statement reuse"? > I do have a couple of java applications and it turns out there's a huge win from reusing server-prepared statements. > There's a problem that "generic plan after 5th execution might be much worse than a specific one", however those statements are not frequent and I just put hints in the SQL (limit 0, +0, CTE, those kinds of things). I maintain Npgsql, which is a general-purpose database driver and not a specific application. The general .NET database API (ADO.NET), like most (all?) DB APIs, allows users to send a simple statement to the database (ExecuteReader, ExecuteScalar, ExecuteNonQuery). Every time a user uses these APIs without preparing, they pay a performance penalty because the extended protocol has more overhead than the simple one. Obviously smart use of prepared statements is a great idea, but it doesn't work everywhere. There are many scenarios where
Re: [HACKERS] Logical Replication WIP
On Sat, Aug 6, 2016 at 2:04 AM, Simon Riggs wrote: > On 5 August 2016 at 16:22, Andres Freund wrote: >> On 2016-08-05 17:00:13 +0200, Petr Jelinek wrote: >>> as promised here is WIP version of logical replication patch. >> >> Yay! > > Yay2 > Thank you for working on this! I've applied these patches to current HEAD, but got the following error. libpqwalreceiver.c:48: error: redefinition of typedef ‘WalReceiverConnHandle’ ../../../../src/include/replication/walreceiver.h:137: note: previous declaration of ‘WalReceiverConnHandle’ was here make[2]: *** [libpqwalreceiver.o] Error 1 make[1]: *** [install-backend/replication/libpqwalreceiver-recurse] Error 2 make: *** [install-src-recurse] Error 2 After fixing this issue with the attached patch, I used logical replication a little. Some random comments and questions. The logical replication launcher process and the apply process are implemented as a bgworker. Isn't it better to have them as an auxiliary process like checkpointer, wal writer? IMO the number of logical replication connections should not be limited by max_worker_processes. -- We need to set the publication up with at least the CREATE PUBLICATION and ALTER PUBLICATION commands. Can we make it possible for CREATE PUBLICATION to define tables as well? For example, CREATE PUBLICATION mypub [ TABLE table_name, ...] [WITH options] -- This patch cannot drop the subscription. =# drop subscription sub; ERROR: unrecognized object class: 6102 -- +/*- + * + * proto.c + * logical replication protocol functions + * + * Copyright (c) 2015, PostgreSQL Global Development Group + * The copyright dates of the added files are old. And this patch has some whitespace problems.
Please run "git show --check" or "git diff origin/master --check" Regards, -- Masahiko Sawada diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c index 94648c7..e4aaba4 100644 --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c @@ -40,12 +40,12 @@ PG_MODULE_MAGIC; -typedef struct WalReceiverConnHandle { +struct WalReceiverConnHandle { /* Current connection to the primary, if any */ PGconn *streamConn; /* Buffer for currently read records */ char *recvBuf; -} WalReceiverConnHandle; +}; PGDLLEXPORT WalReceiverConnHandle *_PG_walreceirver_conn_init(WalReceiverConnAPI *wrcapi);
Re: [HACKERS] Declarative partitioning
What strikes me as odd about these patches is that RelOptInfo has remained unmodified. For a base partitioned table, I would expect it to be marked as partitioned, maybe indicating the partitioning scheme. Instead of that, I see that the code directly deals with PartitionDesc and PartitionKey, which are part of Relation. It uses other structures like PartitionKeyExecInfo. I don't have any specific observation as to why we need such information in RelOptInfo, but the lack of it makes me uncomfortable. It could be because the inheritance code does not require any mark in RelOptInfo and your patch re-uses the inheritance code. I am not sure. On Tue, Aug 9, 2016 at 9:17 AM, Amit Langote wrote: > On 2016/08/09 6:02, Robert Haas wrote: > > On Mon, Aug 8, 2016 at 1:40 AM, Amit Langote > > wrote: > >>> +1, if we could do it. It will need a change in the way Amit's patch > stores > >>> partitioning scheme in PartitionDesc. > >> > >> Okay, I will try to implement this in the next version of the patch. > >> > >> One thing that comes to mind is what if a user wants to apply hash > >> operator class equality to a list partitioned key by specifying a hash > >> operator class for the corresponding column. In that case, we would not > >> have the ordering procedure with a hash operator class, hence any > >> ordering based optimization becomes impossible to implement. The > current > >> patch rejects a column for the partition key if its type does not have a > btree > >> operator class for both range and list methods, so this issue doesn't > >> exist, however it could be seen as a limitation. > > > > Yes, I think you should expect careful scrutiny of that issue. It > > seems clear to me that range partitioning requires a btree opclass, > > that hash partitioning requires a hash opclass, and that list > > partitioning requires at least one of those things. 
It would probably > > be reasonable to pick one or the other and insist that list > > partitioning always requires exactly that, but I can't see how it's > > reasonable to insist that you must have both types of opclass for any > > type of partitioning. > > So because we intend to implement optimizations for list partition > metadata that presuppose the existence of a corresponding btree operator class, > we should just always require the user to specify one (or error out if the user > doesn't specify and a default one doesn't exist). That way, we explicitly > do not support specifying a hash equality operator for list partitioning. > > >> So, we have 3 choices for the internal representation of list > partitions: > >> > >> Choice 1 (the current approach): Load them in the same order as they > are > >> found in the partition catalog: > >> > >> Table 1: p1 {'b', 'f'}, p2 {'c', 'd'}, p3 {'a', 'e'} > >> Table 2: p1 {'c', 'd'}, p2 {'a', 'e'}, p3 {'b', 'f'} > >> > >> In this case, mismatch on the first list would make the two tables > >> incompatibly partitioned, whereas they really aren't incompatible. > > > > Such a limitation seems clearly unacceptable. We absolutely must be > > able to match up compatible partitioning schemes without getting > > confused by little details like the order of the partitions. > > Agreed. Will change my patch to adopt the below method. > > >> Choice 2: Representation with 2 arrays: > >> > >> Table 1: ['a', 'b', 'c', 'd', 'e', 'f'], [3, 1, 2, 2, 3, 1] > >> Table 2: ['a', 'b', 'c', 'd', 'e', 'f'], [2, 3, 1, 1, 2, 3] > >> > >> It still doesn't help the case of pairwise joins because it's hard to > tell > >> which value belongs to which partition (the 2nd array carries the > original > >> partition numbers). Although it might still work for tuple-routing. > > > > It's very good for tuple routing. It can also be used to match up > > partitions for pairwise joins. Compare the first arrays. If they are > > unequal, stop. 
Else, compare the second arrays, incrementally > > building a mapping between them and returning false if the mapping > > turns out to be non-bijective. For example, in this case, we look at > > index 0 and decide that 3 -> 2. We look at index 1 and decide 1 -> 3. > > We look at index 2 and decide 2 -> 1. We look at index 4 and find > > that we already have a mapping for 2, but it's compatible because we > > need 2 -> 1 and that's what is already there. Similarly for the > > remaining entries. This is really a pretty easy loop to write and it > > should run very quickly. > > I see, it does make sense to try to implement this way. > > >> Choice 3: Order all lists' elements for each list individually and then > >> order the lists themselves on their first values: > >> > >> Table 1: p3 {'a', 'e'}, p2 {'b', 'f'}, p1 {'c', 'd'} > >> Table 2: p2 {'a', 'e'}, p1 {'b', 'f'}, p3 {'c', 'd'} > >> > >> This representation makes pairing partitions for pairwise joining > >> convenient but for tuple-routing we still need to visit each partition > in > >> the worst case. > > > > I think this is clearly not good
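The matching loop Robert sketches for the "Choice 2" representation — compare the value arrays, then incrementally build a one-to-one mapping between partition indexes — might look like this (illustrative Python, not the patch's C):

```python
# Check whether two list-partitioned tables are compatibly partitioned,
# using the two-array representation from the thread: a sorted array of
# list values plus a parallel array of partition indexes. Returns the
# partition mapping if it is bijective, or None if the tables don't match.

def match_partitions(values1, parts1, values2, parts2):
    if values1 != values2:          # the value sets must match exactly
        return None
    fwd, rev = {}, {}
    for p1, p2 in zip(parts1, parts2):
        # Incrementally build the mapping, rejecting any conflict in
        # either direction so the result is one-to-one (bijective).
        if fwd.setdefault(p1, p2) != p2 or rev.setdefault(p2, p1) != p1:
            return None
    return fwd

# Tables 1 and 2 from the example upthread:
mapping = match_partitions(
    ["a", "b", "c", "d", "e", "f"], [3, 1, 2, 2, 3, 1],
    ["a", "b", "c", "d", "e", "f"], [2, 3, 1, 1, 2, 3])
```

Walking the example: index 0 establishes 3 -> 2, index 1 gives 1 -> 3, index 2 gives 2 -> 1, and the remaining entries only confirm the mapping already built, so the tables pair up for a partitionwise join.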