Re: [HACKERS] Review of GetUserId() Usage
Jeevan, * Jeevan Chalke (jeevan.cha...@gmail.com) wrote: The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed I have reviewed the patch. The patch is in excellent shape and does what is expected and discussed. Also the changes are straightforward. Great, thanks! So it looks good to go in. However I have one question: What is the motive for splitting the function return value from SIGNAL_BACKEND_NOPERMISSION into SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION? Is that required for some other upcoming patches OR just for simplicity? That was done to provide a more useful error message to the user. It's not strictly required, I'll grant, but I don't see a reason to avoid doing it either. Currently, we have a combined error for both, which is simply split into two. No issue as such, just curious as it does not go well with the subject. It seemed reasonable to me to improve the clarity of the error messages. You can mark this as ready for committer. Done. I've also claimed it as a committer and, barring objections, will go ahead and push it soonish. Thanks! Stephen
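For context, the point of the split Jeevan asks about is that callers can emit a distinct error for each failure case. A hypothetical sketch (simplified names and messages, not the committed code):

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the return codes discussed above; the real
 * definitions live in the backend and may differ in detail. */
enum SignalBackendResult
{
	SIGNAL_BACKEND_SUCCESS,
	SIGNAL_BACKEND_ERROR,
	SIGNAL_BACKEND_NOPERMISSION,	/* caller lacks the required role */
	SIGNAL_BACKEND_NOSUPERUSER		/* target backend belongs to a superuser */
};

/* With two distinct codes, the caller can report precisely which check
 * failed, instead of one combined "not superuser or same role" message. */
static const char *
signal_error_message(enum SignalBackendResult r)
{
	switch (r)
	{
		case SIGNAL_BACKEND_NOSUPERUSER:
			return "must be a superuser to signal a superuser's process";
		case SIGNAL_BACKEND_NOPERMISSION:
			return "must be a member of the role whose process is being signaled";
		default:
			return "";
	}
}
```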
Re: [HACKERS] logical column ordering
On 2/27/15 2:37 PM, Gavin Flower wrote: Might be an idea to allow lists of columns and their starting position. ALTER TABLE customer ALTER COLUMN (job_id, type_id, account_num) SET ORDER 3; I would certainly want something along those lines because it would be *way* less verbose (and presumably a lot faster) than a slew of ALTER TABLE statements. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Invent a memory context reset/delete callback mechanism.
Jeff Davis pg...@j-davis.com writes: On Fri, 2015-02-27 at 22:17 +0000, Tom Lane wrote: In passing, per discussion, rearrange some boolean fields in struct MemoryContextData so as to avoid wasted padding space. For safety, this requires making allowInCritSection's existence unconditional; but I think that's a better approach than what was there anyway. I notice that this uses the bytes in MemoryContextData that I was hoping to use for the memory accounting patch (for memory-bounded HashAgg). Meh. I thought Andres' complaint about that was unfounded anyway ;-). But I don't really see why a memory accounting patch should be expected to have zero footprint. regards, tom lane
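The padding being discussed can be illustrated with a toy struct. This is a hypothetical sketch (not the actual MemoryContextData layout), assuming a typical ABI where bools are 1 byte and pointers need pointer-sized alignment:

```c
#include <assert.h>
#include <stdbool.h>

/* Bools scattered between pointers: each bool is followed by enough
 * padding to realign the next pointer member. */
struct ScatteredCtx
{
	void   *methods;
	bool	isReset;
	void   *parent;
	bool	allowInCritSection;
	void   *firstchild;
};

/* The same fields with the bools grouped together: both bools share a
 * single alignment slot, so the struct shrinks. */
struct PackedCtx
{
	void   *methods;
	void   *parent;
	void   *firstchild;
	bool	isReset;
	bool	allowInCritSection;
};
```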
Re: [HACKERS] logical column ordering
On 28/02/15 18:34, Jim Nasby wrote: On 2/27/15 2:49 PM, Alvaro Herrera wrote: Tomas Vondra wrote: 1) change the order of columns in SELECT * - display columns so that related ones are grouped together (irrespective of whether they were added later, etc.) FWIW, I find the ordering more important for things like \d than SELECT *. Hey, after we get this done the next step is allowing different logical ordering for different uses! ;P [...] How about CREATE COLUMN SELECTION my_col_sel (a, g, b, e) FROM TABLE my_table;
Notes:
1. The column names must match those of the table
2. The COLUMN SELECTION is associated with the specified table
3. If a column gets renamed, then the COLUMN SELECTION effectively gets updated to use the new column name (this can probably be done automatically, by virtue of storing references to the appropriate column definitions)
4. Allow fewer columns in the COLUMN SELECTION than the original table
5. Allow the same column to be duplicated (trivial, simply don't raise an error for duplicates)
6. Allow the COLUMN SELECTION name to appear instead of the list of columns after the SELECT key word (SELECT COLUMN SELECTION my_col_sel FROM my_table WHERE ... - must match the table in the FROM clause)
If several tables are defined in the FROM clause, and 2 different tables have COLUMN SELECTIONs with identical names, then the COLUMN SELECTION names in the SELECT must be prefixed with either the table name or its alias. Cheers, Gavin
Re: [HACKERS] Docs about shared memory
Thank you, it's ideal for me :) 2015-02-27 15:21 GMT+03:00 Michael Paquier michael.paqu...@gmail.com: On Fri, Feb 27, 2015 at 9:13 PM, Vadim Gribanov gribanov.vadi...@gmail.com wrote: Hi! Where can I find an explanation of how postgresql works with shared memory? Perhaps this video of Bruce's presentation about the subject would help: https://www.youtube.com/watch?v=Nwk-UfjlUn8 -- Michael -- Best regards Vadim Gribanov Linkedin: https://www.linkedin.com/in/yoihito Skype: v1mk550 Github: https://github.com/yoihito
Re: [HACKERS] logical column ordering
On 28/02/15 12:21, Josh Berkus wrote: On 02/27/2015 03:06 PM, Tomas Vondra wrote: On 27.2.2015 23:48, Josh Berkus wrote: Actually, I'm going to go back on what I said. We need an API for physical column reordering, even if it's just pg_ functions. The reason is that we want to enable people writing their own physical column re-ordering tools, so that our users can figure out for us what the best reordering algorithm is. I doubt that. For example, do you realize you can only do that while the table is completely empty, and in that case you can just do a CREATE TABLE with the proper order? Well, you could recreate the table as the de-facto API, although as you point out below that still requires new syntax. But I was thinking of something which would re-write the table, just like ADD COLUMN x DEFAULT '' does now. I also doubt the users will be able to optimize the order better than users, who usually have no idea of how this actually works internally. We have a lot of power users, including a lot of the people on this mailing list. Among the things we don't know about ordering optimization: * How important is it for index performance to keep key columns adjacent? * How important is it to pack values < 4 bytes, as opposed to packing values which are non-nullable? * How important is it to pack values of the same size, as opposed to packing values which are non-nullable? But if we want to allow users to define this, I'd say let's make that part of CREATE TABLE, i.e. the order of columns defines logical order, and you use something like 'AFTER' to specify physical order. CREATE TABLE test ( a INT AFTER b, -- attlognum = 1, attphysnum = 2 b INT -- attlognum = 2, attphysnum = 1 ); It might get tricky because of cycles, though. It would be a lot easier to allow the user to specify a scalar. CREATE TABLE test ( a INT NOT NULL WITH ( lognum 1, physnum 2 ) b INT WITH ( lognum 2, physnum 1 ) ... and just throw an error if the user creates duplicates or gaps.
I think gaps should be okay. Remember BASIC? Line numbers tended to be in 10's so you could easily slot new lines in between the existing ones - essential when using the Teletype input/output device. Though the difference here is that you would NOT want to remember the original order numbers (at least I don't think that would be worth the coding effort or space). However, the key is the actual order, not the numbering. However, that might require a WARNING message to say that the columns would be essentially renumbered from 1 onwards when the table was actually created - if gaps had been left. Cheers, Gavin
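The BASIC-style scheme Gavin describes - accept sparse order numbers, then renumber densely when the table is actually created - could be sketched like this (a hypothetical helper, not part of any proposed patch):

```c
#include <assert.h>
#include <stdlib.h>

/* A column and the (possibly sparse) logical position the user asked for. */
typedef struct
{
	const char *name;
	int			lognum;		/* user-specified, gaps allowed, e.g. 10, 20, 35 */
} ColumnOrder;

static int
cmp_lognum(const void *a, const void *b)
{
	return ((const ColumnOrder *) a)->lognum - ((const ColumnOrder *) b)->lognum;
}

/* Sort by the requested numbers, then renumber densely from 1 onwards -
 * only the relative order matters, not the numbers themselves. */
static void
densify_order(ColumnOrder *cols, int ncols)
{
	qsort(cols, ncols, sizeof(ColumnOrder), cmp_lognum);
	for (int i = 0; i < ncols; i++)
		cols[i].lognum = i + 1;
}
```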
Re: [HACKERS] Re: [COMMITTERS] pgsql: Invent a memory context reset/delete callback mechanism.
* Tom Lane (t...@sss.pgh.pa.us) wrote: Jeff Davis pg...@j-davis.com writes: On Fri, 2015-02-27 at 22:17 +0000, Tom Lane wrote: In passing, per discussion, rearrange some boolean fields in struct MemoryContextData so as to avoid wasted padding space. For safety, this requires making allowInCritSection's existence unconditional; but I think that's a better approach than what was there anyway. I notice that this uses the bytes in MemoryContextData that I was hoping to use for the memory accounting patch (for memory-bounded HashAgg). Meh. I thought Andres' complaint about that was unfounded anyway ;-). But I don't really see why a memory accounting patch should be expected to have zero footprint. For my 2c, I agree that it's a bit ugly to waste space due to padding, but I'm all about improving the memory accounting and would feel that's well worth having a slightly larger MemoryContextData. In other words, I agree with Tom that it doesn't need to have a zero footprint, but disagree about wasting space due to padding. :D Thanks! Stephen
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Josh Berkus (j...@agliodbs.com) wrote: On 02/27/2015 04:41 PM, Stephen Frost wrote: we can make a copy of pg_hba.conf somewhere when the postmaster starts or when it is reloaded. Please see my reply to Tom. There's no trivial way to reach into the postmaster from a backend - but we do get a copy of whatever the postmaster had when we forked, and the postmaster only reloads pg_hba.conf on a sighup and that sighup is passed down to the children, so we simply need to also reload the pg_hba.conf in the children when they get a sighup. That's how postgresql.conf is handled, which is what pg_settings is based off of, and I believe is the behavior folks are really looking for. I thought the patch in question just implemented reading the file from disk, and nothing else? Speaking for my uses, I would rather have just that for 9.5 than wait for something more sophisticated in 9.6. From my perspective, at least, the differences we're talking about are not enough to raise this to a 9.5-vs-9.6 issue. I can see the use cases for both (which is exactly why I suggested providing both). Having one would be better than nothing, but I foretell lots of subsequent complaints along the lines of "everything looks right according to pg_hba_config, but I'm getting this error!" Now, perhaps that's the right approach to go for 9.5 since it'd more-or-less force our hand to deal with it in 9.6 properly, but, personally, I'd be happier if we moved forward with providing both because everyone agrees that it makes sense rather than waiting to see if user complaints force our hand. Thanks! Stephen
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-28 3:12 GMT+01:00 Stephen Frost sfr...@snowman.net: * Josh Berkus (j...@agliodbs.com) wrote: On 02/27/2015 04:41 PM, Stephen Frost wrote: we can make a copy of pg_hba.conf somewhere when the postmaster starts or when it is reloaded. Please see my reply to Tom. There's no trivial way to reach into the postmaster from a backend - but we do get a copy of whatever the postmaster had when we forked, and the postmaster only reloads pg_hba.conf on a sighup and that sighup is passed down to the children, so we simply need to also reload the pg_hba.conf in the children when they get a sighup. That's how postgresql.conf is handled, which is what pg_settings is based off of, and I believe is the behavior folks are really looking for. I thought the patch in question just implemented reading the file from disk, and nothing else? Speaking for my uses, I would rather have just that for 9.5 than wait for something more sophisticated in 9.6. From my perspective, at least, the differences we're talking about are not enough to raise this to a 9.5-vs-9.6 issue. I can see the use cases for both (which is exactly why I suggested providing both). Having one would be better than nothing, but I foretell lots of subsequent complaints along the lines of "everything looks right according to pg_hba_config, but I'm getting this error!" Now, perhaps that's the right approach to go for 9.5 since it'd more-or-less force our hand to deal with it in 9.6 properly, but, personally, I'd be happier if we moved forward with providing both because everyone agrees that it makes sense rather than waiting to see if user complaints force our hand. +1 Probably we can implement the simple loading of pg_hba.conf and its transformation to a table early. There is an agreement and no problem there. But if we start to implement some view, then it should be fully functional without potential issues. Regards Pavel Thanks! Stephen
Re: [HACKERS] logical column ordering
On 2/27/15 2:49 PM, Alvaro Herrera wrote: Tomas Vondra wrote: 1) change the order of columns in SELECT * - display columns so that related ones are grouped together (irrespective of whether they were added later, etc.) FWIW, I find the ordering more important for things like \d than SELECT *. Hey, after we get this done the next step is allowing different logical ordering for different uses! ;P - keep columns synced with COPY - requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _) Not sure about the ORDER # syntax. In ALTER ENUM we have AFTER value and such .. I'd consider that instead. +1. See also Gavin's suggestion of ALTER COLUMN (a, b, c) SET ORDER ... 2) optimization of physical order (efficient storage / tuple deforming) - more efficient order for storage (deforming) - may be done manually by reordering columns in CREATE TABLE - should be done automatically (no user interface required) A large part of it can be done automatically: for instance, not-nullable fixed-length types ought to come first, because that enables the attcacheoff optimization. I would think that eliminating wasted space due to alignment would be more important than optimizing attcacheoff, at least for a database that doesn't fit in memory. Even if it does fit in memory I suspect memory bandwidth is more important than clock cycles. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
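Jim's point about alignment waste can be quantified with a toy model of row layout (hypothetical helper functions; real heap tuples also carry a header and null bitmap):

```c
#include <assert.h>

/* Round off up to the next multiple of align (align is a power of two). */
static int
align_up(int off, int align)
{
	return (off + align - 1) & ~(align - 1);
}

/* Width of the data area for columns with the given sizes and alignment
 * requirements, laid out in the given order. Padding inserted before each
 * column is exactly the "wasted space due to alignment". */
static int
row_width(const int sizes[], const int aligns[], int n)
{
	int			off = 0;

	for (int i = 0; i < n; i++)
	{
		off = align_up(off, aligns[i]);		/* alignment padding */
		off += sizes[i];
	}
	return off;
}
```

For example, declaring (int4, int8, int4) in that order inserts 4 bytes of padding before the int8, while ordering the widest column first wastes nothing.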
Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: On Wed, Feb 18, 2015 at 12:09 AM, Robert Haas robertmh...@gmail.com wrote: I think it's right the way it is. The parser constructs a VacuumStmt for either a VACUUM or an ANALYZE command. That statement is checking that if you are doing an ANALYZE, you can't specify FULL or FREEZE. That makes sense, because there is no ANALYZE FULL or ANALYZE FREEZE command. Yes, the existing assertion is right. My point is that it is strange that we do not check the values of the freeze parameters for an ANALYZE query, which should be set to -1 all the time. If this is thought as not worth checking, I'll drop this patch and my concerns. I'm trying to wrap my head around the reasoning for this also and not sure I'm following. In general, I don't think we protect all that hard against functions being called with tokens that aren't allowed by the parser. So, basically, this feels like it's not really the right place for these checks, and if there is an existing problem then it's probably with the grammar... Does that make sense? Thanks! Stephen
Re: [HACKERS] Bug in pg_dump
On Sat, Feb 28, 2015 at 12:29 AM, Gilles Darold gilles.dar...@dalibo.com wrote: On 26/02/2015 12:41, Michael Paquier wrote: On Wed, Feb 25, 2015 at 7:46 AM, Gilles Darold gilles.dar...@dalibo.com wrote: This is a far better patch and the test to export/import of the postgis_topology extension works great for me. Thanks for the work. Attached is a patch that uses an even better approach by querying only once all the FK dependencies of tables in extensions whose data is registered as dumpable by getExtensionMembership(). Could you check that it works correctly? My test case passes but an extra check would be nice. The patch footprint is pretty low so we may be able to backport this patch easily. Works great too. I agree that this patch should be backported, but I think we need to warn admins in the release notes that their import/restore scripts may be broken. Of course it makes sense to mention that in the release notes, this behavior of pg_dump having been broken since the creation of extensions. Since it is working on your side as well, let's put it in the hands of a committer then, and I am marking it as such on the CF app. The test case I sent on this thread can be used to reproduce the problem easily with TAP tests... -- Michael
Re: [HACKERS] logical column ordering
Even if it does fit in memory I suspect memory bandwidth is more important than clock cycles. http://people.freebsd.org/~lstewart/articles/cpumemory.pdf This paper is old but the ratios should still be pretty accurate. Main memory is 240 clock cycles away and L1d is only 3. If the experiments in this paper still hold true, loading the 8K block into L1d is far more expensive than the CPU processing done once the block is in cache. When one adds in NUMA to the contention on this shared resource, it's not that hard for a 40-core machine to starve for memory bandwidth, and for cores to sit idle waiting for main memory. Eliminating wasted space seems far more important even when everything could fit in memory already.
Re: [HACKERS] deparsing utility commands
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: Thanks. I have adjusted the code to use ObjectAddress in all functions called by ProcessUtilitySlow; the patch is a bit tedious but seems pretty reasonable to me. As I said, there is quite a lot of churn. Thanks for doing this! Also attached are some patches to make some commands return info about a secondary object regarding the execution, which can be used to know more about what's being executed: 1) ALTER DOMAIN ADD CONSTRAINT returns (in addition to the address of the domain) the address of the new constraint. 2) ALTER OBJECT SET SCHEMA returns (in addition to the address of the object) the address of the old schema. 3) ALTER EXTENSION ADD/DROP OBJECT returns (in addition to the address of the extension) the address of the added/dropped object. Also attached is the part of these patches that have ALTER TABLE return the AttrNumber and OID of the affected object. I think this is all mostly uninteresting, but thought I'd throw it out for public review before pushing, just in case. I have fixed many issues in the rest of the branch meanwhile, and it's looking much better IMO, but much work remains there and will leave them for later. Glad to hear that things are progressing well! Subject: [PATCH 1/7] Have RENAME routines return ObjectAddress rather than OID [...] /* * renameatt - changes the name of an attribute in a relation + * + * The returned ObjectAddress is that of the pg_class address of the column. */ -Oid +ObjectAddress renameatt(RenameStmt *stmt) This comment didn't really come across sensibly to me. I'd suggest instead: The ObjectAddress returned is that of the column which was renamed. Or something along those lines instead? The rest of this patch looked fine to me. Subject: [PATCH 2/7] deparse/core: have COMMENT return an ObjectAddress, not OID [...] Looked fine to me. Subject: [PATCH 3/7] deparse/core: have ALTER TABLE return table OID and other data [...]
@@ -1844,7 +1844,7 @@ heap_drop_with_catalog(Oid relid) /* * Store a default expression for column attnum of relation rel. */ -void +Oid StoreAttrDefault(Relation rel, AttrNumber attnum, Node *expr, bool is_internal) { @@ -1949,6 +1949,8 @@ StoreAttrDefault(Relation rel, AttrNumber attnum, */ InvokeObjectPostCreateHookArg(AttrDefaultRelationId, RelationGetRelid(rel), attnum, is_internal); + + return attrdefOid; } Why isn't this returning an ObjectAddress too? It even builds one up for you in defobject. Also, the comment should be updated to reflect that it's now returning something. /* @@ -1956,8 +1958,10 @@ StoreAttrDefault(Relation rel, AttrNumber attnum, * * Caller is responsible for updating the count of constraints * in the pg_class entry for the relation. + * + * The OID of the new constraint is returned. */ -static void +static Oid StoreRelCheck(Relation rel, char *ccname, Node *expr, bool is_validated, bool is_local, int inhcount, bool is_no_inherit, bool is_internal) @@ -1967,6 +1971,7 @@ StoreRelCheck(Relation rel, char *ccname, Node *expr, List *varList; int keycount; int16 *attNos; + Oid constrOid; /* * Flatten expression to string form for storage. @@ -2018,7 +2023,7 @@ StoreRelCheck(Relation rel, char *ccname, Node *expr, /* * Create the Check Constraint */ - CreateConstraintEntry(ccname, /* Constraint Name */ + constrOid = CreateConstraintEntry(ccname, /* Constraint Name */ RelationGetNamespace(rel), /* namespace */ CONSTRAINT_CHECK, /* Constraint Type */ false,/* Is Deferrable */ @@ -2049,11 +2054,15 @@ StoreRelCheck(Relation rel, char *ccname, Node *expr, pfree(ccbin); pfree(ccsrc); + + return constrOid; } Ditto here? CreateConstraintEntry would have to be updated to return the ObjectAddress that it builds up, but there aren't all that many callers of it.. /* * Store defaults and constraints (passed as a list of CookedConstraint). * + * Each CookedConstraint struct is modified to store the new catalog tuple OID. 
+ * * NOTE: only pre-cooked expressions will be passed this way, which is to * say expressions inherited from an existing relation. Newly parsed * expressions can be added later, by direct calls to StoreAttrDefault @@ -2065,7 +2074,7 @@ StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal) int numchecks = 0; ListCell *lc;
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-28 1:41 GMT+01:00 Stephen Frost sfr...@snowman.net: Pavel, * Pavel Stehule (pavel.steh...@gmail.com) wrote: 2015-02-27 22:26 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Stephen Frost sfr...@snowman.net writes: Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. I think that's going to be a lot harder than you realize, and it will have undesirable security implications, in that whatever you do to expose the postmaster's internal state to backends will also make it visible to other onlookers; not to mention probably adding new failure modes. we can make a copy of pg_hba.conf somewhere when the postmaster starts or when it is reloaded. Please see my reply to Tom. There's no trivial way to reach into the postmaster from a backend - but we do get a copy of whatever the postmaster had when we forked, and the postmaster only reloads pg_hba.conf on a sighup and that sighup is passed down to the children, so we simply need to also reload the pg_hba.conf in the children when they get a sighup. That's how postgresql.conf is handled, which is what pg_settings is based off of, and I believe is the behavior folks are really looking for. It makes sense to me too. Pavel Thanks, Stephen
Re: [HACKERS] Improving RLS qual pushdown
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Attached is a patch that does that, allowing restriction clause quals to be pushed down into subquery RTEs if they contain leaky functions, provided that the arglists of those leaky functions contain no Vars (such Vars would necessarily refer to output columns of the subquery, which is the data that must not be leaked). --- 1982,1989 * 2. If unsafeVolatile is set, the qual must not contain any volatile * functions. * ! * 3. If unsafeLeaky is set, the qual must not contain any leaky functions ! * that might reveal values from the subquery as side effects. I'd probably extend this comment to note that the way we figure that out is by looking for Vars being passed to non-leakproof functions. --- 1318,1347 } /* ! * Check clauses for non-leakproof functions that might leak Vars */ Might leak data in Vars (or something along those lines...) /* ! * contain_leaked_vars ! * Recursively scan a clause to discover whether it contains any Var ! * nodes (of the current query level) that are passed as arguments to ! * leaky functions. * ! * Returns true if any function call with side effects may be present in the ! * clause and that function's arguments contain any Var nodes of the current ! * query level. Qualifiers from outside a security_barrier view that leak ! * Var nodes in this way should not be pushed down into the view, lest the ! * contents of tuples intended to be filtered out be revealed via the side ! * effects. */ We don't actually know anything about the function and if it actually has any side effects or not, so it might be better to avoid discussing that here. How about: Returns true if any non-leakproof function is passed data through a Var node, as that function might then leak sensitive information.
Only leakproof functions are allowed to see data prior to the qualifiers which are defined in the security_barrier view and therefore we can only push down qualifiers if they are either leakproof or if they aren't passed any Vars from this query level (and therefore they are not able to see any of the data in the tuple, even if they are pushed down). --- 1408,1428 CoerceViaIO *expr = (CoerceViaIO *) node; Oid funcid; Oid ioparam; + bool leaky; bool varlena; getTypeInputInfo(exprType((Node *) expr->arg), &funcid, &ioparam); ! leaky = !get_func_leakproof(funcid); ! if (!leaky) ! { ! getTypeOutputInfo(expr->resulttype, &funcid, &varlena); ! leaky = !get_func_leakproof(funcid); ! } ! ! if (leaky && contain_var_clause((Node *) expr->arg)) return true; } break; That seems slightly convoluted to me. Why not simply do: bool in_leakproof, out_leakproof; getTypeInputInfo(exprType((Node *) expr->arg), &funcid, &ioparam); in_leakproof = get_func_leakproof(funcid); getTypeOutputInfo(expr->resulttype, &funcid, &varlena); out_leakproof = get_func_leakproof(funcid); if (!(in_leakproof && out_leakproof) && contain_var_clause((Node *) expr->arg)) return true; ... --- 1432,1452 ArrayCoerceExpr *expr = (ArrayCoerceExpr *) node; Oid funcid; Oid ioparam; + bool leaky; bool varlena; getTypeInputInfo(exprType((Node *) expr->arg), &funcid, &ioparam); ! leaky = !get_func_leakproof(funcid); ! ! if (!leaky) ! { ! getTypeOutputInfo(expr->resulttype, &funcid, &varlena); ! leaky = !get_func_leakproof(funcid); ! } ! ! if (leaky && contain_var_clause((Node *) expr->arg)) return true; } break;
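The rule under review - a qual may not be pushed down past a security barrier if a Var reaches a non-leakproof function - can be illustrated on a toy expression tree (hypothetical types, not PostgreSQL's Node machinery):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree for illustration only. */
typedef enum { EXPR_VAR, EXPR_CONST, EXPR_FUNC } ExprKind;

typedef struct Expr
{
	ExprKind	kind;
	bool		leakproof;		/* meaningful for EXPR_FUNC only */
	struct Expr *args[2];		/* up to two children, NULL if absent */
} Expr;

static bool
contains_var(const Expr *e)
{
	if (e == NULL)
		return false;
	if (e->kind == EXPR_VAR)
		return true;
	return contains_var(e->args[0]) || contains_var(e->args[1]);
}

/* True if some non-leakproof function receives a Var (directly or via a
 * subexpression), i.e. the qual could reveal row contents through error
 * messages or other side effects and must not be pushed down. */
static bool
contain_leaked_vars_toy(const Expr *e)
{
	if (e == NULL)
		return false;
	if (e->kind == EXPR_FUNC && !e->leakproof && contains_var(e))
		return true;
	return contain_leaked_vars_toy(e->args[0]) ||
		   contain_leaked_vars_toy(e->args[1]);
}
```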
Re: [HACKERS] Proposal: knowing detail of config files via SQL
Sawada, * Sawada Masahiko (sawada.m...@gmail.com) wrote: The attached patch is the latest version, including the following changes. - This view is available to superuser only - Add sourceline column Alright, first off, to Josh's point- I'm definitely interested in a capability to show where the heck a given config value is coming from. I'd be even happier if there was a way to *force* where config values have to come from, but that ship sailed a year ago or so. Having this be for the files, specifically, seems perfectly reasonable to me. I could see having another function which will report where a currently set GUC variable came from (eg: ALTER ROLE or ALTER DATABASE), but having a function for which config file is this GUC set in seems perfectly reasonable to me. To that point, here's a review of this patch. diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 6df6bf2..f628cb0 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -412,6 +412,11 @@ CREATE RULE pg_settings_n AS GRANT SELECT, UPDATE ON pg_settings TO PUBLIC; +CREATE VIEW pg_file_settings AS + SELECT * FROM pg_show_all_file_settings() AS A; + +REVOKE ALL on pg_file_settings FROM public; While this generally works, the usual expectation is that functions that should be superuser-only have a check in the function rather than depending on the execute privilege. I'm certainly happy to debate the merits of that approach, but for the purposes of this patch, I'd suggest you stick an if (!superuser()) ereport(must be superuser) into the function itself and not worry about setting the correct permissions on it. + ConfigFileVariable *guc; As this ends up being an array, I'd call it guc_array or something along those lines. GUC in PG-land has a pretty specific single-entity meaning.
@@ -344,6 +346,21 @@ ProcessConfigFile(GucContext context) PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT); } + guc_file_variables = (ConfigFileVariable *) + guc_malloc(FATAL, num_guc_file_variables * sizeof(struct ConfigFileVariable)); Uh, perhaps I missed it, but what happens on a reload? Aren't we going to realloc this every time? Seems like we should be doing a guc_malloc() the first time through but doing guc_realloc() after, or we'll leak memory on every reload. + /* + * Apply guc config parameters to guc_file_variable + */ + guc = guc_file_variables; + for (item = head; item; item = item->next, guc++) + { + guc->name = guc_strdup(FATAL, item->name); + guc->value = guc_strdup(FATAL, item->value); + guc->filename = guc_strdup(FATAL, item->filename); + guc->sourceline = item->sourceline; + } Uh, ditto and double-down here. I don't see a great solution other than looping through the previous array and free'ing each of these, since we can't depend on the memory context machinery being up and ready at this point, as I recall. diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 931993c..c69ce66 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3561,9 +3561,17 @@ static const char *const map_old_guc_names[] = { */ static struct config_generic **guc_variables; +/* + * lookup of variables for pg_file_settings view. + */ +static struct ConfigFileVariable *guc_file_variables; + /* Current number of variables contained in the vector */ static int num_guc_variables; +/* Number of file variables */ +static int num_guc_file_variables; + I'd put these together, and add a comment explaining that guc_file_variables is an array of length num_guc_file_variables.. /* + * Return the total number of GUC File variables + */ +int +GetNumConfigFileOptions(void) +{ + return num_guc_file_variables; +} I don't see the point of this function..
+Datum +show_all_file_settings(PG_FUNCTION_ARGS) +{ + FuncCallContext *funcctx; + TupleDesc tupdesc; + int call_cntr; + int max_calls; + AttInMetadata *attinmeta; + MemoryContext oldcontext; + + if (SRF_IS_FIRSTCALL()) + { + funcctx = SRF_FIRSTCALL_INIT(); + + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); + + /* + * need a tuple descriptor representing NUM_PG_SETTINGS_ATTS columns + * of the appropriate types + */ + + tupdesc = CreateTemplateTupleDesc(NUM_PG_FILE_SETTINGS_ATTS, false); + TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name", + TEXTOID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 2, "setting", + TEXTOID, -1, 0); + TupleDescInitEntry(tupdesc,
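The reload leak raised in the review above could be avoided along these lines - a simplified sketch with hypothetical stand-ins for guc_malloc/guc_strdup, and error handling omitted:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the patch's per-file GUC record. */
typedef struct
{
	char   *name;
	char   *value;
	char   *filename;
	int		sourceline;
} ConfigFileVariable;

static ConfigFileVariable *guc_file_variables = NULL;
static int	num_guc_file_variables = 0;

/* Portable strdup substitute (plain malloc for illustration). */
static char *
copy_str(const char *s)
{
	char	   *p = malloc(strlen(s) + 1);

	strcpy(p, s);
	return p;
}

/* On each reload, free the strings duplicated last time and resize the
 * array in place, instead of allocating a fresh array and leaking the
 * previous one on every SIGHUP. */
static void
set_file_variables(const char *names[], const char *values[], int n)
{
	for (int i = 0; i < num_guc_file_variables; i++)
	{
		free(guc_file_variables[i].name);
		free(guc_file_variables[i].value);
		free(guc_file_variables[i].filename);
	}
	guc_file_variables = realloc(guc_file_variables,
								 n * sizeof(ConfigFileVariable));
	for (int i = 0; i < n; i++)
	{
		guc_file_variables[i].name = copy_str(names[i]);
		guc_file_variables[i].value = copy_str(values[i]);
		guc_file_variables[i].filename = copy_str("postgresql.conf");
		guc_file_variables[i].sourceline = i + 1;
	}
	num_guc_file_variables = n;
}
```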
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-28 2:40 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Stephen Frost sfr...@snowman.net writes: I understand that there may be objections to that on the basis that it's work that's (other than for this case) basically useless, Got it in one. I'm also not terribly happy about leaving security-relevant data sitting around in backend memory 100% of the time. We have had bugs that exposed backend memory contents for reading without also granting the ability to execute arbitrary code, so I think doing this does represent a quantifiable decrease in the security of pg_hba.conf. Stephen's proposal changes nothing regarding security. These data are in memory now. The only difference is that these data will be fresh. Regards Pavel regards, tom lane
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Stephen Frost sfr...@snowman.net writes: I understand that there may be objections to that on the basis that it's work that's (other than for this case) basically useless, Got it in one. I'm also not terribly happy about leaving security-relevant data sitting around in backend memory 100% of the time. We have had bugs that exposed backend memory contents for reading without also granting the ability to execute arbitrary code, so I think doing this does represent a quantifiable decrease in the security of pg_hba.conf. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I understand that there may be objections to that on the basis that it's work that's (other than for this case) basically useless, Got it in one. Meh. It's hardly all that difficult and it's not useless if the user wants to look at it. I'm also not terribly happy about leaving security-relevant data sitting around in backend memory 100% of the time. We have had bugs that exposed backend memory contents for reading without also granting the ability to execute arbitrary code, so I think doing this does represent a quantifiable decrease in the security of pg_hba.conf. How is that any different from today? The only time it's not *already* in backend memory is when the user has happened to go through and make a change and used reload (instead of restart) and then it's not so much that the security-sensitive information isn't there, it's just out of date. I'm not entirely against the idea of changing how things are today, but this argument simply doesn't apply to the current state of things. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On 02/27/2015 04:41 PM, Stephen Frost wrote: we can do copy of pg_hba.conf somewhere when postmaster starts or when it is reloaded. Please see my reply to Tom. There's no trivial way to reach into the postmaster from a backend- but we do get a copy of whatever the postmaster had when we forked, and the postmaster only reloads pg_hba.conf on a sighup and that sighup is passed down to the children, so we simply need to also reload the pg_hba.conf in the children when they get a sighup. That's how postgresql.conf is handled, which is what pg_settings is based off of, and I believe is the behavior folks are really looking for. I thought the patch in question just implemented reading the file from disk, and nothing else? Speaking for my uses, I would rather have just that for 9.5 than wait for something more sophisticated in 9.6. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docs about shared memory
On Fri, Feb 27, 2015 at 9:13 PM, Vadim Gribanov gribanov.vadi...@gmail.com wrote: Hi! Where can I find an explanation of how PostgreSQL works with shared memory? Perhaps this video of Bruce's presentation about the subject would help: https://www.youtube.com/watch?v=Nwk-UfjlUn8 -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
All, * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: On 28.1.2015 23:01, Jim Nasby wrote: On 1/28/15 12:46 AM, Haribabu Kommi wrote: Also, what happens if someone reloads the config in the middle of running the SRF? hba entries are reloaded only in postmaster process, not in every backend. So there shouldn't be any problem with config file reload. Am i missing something? Ahh, good point. That does raise another issue though... there might well be some confusion about that. I think people are used to the varieties of how settings come into effect that perhaps this isn't an issue with pg_settings, but it's probably worth at least mentioning in the docs for a pg_hba view. I share this fear, and it's the main problem I have with this patch. Uh, yeah, agreed. Currently, the patch always does load_hba() every time pg_hba_settings is accessed, which loads the current contents of the config file into the backend process, but the postmaster still uses the original one. This makes it impossible to look at currently effective pg_hba.conf contents. Damn confusing, I guess. Indeed. At a *minimum*, we'd need to have some way to say what you're seeing isn't what's actually being used. Also, I dare to say that having a system view that doesn't actually show the system state (but contents of a config file that may not be loaded yet), is wrong. Right, we need to show what's currently active in PG-land, not just spit back whatever the on-disk contents currently are. Granted, we don't modify pg_hba.conf all that often, and it's usually followed by a SIGHUP to reload the contents, so both the backend and postmaster should see the same stuff. But the cases when I'd use this pg_hba_settings view usually turned out to be 'forgot to SIGHUP', so this would not help, actually. Agreed. What I can imagine is keeping the original list (parsed by postmaster, containing the effective pg_hba.conf contents), and keeping another list of 'current contents' within backends. 
And having a 'reload' flag for pg_hba_settings, determining which list to use. pg_hba_settings(reload=false) - old list (from postmaster) pg_hba_settings(reload=true) - new list (from backend) The pg_hba_settings view should use 'reload=false' (i.e. show effective contents of the hba). I don't think we actually care what the current contents are from the backend's point of view- after all, when does an individual backend ever use the contents of pg_hba.conf after it's up and running? What would make sense, to me at least, would be: pg_hba_configured() -- spits back whatever the config file has pg_hba_active() -- shows what the postmaster is using currently Or something along those lines. Note that I'd want pg_hba_configured() to throw an error if the config file can't be parsed (which would be quite handy to make sure that you didn't goof up the config..). This, combined with pg_reload_conf() would provide a good way to review changes before putting them into place. The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. I'm not sure that I see why, offhand, it'd be much more than trivial.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Bug in pg_dump
On Thu, Feb 26, 2015 at 08:41:46PM +0900, Michael Paquier wrote: On Wed, Feb 25, 2015 at 7:46 AM, Gilles Darold gilles.dar...@dalibo.com wrote: This is a far better patch and the test to export/import of the postgis_topology extension works great for me. Thanks for the work. Attached is a patch that uses an even better approach by querying only once all the FK dependencies of tables in extensions whose data is registered as dumpable by getExtensionMembership(). Could you check that it works correctly? My test case passes but an extra check would be a good nice. The patch footprint is pretty low so we may be able to backport this patch easily. +1 for backporting. It's a real bug, and real people get hit by it if they're using PostGIS, one of our most popular add-ons. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-27 17:59 GMT+01:00 Stephen Frost sfr...@snowman.net: All, * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: On 28.1.2015 23:01, Jim Nasby wrote: On 1/28/15 12:46 AM, Haribabu Kommi wrote: Also, what happens if someone reloads the config in the middle of running the SRF? hba entries are reloaded only in postmaster process, not in every backend. So there shouldn't be any problem with config file reload. Am i missing something? Ahh, good point. That does raise another issue though... there might well be some confusion about that. I think people are used to the varieties of how settings come into effect that perhaps this isn't an issue with pg_settings, but it's probably worth at least mentioning in the docs for a pg_hba view. I share this fear, and it's the main problem I have with this patch. Uh, yeah, agreed. yes, good notice. I was blind. Currently, the patch always does load_hba() every time pg_hba_settings is accessed, which loads the current contents of the config file into the backend process, but the postmaster still uses the original one. This makes it impossible to look at currently effective pg_hba.conf contents. Damn confusing, I guess. Indeed. At a *minimum*, we'd need to have some way to say what you're seeing isn't what's actually being used. Also, I dare to say that having a system view that doesn't actually show the system state (but contents of a config file that may not be loaded yet), is wrong. Right, we need to show what's currently active in PG-land, not just spit back whatever the on-disk contents currently are. Granted, we don't modify pg_hba.conf all that often, and it's usually followed by a SIGHUP to reload the contents, so both the backend and postmaster should see the same stuff. But the cases when I'd use this pg_hba_settings view usually turned out to be 'forgot to SIGHUP', so this would not help, actually. Agreed. 
What I can imagine is keeping the original list (parsed by postmaster, containing the effective pg_hba.conf contents), and keeping another list of 'current contents' within backends. And having a 'reload' flag for pg_hba_settings, determining which list to use. pg_hba_settings(reload=false) - old list (from postmaster) pg_hba_settings(reload=true) - new list (from backend) The pg_hba_settings view should use 'reload=false' (i.e. show effective contents of the hba). I don't think we actually care what the current contents are from the backend's point of view- after all, when does an individual backend ever use the contents of pg_hba.conf after it's up and running? What would make sense, to me at least, would be: pg_hba_configured() -- spits back whatever the config file has pg_hba_active() -- shows what the postmaster is using currently I disagree and I dislike this direction. It looks like over-engineering. * Loading every time is wrong, because you will possibly see data that are not active. * Ignoring reload is an attack on the mental health of our users. It should work like pg_settings. I need to see what is wrong in pg_hba.conf at this moment, not what was or what will be wrong. We can load any config file via an admin contrib module - so there is no need to repeat the same functionality Regards Pavel Or something along those lines. Note that I'd want pg_hba_configured() to throw an error if the config file can't be parsed (which would be quite handy to make sure that you didn't goof up the config..). This, combined with pg_reload_conf() would provide a good way to review changes before putting them into place. The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. I'm not sure that I see why, offhand, it'd be much more than trivial.. Thanks! Stephen
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On 27.2.2015 17:59, Stephen Frost wrote: All, * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. I'm not sure that I see why, offhand, it'd be much more than trivial ... From time to time I have to debug why connection attempts are failing, and with moderately-sized pg_hba.conf files (e.g. on database servers shared by multiple applications) that may be tricky. Identifying the rule that matched (and rejected) the connection would be helpful. But yes, that's non-trivial and out of scope of this patch. -- Tomas Vondra   http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POLA violation with \c service=
I don't understand. Why don't these patches move anything to src/common? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POLA violation with \c service=
On Mon, Feb 23, 2015 at 05:56:12PM -0300, Alvaro Herrera wrote: David Fetter wrote: My thinking behind this was that the patch is a bug fix and intended to be back-patched, so I wanted to mess with as little infrastructure as possible. A new version of libpq seems like a very big ask for such a case. You'll recall that the original problem was that \c service=foo only worked accidentally for some pretty narrow use cases and broke without much of a clue for the rest. It turned out that the general problem was that options given to psql on the command line were not even remotely equivalent to \c, even though they were documented to be. So, in view of these arguments and those put forward by Pavel downthread, I think the attached is an acceptable patch for the master branch. It doesn't apply to back branches though; 9.4 and 9.3 have a conflict in tab-complete.c, 9.2 has additional conflicts in command.c, and 9.1 and 9.0 are problematic all over because they don't have src/common. Could you please submit patches adapted for each group of branches? Please find patches attached for each live branch. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate From 64c7d3bcfad80dc80857b2ea06c9f2b7fff8f2a5 Mon Sep 17 00:00:00 2001 From: David Fetter da...@fetter.org Date: Wed, 25 Feb 2015 13:51:17 -0800 Subject: [PATCH] From Alvaro Herrera Resolved conflicts: src/bin/psql/tab-complete.c Resolved conflicts: src/bin/psql/command.c Resolved conflicts: doc/src/sgml/ref/psql-ref.sgml src/bin/psql/command.c --- doc/src/sgml/ref/psql-ref.sgml | 61 ++--- src/bin/psql/command.c | 77 -- src/bin/psql/common.c | 36 src/bin/psql/common.h | 2 ++ src/bin/psql/tab-complete.c| 12 ++- 5 files changed, 150 insertions(+), 38 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 15bbd7a..b7f0601 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -751,33 +751,20 @@ testdb=gt; /varlistentry varlistentry -termliteral\C [ replaceable class=parametertitle/replaceable ]/literal/term -listitem -para -Sets the title of any tables being printed as the result of a -query or unset any such title. This command is equivalent to -literal\pset title replaceable -class=parametertitle/replaceable/literal. (The name of -this command derives from quotecaption/quote, as it was -previously only used to set the caption in an -acronymHTML/acronym table.) 
-/para -/listitem - /varlistentry - - varlistentry -termliteral\connect/literal (or literal\c/literal) literal[ replaceable class=parameterdbname/replaceable [ replaceable class=parameterusername/replaceable ] [ replaceable class=parameterhost/replaceable ] [ replaceable class=parameterport/replaceable ] ]/literal/term +termliteral\c/literal or literal\connect/literal literal [ { [ replaceable class=parameterdbname/replaceable [ replaceable class=parameterusername/replaceable ] [ replaceable class=parameterhost/replaceable ] [ replaceable class=parameterport/replaceable ] ] | replaceable class=parameterconninfo/replaceable string | replaceable class=parameterURI/replaceable } ] /literal/term listitem para Establishes a new connection to a productnamePostgreSQL/ -server. If the new connection is successfully made, the -previous connection is closed. If any of replaceable +server using positional parameters as described below, a +parameterconninfo/parameter string, or a acronymURI/acronym. If the new connection is +successfully made, the +previous connection is closed. When using positional parameters, if any of replaceable class=parameterdbname/replaceable, replaceable class=parameterusername/replaceable, replaceable class=parameterhost/replaceable or replaceable class=parameterport/replaceable are omitted or specified as literal-/literal, the value of that parameter from the -previous connection is used. If there is no previous +previous connection is used. If using positional parameters and there is no previous connection, the applicationlibpq/application default for the parameter's value is used. /para @@ -792,6 +779,42 @@ testdb=gt; mechanism that scripts are not accidentally acting on the wrong database on the other hand. /para + +para +Positional syntax:
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Hi, On 28.1.2015 23:01, Jim Nasby wrote: On 1/28/15 12:46 AM, Haribabu Kommi wrote: Also, what happens if someone reloads the config in the middle of running the SRF? hba entries are reloaded only in postmaster process, not in every backend. So there shouldn't be any problem with config file reload. Am i missing something? Ahh, good point. That does raise another issue though... there might well be some confusion about that. I think people are used to the varieties of how settings come into effect that perhaps this isn't an issue with pg_settings, but it's probably worth at least mentioning in the docs for a pg_hba view. I share this fear, and it's the main problem I have with this patch. Currently, the patch always does load_hba() every time pg_hba_settings is accessed, which loads the current contents of the config file into the backend process, but the postmaster still uses the original one. This makes it impossible to look at currently effective pg_hba.conf contents. Damn confusing, I guess. Also, I dare to say that having a system view that doesn't actually show the system state (but contents of a config file that may not be loaded yet), is wrong. Granted, we don't modify pg_hba.conf all that often, and it's usually followed by a SIGHUP to reload the contents, so both the backend and postmaster should see the same stuff. But the cases when I'd use this pg_hba_settings view usually turned out to be 'forgot to SIGHUP', so this would not help, actually. What I can imagine is keeping the original list (parsed by postmaster, containing the effective pg_hba.conf contents), and keeping another list of 'current contents' within backends. And having a 'reload' flag for pg_hba_settings, determining which list to use. pg_hba_settings(reload=false) - old list (from postmaster) pg_hba_settings(reload=true) - new list (from backend) The pg_hba_settings view should use 'reload=false' (i.e. show effective contents of the hba). 
The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. -- Tomas Vondra   http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote: Dear Hackers, I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of Computer Science at WUT, Poland. Last year I've been a bit into evolutionary algorithms and during my research I found out about GEQO in Postgres. I also found out that there are plans to try a different attempt to find optimal query plans and thought it could be a good thing to use it as a project for GSoC. I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. I believe this would be potentially beneficial to Postgres to check if such a heuristic could really choose better query plans than GEQO does. Judging by the results that simulated annealing gives on the travelling salesman problem, it looks like a simpler and potentially more effective way of combinatorial optimization. As deliverables of such a project I would provide a simple implementation of basic simulated annealing optimizer and some form of quantitative comparison with GEQO. I see that this may be considerably bigger than most of GSoC projects, but I would like to know your opinion. Do you think that this would be beneficial enough to make a proper GSoC project? I would also like to know if you have any additional ideas about this project. Best regards, Grzegorz Parka Hi, I think someone already mentioned it, but it would be very neat if the optimizer could be pluggable. Then many different algorithms could be evaluated more easily. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Andrew Gierth and...@tao11.riddles.org.uk writes: Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom We've discussed doing $SUBJECT off and on for nearly ten years, So, this is also changing (indirectly) the effect of ReScanExprContext so that it deletes child contexts too. Right, this is actually the main point so far as I'm concerned. My expanded arrays patch currently has

#define ResetExprContext(econtext) \
-	MemoryContextReset((econtext)->ecxt_per_tuple_memory)
+	MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory)

but this is a more general fix. I guess the only question is whether anything currently relies on ReScanExprContext's current behavior. I've not seen any regression test failures either with the expanded arrays patch or this one. It's conceivable that something would create a context under a short-lived expression context and expect it to still be there (though empty) after a context reset; that idea was the reason I designed MemoryContextReset this way in the first place. But fifteen years later, it's quite clear that that was a mistake and we don't actually use contexts that way. (Worth noting is that the memory context reset callback mechanism I propose nearby is sort of a second pass at expression shutdown callbacks, as well.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
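The semantic change under discussion — reset-my-allocations versus reset-and-also-delete-children — can be illustrated with a toy context tree. This is a hypothetical model for illustration only, not the real mcxt.c API or allocator.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Ctx
{
	struct Ctx *first_child;
	struct Ctx *next_sibling;
	void	  **chunks;			/* allocations owned by this context */
	int			nchunks;
} Ctx;

Ctx *
ctx_create(Ctx *parent)
{
	Ctx		   *c = calloc(1, sizeof(Ctx));

	if (parent)
	{
		c->next_sibling = parent->first_child;
		parent->first_child = c;
	}
	return c;
}

/* free this context's own chunks; keep the context and its children */
void
ctx_reset(Ctx *c)
{
	int			i;

	for (i = 0; i < c->nchunks; i++)
		free(c->chunks[i]);
	free(c->chunks);
	c->chunks = NULL;
	c->nchunks = 0;
}

/* recursively destroy a context and everything under it */
void
ctx_delete(Ctx *c)
{
	Ctx		   *child = c->first_child;

	while (child)
	{
		Ctx		   *next = child->next_sibling;

		ctx_delete(child);
		child = next;
	}
	ctx_reset(c);
	free(c);
}

/* the stronger semantics: reset own chunks AND destroy child contexts */
void
ctx_reset_and_delete_children(Ctx *c)
{
	Ctx		   *child = c->first_child;

	while (child)
	{
		Ctx		   *next = child->next_sibling;

		ctx_delete(child);
		child = next;
	}
	c->first_child = NULL;
	ctx_reset(c);
}
```

The distinction only matters to code that parks a child context under a short-lived context and expects it to survive a reset; as Tom notes, nothing appears to rely on that anymore.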
Re: [HACKERS] star schema and the optimizer
Marc Cousin cousinm...@gmail.com writes: So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice. That's right, and as you say, the planning-speed consequences of doing otherwise would be disastrous. However, all you need to help it find the right plan is some dummy join condition between the dimension tables, which will allow the join path you want to be considered. Perhaps you could do something like SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null; The details of the extra condition aren't too important as long as it mentions all the dimension tables and (a) is always true but (b) is not so obviously always true that the planner can reduce it to constant true. (Thus, for example, you might think you could do this with zero runtime cost by writing dummy(dim1.a,dim2.a) where dummy is an inlineable SQL function that just returns constant TRUE ... but that's too cute, it won't fix your problem.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MemoryContext reset/delete callbacks
On Thu, Feb 26, 2015 at 9:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: It's a bit sad to push AllocSetContextData onto four cachelines from the current three... That stuff is hot. But I don't really see a way around it right now. And it seems like it'd give us more ammunition to improve things than the small loss of speed it implies. Meh. I doubt it would make any difference, especially seeing that the struct isn't going to be aligned on any special boundary. It might not make much difference, but I think avoiding unnecessary padding space inside frequently-used data structures is probably a smart idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Hi It looks good, I have only one objection. I am not sure the function hba_settings should be in guc.c - it has zero relation to GUC. Maybe hba.c would be a better place. Other opinions? 2015-02-27 7:30 GMT+01:00 Haribabu Kommi kommi.harib...@gmail.com: On Sat, Feb 7, 2015 at 8:26 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hi I am sending a review of this patch. Thanks for the review. sorry for the delay. 4. Regress tests test rules... FAILED -- missing info about new view Thanks. Corrected. My objections: 1. data type for database field should be array of name or array of text. When a name contains a comma, the comma is not escaped currently: {omega,my stupid extremly, name2,my stupid name} expected: {my stupid name,omega,my stupid extremly, name2} Same issue I see in the options field Changed including the user column also. 2. Reload is not enough for content refresh - logout is necessary I understand why it is, but it is not very friendly, and can be very stressful. It should work with the last reloaded data. At present the view data is populated from the variable parsed_hba_lines which contains the already parsed lines of the pg_hba.conf file. During the reload, the hba entries are reloaded only in the postmaster process and not in every backend; for this reason the reloaded data is not visible until the session is disconnected and connected. Instead of changing the SIGHUP behavior in the backend to load the hba entries also, whenever the call is made to the view, the pg_hba.conf file is parsed to show the latest reloaded data in the view. This way it doesn't impact the existing behavior but it may impact performance because the file is loaded into memory every time. your solution is ok, simple and clean. Performance for this view should not be critical and it will always be faster than logging in as root and searching pg_hba.conf Regards Pavel Updated patch is attached. comments? Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] Index-only scans for GiST.
I added a MemoryContext listCxt to avoid a memory leak. listCxt is created once in gistrescan (only for index-only scan plans) and reset when the scan of the leaf page is finished. I am not sure the problem is completely solved, so I will wait for feedback. * What's the reason for turning GISTScanOpaqueData.pageData from an array to a List? This array is a field of the structure GISTScanOpaqueData. Memory for that structure is allocated in the function gistbeginscan(). The array is static, so it's declared only once in the structure:

GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)]

But how could we know the size of the array if we don't know what data would be returned? I mean the type and amount. I asked Alexander about that and he suggested using a List instead of an array. indexonlyscan_gist_2.2.patch Description: Binary data indexonlyscan_gist_docs.patch Description: Binary data test_performance.sql Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
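The array-versus-List question above is the classic fixed-capacity versus growable-buffer trade-off: the old pageData array was sized for index tuples of bounded size, while index-only scans must hold reconstructed tuples of unpredictable size and count. A hedged C sketch of the growable alternative (hypothetical names, not the actual GiST code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ItemBuf
{
	char	  **items;			/* copied tuples of arbitrary size */
	size_t	   *sizes;
	int			nitems;
	int			capacity;
} ItemBuf;

/* append a copy of an arbitrarily sized tuple, doubling capacity as needed */
void
itembuf_add(ItemBuf *buf, const void *data, size_t size)
{
	if (buf->nitems == buf->capacity)
	{
		buf->capacity = buf->capacity ? buf->capacity * 2 : 8;
		buf->items = realloc(buf->items, buf->capacity * sizeof(char *));
		buf->sizes = realloc(buf->sizes, buf->capacity * sizeof(size_t));
	}
	buf->items[buf->nitems] = malloc(size);
	memcpy(buf->items[buf->nitems], data, size);
	buf->sizes[buf->nitems] = size;
	buf->nitems++;
}

/* drop all entries at the end of a leaf page, like resetting listCxt */
void
itembuf_reset(ItemBuf *buf)
{
	int			i;

	for (i = 0; i < buf->nitems; i++)
		free(buf->items[i]);
	buf->nitems = 0;
}
```

itembuf_reset plays the role of resetting listCxt after each leaf page: the copied entries go away while the bookkeeping survives for the next page.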
Re: [HACKERS] star schema and the optimizer
On 27/02/2015 15:08, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice. That's right, and as you say, the planning-speed consequences of doing otherwise would be disastrous. However, all you need to help it find the right plan is some dummy join condition between the dimension tables, which will allow the join path you want to be considered. Perhaps you could do something like SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null; No I can't. I cannot rewrite the query at all, in my context. What do you mean by disastrous? I've given it a few tries here, and with 8 joins (same model, 7 dimensions), planning time is around 100ms. At least in my context, it's well worth the planning time, to save minutes of execution. I perfectly understand that it's not something that should be on by default; that would be crazy. But in a datawarehouse, it seems to me that accepting one, or even a few, seconds of planning time to save minutes of execution is perfectly legitimate. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 26.2.2015 23:36, Tom Lane wrote: Jim Nasby jim.na...@bluetreble.com writes: On 2/26/15 4:01 PM, Alvaro Herrera wrote: Josh Berkus wrote: Oh, I didn't realize there weren't commands to change the LCO. Without at least SQL syntax for LCO, I don't see why we'd take it; this sounds more like a WIP patch. The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then let the rest be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows. +1. This patch is already several years old; let's not delay it further. I tend to agree with that, but how are we going to test things if there's no mechanism to create a table in which the orderings are different? Physical or logical orderings? Physical ordering is still determined by the CREATE TABLE command, so you can do either

CREATE TABLE order_1 ( a INT, b INT );

or (to get the reverse order)

CREATE TABLE order_2 ( b INT, a INT );

Logical ordering may be updated directly in the pg_attribute catalog, by tweaking the attlognum column:

UPDATE pg_attribute SET attlognum = 10
 WHERE attrelid = 'order_1'::regclass AND attname = 'a';

Of course, this does not handle duplicates, ranges and such, so that needs to be done manually. But I think inventing something like

ALTER TABLE order_1 ALTER COLUMN a SET lognum = 11;

should be straightforward. But IMHO getting the internals working properly first is more important. -- Tomas Vondra   http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 27.2.2015 19:23, Alvaro Herrera wrote: Tomas Vondra wrote: Physical ordering is still determined by the CREATE TABLE command, In theory, you should be able to UPDATE attphysnum in pg_attribute too when the table is empty, and new tuples inserted would follow the specified ordering. Good idea - that'd give you descriptors with (attnum != attphysnum), which might trigger some bugs. -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On 02/27/2015 10:35 AM, Stephen Frost wrote: From time to time I have to debug why are connection attempts failing, and with moderately-sized pg_hba.conf files (e.g. on database servers shared by multiple applications) that may be tricky. Identifying the rule that matched (and rejected) the connection would be helpful. To clarify, I was trying to say that writing that function didn't seem very difficult to me. I definitely think that *having* that function would be very useful. But yes, that's non-trivial and out of scope of this patch. For my 2c, I view this as somewhat up to the author. I wouldn't complain if it was included in a new version of this patch as I don't think it'd add all that much complexity and it'd be very nice, but I certainly think this patch could go in without that too. Also, a testing function could be written in userspace after this feature is added. I can imagine how to write one as a UDF. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] star schema and the optimizer
I wrote: I had actually thought that we'd fixed this type of problem in recent versions, and that you should be able to get a plan that would look like

    Nestloop
      -> scan dim1
      -> Nestloop
           -> scan dim2
           -> indexscan fact table using dim1.a and dim2.b

After closer study, I think this is an oversight in commit e2fa76d80ba571d4de8992de6386536867250474, which quoth

    +It can be useful for the parameter value to be passed down through
    +intermediate layers of joins, for example:
    +
    +	NestLoop
    +		-> Seq Scan on A
    +		Hash Join
    +			Join Condition: B.Y = C.W
    +			-> Seq Scan on B
    +			-> Index Scan using C_Z_IDX on C
    +				Index Condition: C.Z = A.X
    +
    +If all joins are plain inner joins then this is unnecessary, because
    +it's always possible to reorder the joins so that a parameter is used
    +immediately below the nestloop node that provides it.  But in the
    +presence of outer joins, join reordering may not be possible, and then
    +this option can be critical.  Before version 9.2, Postgres used ad-hoc

This reasoning overlooked the fact that if we need parameters from more than one relation, and there's no way to join those relations to each other directly, then we have to allow passing the dim1 parameter down through the join to dim2. The attached patch seems to fix it (modulo the need for some updates in the README, and maybe a regression test). Could you see if this produces satisfactory plans for you?

			regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index e6aa21c..ce812b0 100644
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*** try_nestloop_path(PlannerInfo *root,
*** 291,299 ****
  	if (required_outer &&
  		!bms_overlap(required_outer, param_source_rels))
  	{
! 		/* Waste no memory when we reject a path here */
! 		bms_free(required_outer);
! 		return;
  	}
--- 291,320 ----
  	if (required_outer &&
  		!bms_overlap(required_outer, param_source_rels))
  	{
! 		/*
! 		 * We override the param_source_rels heuristic to accept nestloop
! 		 * paths in which the outer rel satisfies some but not all of the
! 		 * inner path's parameterization.  This is necessary to get good
! 		 * plans for star-schema scenarios, in which a parameterized path
! 		 * for a fact table may require parameters from multiple dimension
! 		 * tables that will not get joined directly to each other.  We can
! 		 * handle that by stacking nestloops that have the dimension tables
! 		 * on the outside; but this breaks the rule the param_source_rels
! 		 * heuristic is based on, that parameters should not be passed down
! 		 * across joins unless there's a join-order-constraint-based reason
! 		 * to do so.  So, we should consider partial satisfaction of
! 		 * parameterization as another reason to allow such paths.
! 		 */
! 		Relids		outerrelids = outer_path->parent->relids;
! 		Relids		innerparams = PATH_REQ_OUTER(inner_path);
! 
! 		if (!(bms_overlap(innerparams, outerrelids) &&
! 			  bms_nonempty_difference(innerparams, outerrelids)))
! 		{
! 			/* Waste no memory when we reject a path here */
! 			bms_free(required_outer);
! 			return;
! 		}
  	}

  /*

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On Fri, Feb 27, 2015 at 12:48 PM, Tomas Vondra tomas.von...@2ndquadrant.com wrote: On 27.2.2015 17:59, Stephen Frost wrote: All, * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. I'm not sure that I see why, offhand, it'd be much more than trivial ... From time to time I have to debug why are connection attempts failing, and with moderately-sized pg_hba.conf files (e.g. on database servers shared by multiple applications) that may be tricky. Identifying the rule that matched (and rejected) the connection would be helpful. If you did actually get a rejected connection, you get that in the log (as of 9.3, iirc). Such a function would make it possible to test it without having failed first though :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [POC] FETCH limited by bytes.
Attached is a diff containing the original (working) patch from my (incomplete) patch, plus regression test changes and documentation changes. While it's easy to regression-test the persistence of the fetch_size options, I am confounded as to how we would show that the fetch_size setting was respected. I've seen it with my own eyes viewing the query log on Redshift, but I see no way to automate that. Suggestions welcome. On Fri, Feb 6, 2015 at 3:11 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I agree with you. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normally highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1 as the query, which shows that per-table overrides also work. I'd be happy to stop here, but Kyotaro might feel differently. This is enough in its own way, of course. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly. Users (including me) will be happy with it once they know how to determine the fetch size, especially if the remote tables are few or the configuration is stable enough. On widely distributed systems it would be far more difficult to tune the fetch size manually for every foreign table, so it would ultimately be left at the default, safe size of 100 or so. This is similar to the discussion about max_wal_size on another thread. Calculating the fetch size is far tougher for users than setting an expected memory usage, I think. 
But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way *at table creation time*, not query time. We cannot know the real length of text-type data in advance. Beyond that, even for a column defined as char(n), the n is the theoretical (in-design) maximum size for the field, but in most cases the mean length of the real data will be far smaller than that. For that reason, calculating the ratio at table creation time seems difficult. However, I agree with Tom's suggestion that the changes to the FETCH statement are definitely ugly; the overhead argument in particular is prohibitive even for me :( Thanks to Kyotaro and Bruce Momjian for their help. Not at all. regards, At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker corey.huin...@gmail.com wrote in CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN= p6mxmg7s86c...@mail.gmail.com I applied this patch to REL9_4_STABLE, and I was able to connect to a foreign database (Redshift, actually). The basic outline of the test is below, names changed to protect my employment.

    create extension if not exists postgres_fdw;

    create server redshift_server foreign data wrapper postgres_fdw
      options ( host 'some.hostname.ext', port '5439',
                dbname 'redacted', fetch_size '150' );

    create user mapping for public server redshift_server
      options ( user 'redacted_user', password 'comeonyouarekiddingright' );

    create foreign table redshift_tab150 ( colspecs )
      server redshift_server
      options (table_name 'redacted_table', schema_name 'redacted_schema' );

    create foreign table redshift_tab151 ( colspecs )
      server redshift_server
      options (table_name 'redacted_table', schema_name 'redacted_schema',
               fetch_size '151' );

    -- i don't expect the fdw to push the aggregate, this is just a test
    -- to see what query shows up in stv_inflight. 
select count(*) from redshift_ccp150; -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp151; For those of you that aren't familiar with Redshift, it's a columnar database that seems to be a fork of postgres 8.cough. You can connect to it with modern libpq programs (psql, psycopg2, etc). Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normally highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1
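The rows-per-memory-budget arithmetic Corey and Kyotaro discuss can be sketched as a small helper. This is a hypothetical illustration, not part of postgres_fdw; the clamp values are taken from the thread (100 is the postgres_fdw default fetch size, 10,000 the size reported to work well against Redshift), and the row width would in practice come from a statistics estimate.

```python
def estimate_fetch_size(max_memory_bytes, avg_row_width_bytes,
                        floor=100, ceiling=10000):
    """Derive a fetch_size from a memory budget and an estimated row width.

    Mirrors the manual calculation discussed in the thread: how many rows
    of the estimated width fit into the desired memory block, clamped to a
    sane range so a bad width estimate cannot produce an absurd value.
    """
    if avg_row_width_bytes <= 0:
        raise ValueError("row width must be positive")
    rows = max_memory_bytes // avg_row_width_bytes
    return max(floor, min(rows, ceiling))

# A 10 MB budget with ~1 kB rows allows 10,240 rows, clamped to 10,000.
print(estimate_fetch_size(10 * 1024 * 1024, 1024))
```

As the thread points out, the hard part is not this division but knowing avg_row_width_bytes up front, since declared maximums (e.g. char(n)) usually overstate the real mean width.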
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Pavel Stehule (pavel.steh...@gmail.com) wrote: 2015-02-27 17:59 GMT+01:00 Stephen Frost sfr...@snowman.net: I don't think we actually care what the current contents are from the backend's point of view- after all, when does an individual backend ever use the contents of pg_hba.conf after it's up and running? What would make sense, to me at least, would be: pg_hba_configured() -- spits back whatever the config file has pg_hba_active() -- shows what the postmaster is using currently I disagree and I dislike this direction. It looks like over-engineering. * Loading the file every time is wrong, because you will possibly see inactive data. That's the point of the two functions- one to give you what a reload *now* would, and one to see what's currently active. * Ignoring reload is an attack on the mental health of our users. Huh? It should work like pg_settings. I need to see what is wrong in pg_hba.conf at this moment, not what was or what will be wrong. That's what pg_hba_active() would be from above, yes. We can load any config file via the admin contrib module - so it is not necessary to repeat the same functionality. That's hardly the same- I can't (easily, anyway) join the results of pg_read() to pg_hba_active() and see what's different or the same. Thanks, Stephen signature.asc Description: Digital signature
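If both functions existed, the comparison Stephen describes would be a simple set difference. To be clear, pg_hba_configured() and pg_hba_active() are the hypothetical names proposed in this thread, not existing functions, and the column list here is illustrative:

```sql
-- Rules present in the pg_hba.conf on disk but not yet active
-- (i.e. pending a reload of the postmaster's configuration).
SELECT type, database, user_name, address, auth_method
  FROM pg_hba_configured()
EXCEPT
SELECT type, database, user_name, address, auth_method
  FROM pg_hba_active();
```

Running the EXCEPT in the other direction would show rules that are active but no longer in the file, which covers Pavel's "what is wrong at this moment" case as well.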
Re: [HACKERS] plpgsql versus domains
On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: To some extent this is a workaround for the fact that plpgsql does type conversions the way it does (ie, by I/O rather than by using the parser's cast mechanisms). We've talked about changing that, but people seem to be afraid of the compatibility consequences, and I'm not planning to fight two compatibility battles concurrently ;-) A sensible plan, but since we're here: - I do agree that changing the way PL/pgsql does those type conversions is scary. I can't predict what will break, and there's no getting around the scariness of that. - On the other hand, I do think it would be good to change it, because this sucks: rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric); end $$; ERROR: invalid input syntax for integer: 3. CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment I didn't realize that this issue had even been discussed before... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
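Until the conversion mechanism changes, the failure Robert shows can be sidestepped by making the cast explicit inside the expression, so the parser's cast machinery produces an integer before PL/pgSQL's I/O-based assignment conversion ever sees a numeric string. This is an illustrative workaround, not a fix for the underlying behavior:

```sql
DO $$
DECLARE x int;
BEGIN
  -- The explicit ::int cast means the assignment receives an integer,
  -- instead of the textual numeric output being fed to the integer
  -- input function (which is what raises the error in Robert's example).
  x := ((3.0::numeric) / (1.0::numeric))::int;
  RAISE NOTICE 'x = %', x;
END $$;
```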
Re: [HACKERS] logical column ordering
Tomas Vondra wrote: On 26.2.2015 23:42, Kevin Grittner wrote: One use case is to be able to suppress default display of columns that are used for internal purposes. For example, incremental maintenance of materialized views will require storing a count(t) column, and sometimes state information for aggregate columns, in addition to what the users explicitly request. At the developers' meeting there was discussion of whether and how to avoid displaying these by default, and it was felt that when we have this logical column ordering it would be good to have a way to suppress default display. Perhaps this could be as simple as a special value for logical position. I don't see how hiding columns is related to this patch at all. That's a completely unrelated thing, and it certainly is not part of this patch. It's not directly related to the patch, but I think the intent is that once we have this patch it will be possible to apply other transformations, such as having columns that are effectively hidden -- consider for example the idea that attlognum be set to a negative number. (For instance, consider the idea that system columns may all have -1 as attlognum, which would just be an indicator that they are never present in logical column expansion. That makes sense to me; what reason do we have to keep them using the current attnums they have?) -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: On 27.2.2015 17:59, Stephen Frost wrote: All, * Tomas Vondra (tomas.von...@2ndquadrant.com) wrote: The other feature that'd be cool to have is a debugging function on top of the view, i.e. a function pg_hba_check(host, ip, db, user, pwd) showing which hba rule matched. But that's certainly nontrivial. I'm not sure that I see why, offhand, it'd be much more than trivial ... From time to time I have to debug why are connection attempts failing, and with moderately-sized pg_hba.conf files (e.g. on database servers shared by multiple applications) that may be tricky. Identifying the rule that matched (and rejected) the connection would be helpful. To clarify, I was trying to say that writing that function didn't seem very difficult to me. I definitely think that *having* that function would be very useful. But yes, that's non-trivial and out of scope of this patch. For my 2c, I view this as somewhat up to the author. I wouldn't complain if it was included in a new version of this patch as I don't think it'd add all that much complexity and it'd be very nice, but I certainly think this patch could go in without that too. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] logical column ordering
On 26.2.2015 23:42, Kevin Grittner wrote: Tomas Vondra tomas.von...@2ndquadrant.com wrote: Over the time I've heard various use cases for this patch, but in most cases it was quite speculative. If you have an idea where this might be useful, can you explain it here, or maybe point me to a place where it's described? One use case is to be able to suppress default display of columns that are used for internal purposes. For example, incremental maintenance of materialized views will require storing a count(t) column, and sometimes state information for aggregate columns, in addition to what the users explicitly request. At the developers' meeting there was discussion of whether and how to avoid displaying these by default, and it was felt that when we have this logical column ordering it would be good to have a way to suppress default display. Perhaps this could be as simple as a special value for logical position. I don't see how hiding columns is related to this patch at all. That's a completely unrelated thing, and it certainly is not part of this patch. -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] star schema and the optimizer
On 27/02/2015 19:45, Tom Lane wrote: I wrote: I had actually thought that we'd fixed this type of problem in recent versions, and that you should be able to get a plan that would look like

    Nestloop
      -> scan dim1
      -> Nestloop
           -> scan dim2
           -> indexscan fact table using dim1.a and dim2.b

After closer study, I think this is an oversight in commit e2fa76d80ba571d4de8992de6386536867250474, which quoth

    +It can be useful for the parameter value to be passed down through
    +intermediate layers of joins, for example:
    +
    +	NestLoop
    +		-> Seq Scan on A
    +		Hash Join
    +			Join Condition: B.Y = C.W
    +			-> Seq Scan on B
    +			-> Index Scan using C_Z_IDX on C
    +				Index Condition: C.Z = A.X
    +
    +If all joins are plain inner joins then this is unnecessary, because
    +it's always possible to reorder the joins so that a parameter is used
    +immediately below the nestloop node that provides it.  But in the
    +presence of outer joins, join reordering may not be possible, and then
    +this option can be critical.  Before version 9.2, Postgres used ad-hoc

This reasoning overlooked the fact that if we need parameters from more than one relation, and there's no way to join those relations to each other directly, then we have to allow passing the dim1 parameter down through the join to dim2. The attached patch seems to fix it (modulo the need for some updates in the README, and maybe a regression test). Could you see if this produces satisfactory plans for you?

From what I see, it's just perfect. I'll give it a more thorough look a bit later, but it seems to be exactly what I was waiting for. Thanks a lot. Regards -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
Tomas Vondra wrote: Physical ordering is still determined by the CREATE TABLE command, In theory, you should be able to UPDATE attphysnum in pg_attribute too when the table is empty, and new tuples inserted would follow the specified ordering. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] star schema and the optimizer
On 27/02/2015 15:27, Marc Cousin wrote: On 27/02/2015 15:08, Tom Lane wrote: Marc Cousin cousinm...@gmail.com writes: So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice. That's right, and as you say, the planning-speed consequences of doing otherwise would be disastrous. However, all you need to help it find the right plan is some dummy join condition between the dimension tables, which will allow the join path you want to be considered. Perhaps you could do something like SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null; No I can't. I cannot rewrite the query at all, in my context. What do you mean by disastrous? I've given it a few tries here, and with 8 joins (same model, 7 dimensions), planning time is around 100ms. At least in my context, it's well worth the planning time, to save minutes of execution. I perfectly understand that it's not something that should happen by default, that would be crazy. But in a data warehouse, it seems to me that accepting one, or even a few, seconds of planning time to save minutes of execution is perfectly legitimate. I have given it a bit more thought. Could it be possible, to mitigate this, to permit only a few (how few to be defined) cross joins? Still optional, of course; it still has an important cost. Allowing cross joins only for the first 3 levels, and keeping this to left-right sided joins, I can plan up to 11 joins on my small test machine in 500ms (instead of 150ms with the unpatched one), and get a good plan, good meaning 100 times faster. Regards -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] star schema and the optimizer
I wrote: I had actually thought that we'd fixed this type of problem in recent versions, and that you should be able to get a plan that would look like Nestloop - scan dim1 - Nestloop - scan dim2 - indexscan fact table using dim1.a and dim2.b which would arise from developing an indexscan on fact that's parameterized by both other tables, resolving one of those parameterizations via a join to dim2, and then the other one via a join to dim1. I'm not sure offhand why that isn't working in this example. It looks like the issue is that the computation of param_source_rels in add_paths_to_joinrel() is overly restrictive: it thinks there is no reason to generate a parameterized-by-dim2 path for the join relation {fact, dim1}, or likewise a parameterized-by-dim1 path for the join relation {fact, dim2}. So what we need is to understand when it's appropriate to do that. Maybe the mere existence of a multiply-parameterized path among fact's paths is sufficient. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
Josh Berkus writes: On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote: On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund and...@2ndquadrant.com mailto:and...@2ndquadrant.com wrote: On 2015-02-26 20:23:33 -0500, Tom Lane wrote: I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. Yea, it was Jan Urbański (CCed). And the project link: https://github.com/wulczer/saio So what we're saying, Grzegorz, is that we would love to see someone pick this up and get it to the point of making it a feature as a GSOC project. I think if you can start from where Jan left off, you could actually complete it. Sorry, late to the party. Yes, I wrote a GEQO replacement that used simulated annealing for my Master's thesis. It got to a point where it was generating plans similar to what GEQO outputs for small queries and better plans for very large ones. The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub repo contains source for the PGCon presentation, which gives a higher-level overview. The big problem turned out to be writing the step function that generates a new join order from a previous one. Look for the Simulated Annealing challenges and Moves generator chapters in my thesis, which are the only interesting ones :) If you'd like to pick up where I left off, I'd be more than happy to help in any way I can. Best, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in pg_dump
On 26/02/2015 12:41, Michael Paquier wrote: On Wed, Feb 25, 2015 at 7:46 AM, Gilles Darold gilles.dar...@dalibo.com wrote: This is a far better patch and the test to export/import of the postgis_topology extension works great for me. Thanks for the work. Attached is a patch that uses an even better approach by querying only once all the FK dependencies of tables in extensions whose data is registered as dumpable by getExtensionMembership(). Could you check that it works correctly? My test case passes, but an extra check would be nice. The patch footprint is pretty low, so we may be able to backport this patch easily. Works great too. I agree that this patch should be backported, but I think we need to warn admins in the release notes that their import/restore scripts may be broken. One of the common workarounds for this issue was to ignore the errors during data import and then reload data from the tables with FK errors at the end of the import. If the admins are not warned, this can lead to duplicate entries or return an error. -- Gilles Darold Consultant PostgreSQL http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] star schema and the optimizer
Marc Cousin cousinm...@gmail.com writes: On 27/02/2015 15:08, Tom Lane wrote: That's right, and as you say, the planning-speed consequences of doing otherwise would be disastrous. What do you mean by disastrous ? Let's assume you have ten fact tables, so ten join conditions (11 tables altogether). As-is, the planner considers ten 2-way joins (all between the fact table and one or another dimension table). At the level of 3-way joins, there are 45 possible join relations each of which has just 2 possible sets of input relations. At the level of 4-way joins, there are 120 join relations to consider each of which can be made in only 3 ways. And so on. It's combinatorial but the growth rate is manageable. On the other hand, if we lobotomize the must-have-a-join-condition filter, there are 11 choose 2 possible 2-way joins, or 55. At the level of 3-way joins, there are 165 possible joins, but each of these can be made in 3 different ways from a 2-way join and a singleton. At the level of 4-way joins, there are 330 possible joins, but each of these can be made in 4 ways from a 3-way join and a singleton plus 6 different ways from two 2-way joins. So at this level we're considering something like 3300 different join paths versus 360 in the existing planner. It gets rapidly worse. The real problem is that in most cases, all those extra paths are utterly useless. They would not be less useless just because you're a data warehouse or whatever. So I'm uninterested in simply lobotomizing that filter. What would make more sense is to notice that the fact table has an index that's amenable to this type of optimization, and then use that to loosen up the join restrictions, rather than just blindly consider useless joins. 
I had actually thought that we'd fixed this type of problem in recent versions, and that you should be able to get a plan that would look like Nestloop - scan dim1 - Nestloop - scan dim2 - indexscan fact table using dim1.a and dim2.b which would arise from developing an indexscan on fact that's parameterized by both other tables, resolving one of those parameterizations via a join to dim2, and then the other one via a join to dim1. I'm not sure offhand why that isn't working in this example. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
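Tom's growth figures above can be checked directly with a little counting script (a sanity check of the quoted numbers, assuming one fact table joined to each of ten dimension tables, with no join conditions among the dimensions):

```python
from math import comb

n_dims = 10          # ten dimension tables
n_rels = n_dims + 1  # plus the fact table

# With the must-have-a-join-condition filter: every join relation must
# include the fact table, so a k-way join just picks k-1 of the dimensions.
two_way   = comb(n_dims, 1)  # 10 two-way joins
three_way = comb(n_dims, 2)  # 45 three-way join relations
four_way  = comb(n_dims, 3)  # 120 four-way join relations

# Without the filter: any subset of the 11 base relations is a candidate
# join relation, cross joins included.
two_way_all   = comb(n_rels, 2)  # 55
three_way_all = comb(n_rels, 3)  # 165
four_way_all  = comb(n_rels, 4)  # 330

print(two_way, three_way, four_way)
print(two_way_all, three_way_all, four_way_all)
```

The per-relation input-combination counts Tom cites (2 ways at level 3, 3 at level 4, and so on, versus many more once cross joins are admitted) multiply on top of these relation counts, which is where the roughly 3300-versus-360 path comparison at level 4 comes from.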
Re: [HACKERS] logical column ordering
Arthur Silva wrote: Sorry to intrude, I've been following this post and I was wondering if it would allow (in the currently planned form or in the future) a wider set of non-rewriting DDLs in Postgres. For example, dropping a column without rewriting the table. Perhaps. But dropping a column already does not rewrite the table, it only marks the column as dropped in the system catalogs, so you'd need a better example. One obvious example is that you have CREATE TABLE t ( t1 int, t3 int ); and later want to add t2 in the middle; the only way currently is to drop the table and start again (re-creating all dependent views, FKs, etc). With the patch you will be able to add the column at the right place. If no default value is supplied for the new column, no table rewrite is necessary at all. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 27.2.2015 20:34, Alvaro Herrera wrote: Tomas Vondra wrote: I think we could add calls to the randomization functions into some of the regression tests (say 'create_tables.sql'), but that makes regression tests ... well, random, and I'm not convinced that's a good thing. Also, this makes regression tests harder to reason about, because SELECT * does different things depending on the attlognum order. No, that approach doesn't seem very useful. Rather, randomize the columns in the CREATE TABLE statement, and then fix up the attlognums so that the SELECT * expansion is the same as it would be with the not-randomized CREATE TABLE. Yes, that's a possible approach too - possibly a better one for regression tests, as it fixes the 'SELECT *' while effectively using fixed 'attlognum' and 'attnum' values (it's difficult to randomize those, as they may be referenced in other catalogs). -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
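Alvaro's suggestion, shuffling the column order in CREATE TABLE while pinning the logical order back to the original, amounts to a small statement generator like the hypothetical test helper below. Note that attlognum is the column added by the patch under discussion, and a real helper would have to assign all attlognums in one step to avoid transient duplicates:

```python
import random

def randomized_create_table(table, columns, seed=None):
    """Emit a CREATE TABLE with shuffled physical column order, plus the
    UPDATEs that restore the original order as the logical (SELECT *)
    order via the patch's pg_attribute.attlognum column.

    `columns` is a list of (name, type) pairs in the desired logical order.
    """
    rng = random.Random(seed)
    physical = list(columns)
    rng.shuffle(physical)  # randomize physical (on-disk) order

    stmts = ["CREATE TABLE %s (%s);" % (
        table, ", ".join("%s %s" % col for col in physical))]
    # Pin logical order back to the caller-supplied order, so SELECT *
    # expands identically to the non-randomized CREATE TABLE.
    for lognum, (name, _) in enumerate(columns, start=1):
        stmts.append(
            "UPDATE pg_attribute SET attlognum = %d "
            "WHERE attrelid = '%s'::regclass AND attname = '%s';"
            % (lognum, table, name))
    return stmts

for s in randomized_create_table("t", [("a", "int"), ("b", "text")], seed=42):
    print(s)
```

Running this over the regression schema would exercise (attnum != attlognum) descriptors everywhere while keeping the expected test output unchanged, which is exactly the property Tomas wants for stable tests.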
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Pavel Stehule (pavel.steh...@gmail.com) wrote: this topic should be divided, please. One part - functions for loading pg_hba and translating it to some table. Can be two, can be one with one parameter. It will probably be used by advanced users, and I am able to accept it as you or Tomas proposed. I'm not sure why you'd need a parameter for the function. The function could back a view too. In general, I think you're agreeing with me that it'd be a useful capability to have. The second part is the content of the view pg_hba_conf. There should be only one, and it should show the active content. I mean content that is actively used - when another session is started. I am strongly against behavior where I have to close my session to refresh the content of this view (after a reload) - and I am against seeing inactive content there. Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] logical column ordering
OK, so let me summarize here (the other posts in this subthread are discussing the user interface, not the use cases, so I'll respond here). There are two main use cases: 1) change the order of columns in SELECT * - display columns so that related ones are grouped together (irrespective of whether they were added later, etc.) - keep columns synced with COPY - requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _) 2) optimization of physical order (efficient storage / tuple deforming) - more efficient order for storage (deforming) - may be done manually by reordering columns in CREATE TABLE - should be done automatically (no user interface required) - seems useful both for on-disk physical tuples, and virtual tuples Anything else? -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
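[Editor's note: for illustration, the interface discussed for use case 1 might look as follows. This is the syntax proposed in this thread, not implemented anywhere; the table and column names are made up.]

    -- Proposed syntax (not implemented): move a column within the
    -- logical (SELECT *) order without touching physical storage.
    ALTER TABLE customer ALTER COLUMN account_num SET ORDER 2;

    -- After the change, SELECT * and COPY both expand columns in the
    -- new logical order, while attnum (identity) and attphysnum
    -- (storage position) stay unchanged.
    COPY customer TO STDOUT;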
Re: [HACKERS] logical column ordering
On 27.2.2015 19:50, Alvaro Herrera wrote: Tomas Vondra wrote: On 26.2.2015 23:42, Kevin Grittner wrote: One use case is to be able to suppress default display of columns that are used for internal purposes. For example, incremental maintenance of materialized views will require storing a count(t) column, and sometimes state information for aggregate columns, in addition to what the users explicitly request. At the developers' meeting there was discussion of whether and how to avoid displaying these by default, and it was felt that when we have this logical column ordering it would be good to have a way to suppress default display. Perhaps this could be as simple as a special value for logical position. I don't see how hiding columns is related to this patch at all. That's a completely unrelated thing, and it certainly is not part of this patch. It's not directly related to the patch, but I think the intent is that once we have this patch it will be possible to apply other transformations, such as having columns that are effectively hidden -- consider for example the idea that attlognum be set to a negative number. (For instance, consider the idea that system columns may all have -1 as attlognum, which would just be an indicator that they are never present in logical column expansion. That makes sense to me; what reason do we have to keep them using the current attnums they have?) My vote is against reusing attlognums in this way - I see that as a misuse, making the code needlessly complex. A better way to achieve this is to introduce a simple 'is hidden' flag into pg_attribute, and something as simple as ALTER TABLE t ALTER COLUMN a SHOW; ALTER TABLE t ALTER COLUMN a HIDE; or something along those lines. Not only is that cleaner, it's also a better interface for the users than 'you have to set attlognum to a negative value.'
-- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] logical column ordering
On 26.2.2015 23:34, Andres Freund wrote: On 2015-02-26 16:16:54 -0600, Jim Nasby wrote: On 2/26/15 4:01 PM, Alvaro Herrera wrote: The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then let the rest be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows. I agree that it's a sane order of developing things. But: I don't think it makes sense to commit it without the capability to change the order. Not so much because it'll not serve a purpose at that point, but rather because it'll be essentially untestable. And this is a patch that needs a fair amount of changes, both automated, and manual. I agree that committing something without code that actually uses it in practice is not the best approach. That's one of the reasons why I was asking for the use cases we expect from this. At least during development I'd even add a function that randomizes the physical order of columns, and keeps the logical one. Running the regression tests that way seems likely to catch a fair number of bugs. That's not all that difficult, and can be done in 10 lines of PL/pgSQL - see the attached file. Should be sufficient for development testing (and in production there's not much use for that anyway). +1. This patch is already several years old; let's not delay it further. Plus, I suspect that you could hack the catalog directly if you really wanted to change LCO. Worst case you could create a C function to do it. I don't think that's sufficient for testing purposes. Not only for the patch itself, but also for getting it right in new code.
I think we could add calls to the randomization functions to some of the regression tests (say 'create_tables.sql'), but that makes regression tests ... well, random, and I'm not convinced that's a good thing. Also, this makes regression tests harder to think about, because SELECT * does different things depending on the attlognum order. -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services randomize.sql Description: application/sql
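[Editor's note: the attached randomize.sql is not reproduced in this archive. A rough sketch of the kind of helper being discussed might look like the following. It assumes the patch's three-column scheme in pg_attribute (attnum, attlognum, attphysnum) and does a direct catalog update, so it is a development-only illustration, not something runnable on a stock server.]

    -- Development-only sketch: shuffle the physical order of a table's
    -- user columns while leaving the logical (SELECT *) order intact.
    -- Assumes the patched catalog with an attphysnum column.
    CREATE OR REPLACE FUNCTION randomize_physical_order(rel regclass)
    RETURNS void LANGUAGE plpgsql AS $$
    BEGIN
        UPDATE pg_attribute a
           SET attphysnum = p.new_pos
          FROM (SELECT attnum,
                       row_number() OVER (ORDER BY random()) AS new_pos
                  FROM pg_attribute
                 WHERE attrelid = rel AND attnum > 0) p
         WHERE a.attrelid = rel AND a.attnum = p.attnum;
    END;
    $$;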
Re: [HACKERS] logical column ordering
On 28/02/15 09:09, Josh Berkus wrote: Tomas, So for an API, 100% of the use cases I have for this feature would be satisfied by: ALTER TABLE __ ALTER COLUMN _ SET ORDER # and: ALTER TABLE _ ADD COLUMN colname coltype ORDER # If that's infeasible, a function would be less optimal, but would work: SELECT pg_column_order(schemaname, tablename, colname, attnum) If you set the order # to one where a column already exists, other column attnums would get bumped down, closing up any gaps in the process. Obviously, this would require some kind of exclusive lock, but then ALTER TABLE usually does, doesn't it? Might be an idea to allow lists of columns and their starting position. ALTER TABLE customer ALTER COLUMN (job_id, type_id, account_num) SET ORDER 3; So in a table with fields: 1. id 2. *account_num* 3. dob 4. start_date 5. *type_id* 6. preferred_status 7. *job_id* 8. comment would end up like: 1. id 2. dob 3. *job_id* 4. *type_id* 5. *account_num* 6. start_date 7. preferred_status 8. comment I am assuming positions are numbered from 1 onwards. Cheers, Gavin
Re: [HACKERS] logical column ordering
On 27.2.2015 21:42, Josh Berkus wrote: On 02/27/2015 12:25 PM, Tomas Vondra wrote: On 27.2.2015 21:09, Josh Berkus wrote: Tomas, So for an API, 100% of the use cases I have for this feature would be satisfied by: ALTER TABLE __ ALTER COLUMN _ SET ORDER # and: ALTER TABLE _ ADD COLUMN colname coltype ORDER # Yes, I imagined an interface like that. Just to be clear, you're talking about logical order (and not a physical one), right? Correct. The only reason to rearrange the physical columns is in order to optimize, which presumably would be carried out by a utility command, e.g. VACUUM FULL OPTIMIZE. I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM FULL OPTIMIZE might do the same thing. If we ignore the system columns, the current implementation assumes that the values in each of the three columns (attnum, attlognum and attphysnum) are distinct and within 1..natts. So there are no gaps and you'll always set the value to an existing one (so yes, shuffling is necessary). And yes, that certainly requires an exclusive lock on pg_attribute (I don't think we need a lock on the table itself). If MVCC on pg_attribute is good enough to not lock against concurrent SELECT *, then that would be lovely. Yeah, I think this will need a bit more thought. We certainly don't want blocking queries on the table, but we need a consistent list of column definitions from pg_attribute. -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] logical column ordering
Tomas Vondra wrote: 1) change the order of columns in SELECT * - display columns so that related ones are grouped together (irrespective of whether they were added later, etc.) - keep columns synced with COPY - requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _) Not sure about the ORDER # syntax. In ALTER ENUM we have AFTER value and such ... I'd consider that instead. 2) optimization of physical order (efficient storage / tuple deforming) - more efficient order for storage (deforming) - may be done manually by reordering columns in CREATE TABLE - should be done automatically (no user interface required) A large part of it can be done automatically: for instance, not-nullable fixed length types ought to come first, because that enables the attcacheoff optimizations in heaptuple.c to fire for more columns. But what column comes next? The offset of the column immediately after them can also be cached, and so it would be faster to obtain than other attributes. Which one to choose here is going to be a coin toss in most cases, but I suppose there are cases out there which can benefit from having a particular column there. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
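[Editor's note: the manual version of that optimization is simply to put fixed-length NOT NULL columns first in CREATE TABLE. A hedged sketch, with made-up table and column names, of why the attcacheoff effect described above can pay off:]

    -- Fixed-length, not-nullable columns first: their offsets within the
    -- tuple are constant, so heaptuple.c can cache them (attcacheoff)
    -- and avoid walking earlier columns during tuple deforming.
    -- Variable-length / nullable columns go at the end.
    CREATE TABLE measurements (
        id         bigint      NOT NULL,   -- fixed 8 bytes
        taken_at   timestamptz NOT NULL,   -- fixed 8 bytes
        reading    float8      NOT NULL,   -- fixed 8 bytes
        sensor_id  integer     NOT NULL,   -- fixed 4 bytes
        note       text                    -- varlena, nullable: last
    );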
Re: [HACKERS] logical column ordering
On 28/02/15 09:49, Alvaro Herrera wrote: Tomas Vondra wrote: 1) change the order of columns in SELECT * - display columns so that related ones are grouped together (irrespective of whether they were added later, etc.) - keep columns synced with COPY - requires user interface (ALTER TABLE _ ALTER COLUMN _ SET ORDER _) Not sure about the ORDER # syntax. In ALTER ENUM we have AFTER value and such ... I'd consider that instead. 2) optimization of physical order (efficient storage / tuple deforming) - more efficient order for storage (deforming) - may be done manually by reordering columns in CREATE TABLE - should be done automatically (no user interface required) A large part of it can be done automatically: for instance, not-nullable fixed length types ought to come first, because that enables the attcacheoff optimizations in heaptuple.c to fire for more columns. But what column comes next? The offset of the column immediately after them can also be cached, and so it would be faster to obtain than other attributes. Which one to choose here is going to be a coin toss in most cases, but I suppose there are cases out there which can benefit from having a particular column there. Possibly, if there is no obvious (to the system) way of deciding the order of two columns, then the logical order should be used? Either the order does not really matter, or an expert DBA might know which is more efficient. Cheers, Gavin
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-27 19:32 GMT+01:00 Stephen Frost sfr...@snowman.net: * Pavel Stehule (pavel.steh...@gmail.com) wrote: 2015-02-27 17:59 GMT+01:00 Stephen Frost sfr...@snowman.net: I don't think we actually care what the current contents are from the backend's point of view- after all, when does an individual backend ever use the contents of pg_hba.conf after it's up and running? What would make sense, to me at least, would be: pg_hba_configured() -- spits back whatever the config file has pg_hba_active() -- shows what the postmaster is using currently I disagree and I dislike this direction. It looks like over-engineering. * loading every time is wrong, because you may see data that is not active. That's the point of the two functions- one to give you what a reload *now* would, and one to see what's currently active. * ignoring reload is an attack on the mental health of our users. Huh? this topic should be divided, please. One part - functions for loading pg_hba and translating to some table. Can be two, can be one with one parameter. It will be used probably by advanced users, and I am able to accept it like you or Tomas proposed. Second part is the content of the view pg_hba_conf. It should be only one and should show the active content. I mean the content that is actively used when a new session is started. I am strongly against behavior where I have to close the session to refresh the content of this view (after a reload) - and I am against seeing inactive content there. Regards Pavel It should work like pg_settings. I need to see what is wrong in this moment in pg_hba.conf, not what was or what will be wrong. That's what pg_hba_active() would be from above, yes. We can load any config files via an admin contrib module - so it is not necessary to repeat the same functionality. That's hardly the same- I can't (easily, anyway) join the results of pg_read() to pg_hba_active() and see what's different or the same. Thanks, Stephen
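[Editor's note: one possible shape for the view being discussed, sketched purely for illustration. The view name, the SRF it wraps, and the column list are all hypothetical - nothing like this existed at the time of the thread.]

    -- Hypothetical shape of a pg_settings-like view over pg_hba.conf:
    -- one row per rule, reflecting what the running postmaster uses.
    CREATE VIEW pg_hba_active AS
        SELECT line_number, type, database, user_name,
               address, netmask, auth_method, options
          FROM pg_hba_active_rules();  -- hypothetical SRF over parsed rules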
Re: [HACKERS] logical column ordering
On Fri, Feb 27, 2015 at 4:44 PM, Tomas Vondra tomas.von...@2ndquadrant.com wrote: On 27.2.2015 20:34, Alvaro Herrera wrote: Tomas Vondra wrote: I think we could add calls to the randomization functions to some of the regression tests (say 'create_tables.sql'), but that makes regression tests ... well, random, and I'm not convinced that's a good thing. Also, this makes regression tests harder to think about, because SELECT * does different things depending on the attlognum order. No, that approach doesn't seem very useful. Rather, randomize the columns in the CREATE TABLE statement, and then fix up the attlognums so that the SELECT * expansion is the same as it would be with the not-randomized CREATE TABLE. Yes, that's a possible approach too - possibly a better one for regression tests as it fixes the 'SELECT *' but it effectively uses fixed 'attlognum' and 'attnum' values (it's difficult to randomize those, as they may be referenced in other catalogs). -- Tomas Vondra    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services Sorry to intrude, I've been following this post and I was wondering if it would allow (in the currently planned form or in the future) a wider set of non-rewriting DDLs in Postgres. For example, drop a column without rewriting the table.
Re: [HACKERS] logical column ordering
On 02/27/2015 12:25 PM, Tomas Vondra wrote: On 27.2.2015 21:09, Josh Berkus wrote: Tomas, So for an API, 100% of the use cases I have for this feature would be satisfied by: ALTER TABLE __ ALTER COLUMN _ SET ORDER # and: ALTER TABLE _ ADD COLUMN colname coltype ORDER # Yes, I imagined an interface like that. Just to be clear, you're talking about logical order (and not a physical one), right? Correct. The only reason to rearrange the physical columns is in order to optimize, which presumably would be carried out by a utility command, e.g. VACUUM FULL OPTIMIZE. If we ignore the system columns, the current implementation assumes that the values in each of the three columns (attnum, attlognum and attphysnum) are distinct and within 1..natts. So there are no gaps and you'll always set the value to an existing one (so yes, shuffling is necessary). And yes, that certainly requires an exclusive lock on pg_attribute (I don't think we need a lock on the table itself). If MVCC on pg_attribute is good enough to not lock against concurrent SELECT *, then that would be lovely. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] logical column ordering
Tomas Vondra wrote: I think we could add calls to the randomization functions to some of the regression tests (say 'create_tables.sql'), but that makes regression tests ... well, random, and I'm not convinced that's a good thing. Also, this makes regression tests harder to think about, because SELECT * does different things depending on the attlognum order. No, that approach doesn't seem very useful. Rather, randomize the columns in the CREATE TABLE statement, and then fix up the attlognums so that the SELECT * expansion is the same as it would be with the not-randomized CREATE TABLE. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] logical column ordering
Tomas Vondra wrote: On 27.2.2015 20:34, Alvaro Herrera wrote: Tomas Vondra wrote: I think we could add calls to the randomization functions to some of the regression tests (say 'create_tables.sql'), but that makes regression tests ... well, random, and I'm not convinced that's a good thing. Also, this makes regression tests harder to think about, because SELECT * does different things depending on the attlognum order. No, that approach doesn't seem very useful. Rather, randomize the columns in the CREATE TABLE statement, and then fix up the attlognums so that the SELECT * expansion is the same as it would be with the not-randomized CREATE TABLE. Yes, that's a possible approach too - possibly a better one for regression tests as it fixes the 'SELECT *' but it effectively uses fixed 'attlognum' and 'attnum' values (it's difficult to randomize those, as they may be referenced in other catalogs). Why would you care what values are used as attnum? If anything misbehaves, surely that would be a bug in the patch. (Of course, you can't just change the numbers much later after the fact, because the attnum values could have propagated into other tables via foreign keys and such; it needs to be done while executing CREATE TABLE or immediately thereafter.) -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] POLA violation with \c service=
On Fri, Feb 27, 2015 at 02:51:18PM -0300, Alvaro Herrera wrote: I don't understand. Why don't these patches move anything to src/common? Because I misunderstood the scope. Hope to get to those this evening. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] plpgsql versus domains
Robert Haas robertmh...@gmail.com writes: On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: To some extent this is a workaround for the fact that plpgsql does type conversions the way it does (ie, by I/O rather than by using the parser's cast mechanisms). We've talked about changing that, but people seem to be afraid of the compatibility consequences, and I'm not planning to fight two compatibility battles concurrently ;-) A sensible plan, but since we're here: - I do agree that changing the way PL/pgsql does those type conversions is scary. I can't predict what will break, and there's no getting around the scariness of that. - On the other hand, I do think it would be good to change it, because this sucks: rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric); end $$; ERROR: invalid input syntax for integer: 3. CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment If we did want to open that can of worms, my proposal would be to make plpgsql type conversions work like so: * If there is a cast pathway known to the core SQL parser, use that mechanism. * Otherwise, attempt to convert via I/O as we do today. This seems to minimize the risk of breaking things, although there would probably be corner cases that work differently (for instance I bet boolean to text might turn out differently). In the very long run we could perhaps deprecate and remove the second phase. regards, tom lane
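[Editor's note: until/unless that changes, the usual workaround is to make the cast explicit in the expression, so it is resolved by the SQL parser's cast machinery before plpgsql's I/O-based assignment conversion ever sees a numeric value:]

    -- Works on the servers discussed here: the explicit ::int cast is
    -- applied by the SQL parser (numeric -> int), so plpgsql never
    -- tries to feed the numeric output text to int4in.
    DO $$
    DECLARE x int;
    BEGIN
        x := ((3.0::numeric)/(1.0::numeric))::int;
        RAISE NOTICE 'x = %', x;   -- x = 3
    END $$;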
Re: [HACKERS] plpgsql versus domains
2015-02-27 21:57 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Thu, Feb 26, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: To some extent this is a workaround for the fact that plpgsql does type conversions the way it does (ie, by I/O rather than by using the parser's cast mechanisms). We've talked about changing that, but people seem to be afraid of the compatibility consequences, and I'm not planning to fight two compatibility battles concurrently ;-) A sensible plan, but since we're here: - I do agree that changing the way PL/pgsql does those type conversions is scary. I can't predict what will break, and there's no getting around the scariness of that. - On the other hand, I do think it would be good to change it, because this sucks: rhaas=# do $$ declare x int; begin x := (3.0::numeric)/(1.0::numeric); end $$; ERROR: invalid input syntax for integer: 3. CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment If we did want to open that can of worms, my proposal would be to make plpgsql type conversions work like so: * If there is a cast pathway known to the core SQL parser, use that mechanism. * Otherwise, attempt to convert via I/O as we do today. This seems to minimize the risk of breaking things, although there would probably be corner cases that work differently (for instance I bet boolean to text might turn out differently). In the very long run we could perhaps deprecate and remove the second phase. +1 There could be a solution similar to the plpgsql/sql identifier priority configuration: several levels, with what you are proposing as the default. That works perfectly - from my own experience and what I know from my neighborhood, there are no issues. regards, tom lane
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Stephen Frost sfr...@snowman.net writes: Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. I think that's going to be a lot harder than you realize, and it will have undesirable security implications, in that whatever you do to expose the postmaster's internal state to backends will also make it visible to other onlookers; not to mention probably adding new failure modes. There are also nontrivial semantic differences in this area between Windows and other platforms (ie in an EXEC_BACKEND build the current file contents *are* the active version). If you insist on two views you will need to explain why/how they act differently on different platforms. I think the proposed mechanism (ie read and return the current contents of the file) is just fine, and we should stop there rather than engineering this to death. We've survived twenty years with *no* feature of this sort, how is it suddenly essential that we expose postmaster internal state? regards, tom lane
Re: [HACKERS] logical column ordering
On Feb 27, 2015 5:02 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Arthur Silva wrote: Sorry to intrude, I've been following this post and I was wondering if it would allow (in the currently planned form or in the future) a wider set of non-rewriting DDLs in Postgres. For example, drop a column without rewriting the table. Perhaps. But dropping a column already does not rewrite the table, only marks the column as dropped in system catalogs, so you'd need a better example. One obvious example is that you have CREATE TABLE t ( t1 int, t3 int ); and later want to add t2 in the middle; the only way currently is to drop the table and start again (re-creating all dependent views, FKs, etc). With the patch you will be able to add the column at the right place. If no default value is supplied for the new column, no table rewrite is necessary at all. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services Cool! I didn't know I could drop stuff without rewriting. Ya, that's another example, people do these from GUI tools. That's a nice side effect. Cool (again)!
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
2015-02-27 22:26 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Stephen Frost sfr...@snowman.net writes: Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. I think that's going to be a lot harder than you realize, and it will have undesirable security implications, in that whatever you do to expose the postmaster's internal state to backends will also make it visible to other onlookers; not to mention probably adding new failure modes. We can make a copy of pg_hba.conf somewhere when the postmaster starts or is reloaded. Later, we can read this copy from the child processes. Is that a possibility? Regards Pavel There are also nontrivial semantic differences in this area between Windows and other platforms (ie in an EXEC_BACKEND build the current file contents *are* the active version). If you insist on two views you will need to explain why/how they act differently on different platforms. I think the proposed mechanism (ie read and return the current contents of the file) is just fine, and we should stop there rather than engineering this to death. We've survived twenty years with *no* feature of this sort, how is it suddenly essential that we expose postmaster internal state? regards, tom lane
Re: [HACKERS] Reduce pinning in btree indexes
Kyotaro == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Kyotaro Just a reminder, but I am not the author of this patch:) Yes, I understand that. Kyotaro Mmm? The patch bt-nopin-v1.patch seems not contain the change Kyotaro for ExecSupportMarkRestore and the very simple function remain Kyotaro looking to return true for T_Index(Only)Scan after the patch Kyotaro applied. Right. I'm suggesting you change that, in order to determine what performance cost, if any, would result from abandoning the idea of doing mark/restore entirely. Kyotaro I understand that you'd like to see the net drag of Kyotaro performance by the memcpy(), right? No. What I am suggesting is this: if mark/restore is a performance issue, then it would be useful to know how much gain we're getting (if any) from supporting it _at all_. Let me try and explain it another way. If you change ExecSupportMarkRestore to return false for index scans, then btmarkpos/btrestorepos will no longer be called. What is the performance of this case compared to the original and patched versions? -- Andrew (irc:RhodiumToad)
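[Editor's note: for context, mark/restore is exercised by merge joins, where the inner index scan is re-positioned whenever the outer side has duplicate join keys. A query of roughly this shape (the table names are made up) drives btmarkpos/btrestrpos when the planner picks a merge join; if ExecSupportsMarkRestore returned false for index scans, the planner would put a Materialize node above the inner scan instead:]

    -- Force a merge join between two indexed columns so the inner
    -- index scan's position gets marked and restored for duplicates.
    SET enable_hashjoin = off;
    SET enable_nestloop = off;
    EXPLAIN SELECT *
      FROM big_a a JOIN big_b b ON a.k = b.k;  -- expect a Merge Join plan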
Re: [HACKERS] Reduce pinning in btree indexes
Just a reminder, but I am not the author of this patch:) At Fri, 27 Feb 2015 07:26:38 +, Andrew Gierth and...@tao11.riddles.org.uk wrote in 87zj7z6ckc@news-spur.riddles.org.uk Kyotaro == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: You might want to try running the same test, but after patching ExecSupportsMarkRestore to return false for index scans. This will cause the planner to insert a Materialize node to handle the mark/restore. Kyotaro Mmm? The patch bt-nopin-v1.patch seems not contain the change Kyotaro for ExecSupportMarkRestore and the very simple function remain Kyotaro looking to return true for T_Index(Only)Scan after the patch Kyotaro applied. Right. I'm suggesting you change that, in order to determine what performance cost, if any, would result from abandoning the idea of doing mark/restore entirely. I understand that you'd like to see the net drag on performance from the memcpy(), right? That doesn't seem to be possible without breaking (a part of) the patch's function, but, for the btmarkpos()-only case, the mark struct can contain garbage, so changing only the refcount is viable to see the pure loss from the memcpy(). The attached patch is the destruction I did, and the result was like below. Case 2. 1M markpos, no restorepos for 1M result rows. IndexOnly scan: The patch loses about 10%. master: 3655 ms (std dev = 2.5 ms) Patched: 4038 ms (std dev = 2.6 ms) Patched: 4036 ms (std dev = 3.5 ms) Broken: 3718 ms (std dev = 1.6 ms) The second Patched line above is the re-run numbers. It's a little under a 9% loss from the broken version. That is the pure drag of the memcpy(). This seems quite big, as Heikki said. It's an extreme case, though. The remaining 1.7% should be the share of all the other stuff.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center

diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 65daf27..4973b63 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -547,12 +547,17 @@ btmarkpos(PG_FUNCTION_ARGS)
 	BTScanOpaque so = (BTScanOpaque) scan->opaque;
 
 	hoge_markpos_count++;
-	memcpy(&so->markPos, &so->currPos,
-		   offsetof(BTScanPosData, items[1]) +
-		   so->currPos.lastItem * sizeof(BTScanPosItem));
-	if (so->markTuples)
-		memcpy(so->markTuples, so->currTuples,
-			   so->currPos.nextTupleOffset);
+//	memcpy(&so->markPos, &so->currPos,
+//		   offsetof(BTScanPosData, items[1]) +
+//		   so->currPos.lastItem * sizeof(BTScanPosItem));
+//	if (so->markTuples)
+//		memcpy(so->markTuples, so->currTuples,
+//			   so->currPos.nextTupleOffset);
+	if (BTScanPosIsValid(so->markPos))
+	{
+		ReleaseBuffer(so->markPos.buf);
+		so->markPos.buf = InvalidBuffer;
+	}
 
 	/* We don't take out an extra pin for the mark position. */
 	so->markPos.buf = InvalidBuffer;
@@ -599,12 +604,13 @@ btrestrpos(PG_FUNCTION_ARGS)
 	if (BTScanPosIsValid(so->markPos))
 	{
-		memcpy(&so->currPos, &so->markPos,
-			   offsetof(BTScanPosData, items[1]) +
-			   so->markPos.lastItem * sizeof(BTScanPosItem));
-		if (so->currTuples)
-			memcpy(so->currTuples, so->markTuples,
-				   so->markPos.nextTupleOffset);
+//		memcpy(&so->currPos, &so->markPos,
+//			   offsetof(BTScanPosData, items[1]) +
+//			   so->markPos.lastItem * sizeof(BTScanPosItem));
+//		if (so->currTuples)
+//			memcpy(so->currTuples, so->markTuples,
+//				   so->markPos.nextTupleOffset);
+		IncrBufferRefCount(so->markPos.buf);
 	}
 	else
 		BTScanPosInvalidate(so->currPos);
Re: [HACKERS] Reduce pinning in btree indexes
>>>>> "Kyotaro" == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:

 Kyotaro> Anyway I'm sorry but I have left my dev env and cannot have
 Kyotaro> time till next week.

The point is moot; rather than try and explain it further I did the test myself, and the performance penalty of disabling mark/restore is substantial, on the order of 30%, so that's a non-starter. (I was a bit surprised that it was so bad...)

-- 
Andrew (irc:RhodiumToad)
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. I think that's going to be a lot harder than you realize, and it will have undesirable security implications, in that whatever you do to expose the postmaster's internal state to backends will also make it visible to other onlookers; not to mention probably adding new failure modes. I had been considering what it'd take (and certainly appreciated that it's not trivial to look at the postmaster post-fork) but had also been thinking a simpler approach might be possible (one which doesn't involve *directly* looking at the postmaster config)- we could probably reasonably consider whatever the backend has currently is the same as the active configuration, provided we reload the pg_hba.conf into the backends when a sighup is sent, just as we do with postgresql.conf. I understand that there may be objections to that on the basis that it's work that's (other than for this case) basically useless, but then again, it's not like we anticipate reloads happening with a high frequency or that we expect loading these files to take all that long. The original patch only went off of what was in place when the backend was started from the postmaster and the later patch changed it to just always show what was currently in the pg_hba.conf file, but what everyone on this thread (except those worried more about the implementation and less about the capability) expects and wants is pg_settings, but for pg_hba. The way we get that is to do it exactly the same as how we handle pg_settings. I think the proposed mechanism (ie read and return the current contents of the file) is just fine, and we should stop there rather than engineering this to death. 
 We've survived twenty years with *no* feature of this sort, how is it
 suddenly essential that we expose postmaster internal state?

Perhaps I'm missing something, but I really don't see it as a huge deal to just reload pg_hba.conf in the backends when a SIGHUP is received, which would provide pg_settings-like results (ignoring that you can set GUCs directly in the backend, of course). The function that loads the file would have two paths through it: one which actually updates the global setting in the backend during a SIGHUP, and one which returns the results in variables passed in by the caller (or similar), allowing the contents of the current pg_hba.conf, as parsed, to be returned in an SRF.

Thanks!

Stephen
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Pavel, * Pavel Stehule (pavel.steh...@gmail.com) wrote: 2015-02-27 22:26 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Stephen Frost sfr...@snowman.net writes: Right, we also need a view (or function, or both) which provides what the *active* configuration of the running postmaster is. This is exactly what I was proposing (or what I was intending to, at least) with pg_hba_active, so, again, I think we're in agreement here. I think that's going to be a lot harder than you realize, and it will have undesirable security implications, in that whatever you do to expose the postmaster's internal state to backends will also make it visible to other onlookers; not to mention probably adding new failure modes. we can do copy of pg_hba.conf somewhere when postmaster starts or when it is reloaded. Please see my reply to Tom. There's no trivial way to reach into the postmaster from a backend- but we do get a copy of whatever the postmaster had when we forked, and the postmaster only reloads pg_hba.conf on a sighup and that sighup is passed down to the children, so we simply need to also reload the pg_hba.conf in the children when they get a sighup. That's how postgresql.conf is handled, which is what pg_settings is based off of, and I believe is the behavior folks are really looking for. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] Re: [COMMITTERS] pgsql: Invent a memory context reset/delete callback mechanism.
On Fri, 2015-02-27 at 22:17 +, Tom Lane wrote: In passing, per discussion, rearrange some boolean fields in struct MemoryContextData so as to avoid wasted padding space. For safety, this requires making allowInCritSection's existence unconditional; but I think that's a better approach than what was there anyway. I notice that this uses the bytes in MemoryContextData that I was hoping to use for the memory accounting patch (for memory-bounded HashAgg). Leaving no space for even a 32-bit value (without AllocSetContext growing beyond the magical 192-byte number Andres mentioned) removes most of the options for memory accounting. The only things I can think of are: 1. Move a few less-frequently-accessed fields out to an auxiliary structure (Tomas proposed something similar); or 2. Walk the memory contexts, but use some kind of estimation/extrapolation so that it doesn't need to be done for every input tuple of HashAgg. Thoughts? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New CF app deployment
On Fri, Feb 27, 2015 at 11:32 AM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: On 02/26/2015 01:59 PM, Michael Paquier wrote: On Thu, Feb 26, 2015 at 9:15 PM, Asif Naeem anaeem...@gmail.com wrote: This thread seems relevant. Please guide me on how I can access older CF pages. The MSVC portion of this fix got completely lost in the void: https://commitfest.postgresql.org/action/patch_view?id=1330 The above link results in the following error, i.e. Not found The specified URL was not found. Please do let me know if I missed something. Thanks. Try commitfest-old instead, that is where the past CF app stores its data, like that: https://commitfest-old.postgresql.org/action/patch_view?id=1330 hmm maybe we should have some sort of handler that redirects/reverse-proxies to the old commitfest app for this. +1 for seamless integration; at the least, the error message needs to be more helpful. Thanks. Stefan
Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized
On 2015-02-27 16:26:08 +0900, Michael Paquier wrote: On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote: On 02/20/2015 05:21 PM, Andres Freund wrote: There's one bit that I'm not so sure about though: To avoid duplication I've added Parse(Commit/Abort)Record(), but unfortunately that has to be available both in front and backend code - so it's currently living in xactdesc.c. I think we can live with that, but it's certainly not pretty. Yeah, that's ugly. Why does frontend code need that? The old format isn't exactly trivial for frontend code to decode either. pg_xlogdump outputs subxacts and such; I don't foresee other usages. Sure, we could copy the code around, but I think that's worse than having it in xactdesc.c. Needs a comment explaining why it's there if I haven't added one already. FWIW, I think they would live better in frontend code for client applications. What do you mean by that? You mean you'd rather see a copy in pg_xlogdump somewhere? How would you trigger that being used instead of the normal description routine?

+/* free opcode 0x70 */
+
+#define XLOG_XACT_OPMASK 0x70

There is a contradiction here, 0x70 is not free. The above is a mask, not an opcode - so I don't think it's a contradiction. I'll expand on the commentary in the next version. Thanks for having a look!

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] New CF app deployment
Thank you Michael, It works. On Thu, Feb 26, 2015 at 5:59 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Feb 26, 2015 at 9:15 PM, Asif Naeem anaeem...@gmail.com wrote: This thread seems relevant, Please guide me to how can access older CF pages The MSVC portion of this fix got completely lost in the void: https://commitfest.postgresql.org/action/patch_view?id=1330 Above link result in the following error i.e. Not found The specified URL was not found. Please do let me know if I missed something. Thanks. Try commitfest-old instead, that is where the past CF app stores its data, like that: https://commitfest-old.postgresql.org/action/patch_view?id=1330 -- Michael
Re: [HACKERS] Reduce pinning in btree indexes
Hello. Mmm... We might be focusing on slightly different things..

2015/02/27 17:27 Andrew Gierth:

 Kyotaro> I understand that you'd like to see the net drag on
 Kyotaro> performance from the memcpy(), right?

 No. What I am suggesting is this: if mark/restore is a performance
 issue, then it would be useful to know how much gain we're getting (if
 any) from supporting it _at all_.

Mark/restore works as before with this patch, except for the early dropping of buffer pins. As far as I understand, all the stuff in the patch serves that purpose and doesn't intend to boost btree itself; that is, it reduces the chance of blocking vacuum, at the price of a possible burden on general performance. And I think the issue currently in focus is that the burden might be a bit too heavy in a specific situation. Therefore I tried to measure how heavy or light the burden is. Am I correct so far?

 Let me try and explain it another way. If you change
 ExecSupportsMarkRestore to return false for index scans, then
 btmarkpos/btrestorepos will no longer be called. What is the
 performance of this case compared to the original and patched versions?

As you mentioned before, such a change will make the planner turn to, perhaps, a materialize plan or a rescan, or some other scan type, none of which is useful for comparison with an index scan concerning the issue in focus: the performance drag when a btree scan does extremely frequent mark/restore, relative to the unpatched, smaller-copy version. Your suggestion seems to aim at something different from this.

Anyway, I'm sorry, but I have left my dev env and cannot have time till next week.

regards,

-- 
Kyotaro Horiguchi
Re: [HACKERS] MemoryContext reset/delete callbacks
Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 19:28:50 -0500, Tom Lane wrote: 1. I used ilists for the linked list of callback requests. This creates a dependency from memory contexts to lib/ilist. That's all right at the moment since lib/ilist does no memory allocation, but it might be logically problematic. We could just use explicit struct foo * links with little if any notational penalty, so I wonder if that would be better. Maybe I'm partial here, but I don't see a problem. Actually the reason I started the ilist stuff was that I wrote a different memory context implementation ;). Wish I'd time/energy to go back to that... After further reflection, I concluded that it was better to go with the low-tech struct foo *next approach. Aside from the question of whether we really want mcxt.c to have any more dependencies than it already does, there's the stylistic point that mcxt.c is already managing lists (of child contexts) that way; it would be odd to have two different list technologies in use in the one data structure. You could of course argue that both of these should be changed to slists, but that would be a matter for a separate patch. (And frankly, I'm not so in love with the slist notation that I'd think it an improvement.) I also rearranged the bool fields as you suggested to avoid wasted padding space. I'm still not exactly convinced that it's worth the ugliness, but it's not worth arguing about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 02/27/2015 12:48 PM, Tomas Vondra wrote: On 27.2.2015 21:42, Josh Berkus wrote: On 02/27/2015 12:25 PM, Tomas Vondra wrote: On 27.2.2015 21:09, Josh Berkus wrote: Tomas, So for an API, 100% of the use cases I have for this feature would be satisfied by: ALTER TABLE __ ALTER COLUMN _ SET ORDER # and: ALTER TABLE _ ADD COLUMN colname coltype ORDER # Yes, I imagined an interface like that. Just to be clear, you're talking about logical order (and not a physical one), right? Correct. The only reason to rearrange the physical columns is in order to optimize, which presumably would be carried out by a utility command, e.g. VACUUM FULL OPTIMIZE. I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM FULL OPTIMIZE might do the same thing. Actually, I'm going to go back on what I said. We need an API for physical column reordering, even if it's just pg_ functions. The reason is that we want to enable people writing their own physical column re-ordering tools, so that our users can figure out for us what the best reordering algorithm is. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Buildfarm has got the measles
A significant fraction of the buildfarm is showing this in HEAD:

***************
*** 375,382 ****
  create table rewritemetoo1 of rewritetype;
  create table rewritemetoo2 of rewritetype;
  alter type rewritetype alter attribute a type text cascade;
- NOTICE:  Table 'rewritemetoo1' is being rewritten (reason = 4)
  NOTICE:  Table 'rewritemetoo2' is being rewritten (reason = 4)
  -- but this doesn't work
  create table rewritemetoo3 (a rewritetype);
  alter type rewritetype alter attribute a type varchar cascade;
--- 375,382 ----
  create table rewritemetoo1 of rewritetype;
  create table rewritemetoo2 of rewritetype;
  alter type rewritetype alter attribute a type text cascade;
  NOTICE:  Table 'rewritemetoo2' is being rewritten (reason = 4)
+ NOTICE:  Table 'rewritemetoo1' is being rewritten (reason = 4)
  -- but this doesn't work
  create table rewritemetoo3 (a rewritetype);
  alter type rewritetype alter attribute a type varchar cascade;

==========

Evidently the order in which child tables are visited isn't too stable. I'm inclined to think that that's fine and this regression test needs reconsideration.

regards, tom lane
Re: [HACKERS] logical column ordering
On 27.2.2015 23:48, Josh Berkus wrote: On 02/27/2015 12:48 PM, Tomas Vondra wrote: On 27.2.2015 21:42, Josh Berkus wrote: On 02/27/2015 12:25 PM, Tomas Vondra wrote: On 27.2.2015 21:09, Josh Berkus wrote: Tomas, So for an API, 100% of the use cases I have for this feature would be satisfied by: ALTER TABLE __ ALTER COLUMN _ SET ORDER # and: ALTER TABLE _ ADD COLUMN colname coltype ORDER # Yes, I imagined an interface like that. Just to be clear, you're talking about logical order (and not a physical one), right? Correct. The only reason to rearrange the physical columns is in order to optimize, which presumably would be carried out by a utility command, e.g. VACUUM FULL OPTIMIZE. I was thinking more about CREATE TABLE at this moment, but yeah, VACUUM FULL OPTIMIZE might do the same thing. Actually, I'm going to go back on what I said. We need an API for physical column reordering, even if it's just pg_ functions. The reason is that we want to enable people writing their own physical column re-ordering tools, so that our users can figure out for us what the best reordering algorithm is.

I doubt that. For example, do you realize you can only do that while the table is completely empty, and in that case you can just do a CREATE TABLE with the proper order? I also doubt users will be able to optimize the order better than the system, as they usually have no idea of how this actually works internally.

But if we want to allow users to define this, I'd say let's make that part of CREATE TABLE, i.e. the order of columns defines logical order, and you use something like 'AFTER' to specify physical order.

CREATE TABLE test (
    a INT AFTER b,    -- attlognum = 1, attphysnum = 2
    b INT             -- attlognum = 2, attphysnum = 1
);

It might get tricky because of cycles, though.
-- 
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] logical column ordering
Tomas Vondra-4 wrote But if we want to allow users to define this, I'd say let's make that part of CREATE TABLE, i.e. the order of columns defines logical order, and you use something like 'AFTER' to specify physical order. CREATE TABLE test ( a INT AFTER b,-- attlognum = 1, attphysnum = 2 b INT -- attlognum = 2, attphysnum = 1 ); Why not memorialize this as a storage parameter? CREATE TABLE test ( a INT, b INT ) WITH (layout = 'b, a') ; A canonical form should be defined and then ALTER TABLE can either directly update the current value or require the user to specify a new layout before it will perform the alteration. David J. -- View this message in context: http://postgresql.nabble.com/logical-column-ordering-tp5829775p5839825.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 02/27/2015 03:06 PM, Tomas Vondra wrote: On 27.2.2015 23:48, Josh Berkus wrote: Actually, I'm going to go back on what I said. We need an API for physical column reordering, even if it's just pg_ functions. The reason is that we want to enable people writing their own physical column re-ordering tools, so that our users can figure out for us what the best reordering algorithm is. I doubt that. For example, do you realize you can only do that while the table is completely empty, and in that case you can just do a CREATE TABLE with the proper order?

Well, you could recreate the table as the de-facto API, although as you point out below that still requires new syntax. But I was thinking of something which would re-write the table, just like ADD COLUMN x DEFAULT '' does now.

I also doubt the users will be able to optimize the order better than the system, as they usually have no idea of how this actually works internally.

We have a lot of power users, including a lot of the people on this mailing list. Among the things we don't know about ordering optimization:

* How important is it for index performance to keep key columns adjacent?
* How important is it to pack values < 4 bytes, as opposed to packing values which are non-nullable?
* How important is it to pack values of the same size, as opposed to packing values which are non-nullable?

But if we want to allow users to define this, I'd say let's make that part of CREATE TABLE, i.e. the order of columns defines logical order, and you use something like 'AFTER' to specify physical order. CREATE TABLE test ( a INT AFTER b, -- attlognum = 1, attphysnum = 2 b INT -- attlognum = 2, attphysnum = 1 ); It might get tricky because of cycles, though.

It would be a lot easier to allow the user to specify a scalar:

CREATE TABLE test (
    a INT NOT NULL WITH ( lognum 1, physnum 2 ),
    b INT WITH ( lognum 2, physnum 1 )

... and just throw an error if the user creates duplicates or gaps.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] logical column ordering
David G Johnston wrote Tomas Vondra-4 wrote But if we want to allow users to define this, I'd say let's make that part of CREATE TABLE, i.e. the order of columns defines logical order, and you use something like 'AFTER' to specify physical order. CREATE TABLE test ( a INT AFTER b,-- attlognum = 1, attphysnum = 2 b INT -- attlognum = 2, attphysnum = 1 ); Why not memorialize this as a storage parameter? CREATE TABLE test ( a INT, b INT ) WITH (layout = 'b, a') ; A canonical form should be defined and then ALTER TABLE can either directly update the current value or require the user to specify a new layout before it will perform the alteration. David J. Extending the idea a bit further why not have CREATE TABLE be the API; or at least something similar to it? CREATE OR REARRANGE TABLE test ( [...] ) WITH (); The [...] part would be logical and the WITH() would define the physical. The PK would simply be whatever is required to make the command work. David J. -- View this message in context: http://postgresql.nabble.com/logical-column-ordering-tp5829775p5839828.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] star schema and the optimizer
Hi all,

I've been facing an issue with star schemas for a while. PostgreSQL's performance is not that good without rewriting the query (or at least I couldn't find a way to make it do what I want).

Here is a very basic mockup schema I used for the rest of this mail. It's of course too small to see very long runtimes (I see minutes, not tenths of seconds, with the schema that triggered this work):

create table dim1 (a int, b int);
create table dim2 (a int, b int);
insert into dim1 select i,(random()*1000)::int from generate_series(1,10000) g(i);
insert into dim2 select i,(random()*1000)::int from generate_series(1,10000) g(i);
create index idxdim11 on dim1(b);
create index idxdim12 on dim1(a);
create index idxdim21 on dim2(b);
create index idxdim22 on dim2(a);
create table facts (dim1 bigint, dim2 bigint, data1 text, data2 text, data3 text, data4 text, data5 text, data6 text);
insert into facts select (random()*1000)::int, (random()*1000)::int,i,i,i,i,i from generate_series(1,1000) g(i);
create index idxfacts on facts(dim1,dim2);
analyze;

Here is the query that I want to make faster:

SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17;

PostgreSQL creates this plan:

 Nested Loop  (cost=233.98..207630.07 rows=8 width=99) (actual time=131.891..131.891 rows=0 loops=1)
   Join Filter: (facts.dim2 = dim2.a)
   Rows Removed by Join Filter: 239796
   ->  Index Scan using idxdim22 on dim2  (cost=0.29..343.29 rows=9 width=8) (actual time=0.672..4.436 rows=12 loops=1)
         Filter: (b = 17)
         Rows Removed by Filter: 9988
   ->  Materialize  (cost=233.70..206094.28 rows=9000 width=91) (actual time=0.471..6.673 rows=19983 loops=12)
         ->  Nested Loop  (cost=233.70..206049.28 rows=9000 width=91) (actual time=5.633..53.121 rows=19983 loops=1)
               ->  Index Scan using idxdim12 on dim1  (cost=0.29..343.29 rows=9 width=8) (actual time=0.039..4.359 rows=10 loops=1)
                     Filter: (b = 12)
                     Rows Removed by Filter: 9990
               ->  Bitmap Heap Scan on facts  (cost=233.41..22756.32 rows=9990 width=83) (actual time=1.113..4.146 rows=1998 loops=10)
                     Recheck Cond: (dim1 = dim1.a)
                     Heap Blocks: exact=19085
                     ->  Bitmap Index Scan on idxfacts  (cost=0.00..230.92 rows=9990 width=0) (actual time=0.602..0.602 rows=1998 loops=10)
                           Index Cond: (dim1 = dim1.a)
 Planning time: 1.896 ms
 Execution time: 132.588 ms

What I used to do was to set join_collapse_limit to 1 and rewrite the query like this:

EXPLAIN ANALYZE SELECT * FROM dim1 cross join dim2 join facts on (facts.dim1=dim1.a and facts.dim2=dim2.a) where dim1.b=12 AND dim2.b=17;

                                   QUERY PLAN
 Nested Loop  (cost=13.25..3661.66 rows=8 width=99) (actual time=0.758..0.758 rows=0 loops=1)
   ->  Nested Loop  (cost=8.71..57.82 rows=81 width=16) (actual time=0.065..0.151 rows=120 loops=1)
         ->  Bitmap Heap Scan on dim1  (cost=4.35..28.39 rows=9 width=8) (actual time=0.040..0.057 rows=10 loops=1)
               Recheck Cond: (b = 12)
               Heap Blocks: exact=10
               ->  Bitmap Index Scan on idxdim11  (cost=0.00..4.35 rows=9 width=0) (actual time=0.033..0.033 rows=10 loops=1)
                     Index Cond: (b = 12)
         ->  Materialize  (cost=4.35..28.44 rows=9 width=8) (actual time=0.002..0.006 rows=12 loops=10)
               ->  Bitmap Heap Scan on dim2  (cost=4.35..28.39 rows=9 width=8) (actual time=0.021..0.040 rows=12 loops=1)
                     Recheck Cond: (b = 17)
                     Heap Blocks: exact=11
                     ->  Bitmap Index Scan on idxdim21  (cost=0.00..4.35 rows=9 width=0) (actual time=0.017..0.017 rows=12 loops=1)
                           Index Cond: (b = 17)
   ->  Bitmap Heap Scan on facts  (cost=4.54..44.39 rows=10 width=83) (actual time=0.004..0.004 rows=0 loops=120)
         Recheck Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
         ->  Bitmap Index Scan on idxfacts  (cost=0.00..4.53 rows=10 width=0) (actual time=0.004..0.004 rows=0 loops=120)
               Index Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
 Planning time: 0.520 ms
 Execution time: 0.812 ms

The cost is 100 times lower. So this plan seems to be a good candidate. Or at least it keeps my users happy.

That rewriting worked for me, but today, I'm in a context where I cannot rewrite the query... it's generated.
So I took a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except when it has no choice. A funny side note before continuing: having dim1.b = dim2.b gives the right plan
Re: [HACKERS] Performance improvement for joins where outer side is unique
On 26 February 2015 at 08:39, Tomas Vondra tomas.von...@2ndquadrant.com wrote:

 I've been looking at this patch, mostly because it seems like a great
 starting point for improving estimation for joins on multi-column FKs.
 Currently we do this:

 CREATE TABLE parent (a INT, b INT, PRIMARY KEY (a,b));
 CREATE TABLE child (a INT, b INT, FOREIGN KEY (a,b) REFERENCES parent (a,b));

 INSERT INTO parent SELECT i, i FROM generate_series(1,1000000) s(i);
 INSERT INTO child SELECT i, i FROM generate_series(1,1000000) s(i);

 ANALYZE;
 EXPLAIN SELECT * FROM parent JOIN child USING (a,b);

                                QUERY PLAN
 -----------------------------------------------------------------------
  Hash Join  (cost=2.00..66978.01 rows=1 width=8)
    Hash Cond: ((parent.a = child.a) AND (parent.b = child.b))
    ->  Seq Scan on parent  (cost=0.00..14425.00 rows=1000000 width=8)
    ->  Hash  (cost=14425.00..14425.00 rows=1000000 width=8)
          ->  Seq Scan on child  (cost=0.00..14425.00 rows=1000000 width=8)
 (5 rows)

 Which is of course nonsense, because we know it's a join on a FK, so the
 join will produce 1M rows (just like the child table). This seems like a
 rather natural extension of what you're doing in this patch, except that
 it only affects the optimizer and not the executor. Do you have any
 plans in this direction? If not, I'll pick this up as I do have that on
 my TODO.

Hi Tomas,

I guess similar analysis could be done on FKs as I'm doing on unique indexes. Perhaps my patch for inner join removal can help you more with that. You may notice that in this patch I have ended up changing the left join removal code so that it just checks if has_unique_join is set for the special join. Likely something similar could be done with your idea and the inner join removals, just by adding some sort of flag on RelOptInfo to say join_row_exists or some better name. Quite likely, if there are any pending foreign key triggers, then it won't matter at all for the sake of row estimates.

Regards

David Rowley
Re: [HACKERS] How about to have relnamespace and relrole?
 The attached are the fourth version of this patch.
 0001-Add-regrole_v4.patch
 0002-Add-regnamespace_v4.patch

It seems you missed attaching both patches.

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized
On Fri, Feb 27, 2015 at 6:49 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-27 16:26:08 +0900, Michael Paquier wrote: On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote: On 02/20/2015 05:21 PM, Andres Freund wrote: There's one bit that I'm not so sure about though: To avoid duplication I've added Parse(Commit/Abort)Record(), but unfortunately that has to be available both in front and backend code - so it's currently living in xactdesc.c. I think we can live with that, but it's certainly not pretty. Yeah, that's ugly. Why does frontend code need that? The old format isn't exactly trivial for frontend code to decode either. pg_xlogdump outputs subxacts and such; I don't forsee other usages. Sure, we could copy the code around, but I think that's worse than having it in xactdesc.c. Needs a comment explaining why it's there if I haven't added one already. FWIW, I think they would live better in frontend code for client applications. What do you mean with that? You mean you'd rather see a copy in pg_xlogdump somewhere? How would you trigger that being used instead of the normal description routine? No, no. I meant that it is good the way your patch does it in xactdesc.c, where both frontend and backend can reach it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Docs about shared memory
Hi!

Where can I find an explanation of how PostgreSQL works with shared memory?

---
Best regards,
Vadim Gribanov

Linkedin: https://www.linkedin.com/in/yoihito
Skype: v1mk550
Github: https://github.com/yoihito