Re: [HACKERS] inherit support for foreign tables
On 2015/03/23 2:57, Tom Lane wrote:
> Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
>> [ fdw-inh-8.patch ]
>
> I've committed this with some substantial rearrangements, notably:
>
> * As I mentioned earlier, I got rid of a few unnecessary restrictions on foreign tables so as to avoid introducing warts into inheritance behavior. In particular, we now allow NOT VALID CHECK constraints (and hence ALTER ... VALIDATE CONSTRAINT), ALTER SET STORAGE, and ALTER SET WITH/WITHOUT OIDS. These are probably no-ops anyway for foreign tables, though conceivably an FDW might choose to implement some behavior for STORAGE or OIDs.

I agree with you on this point. However, ISTM there is a bug in the handling of OIDs on foreign tables: while we now allow ALTER SET WITH/WITHOUT OIDS, we still don't honor the default_with_oids parameter for foreign tables. Since CREATE FOREIGN TABLE should be consistent with ALTER FOREIGN TABLE, I think we should also honor the parameter for foreign tables. Attached is a patch for that.

Best regards,
Etsuro Fujita

*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 580,586 **** DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
  	descriptor = BuildDescForRelation(schema);

  	localHasOids = interpretOidsOption(stmt->options,
! 									   (relkind == RELKIND_RELATION));
  	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);

  	/*
--- 580,587 ----
  	descriptor = BuildDescForRelation(schema);

  	localHasOids = interpretOidsOption(stmt->options,
! 									   (relkind == RELKIND_RELATION ||
! 										relkind == RELKIND_FOREIGN_TABLE));
  	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);

  	/*

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
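The patched condition amounts to a small predicate over the relation kind. A minimal standalone sketch (the helper name relkind_allows_oids is invented for illustration; in the real patch the check is inlined in DefineRelation's call to interpretOidsOption):

```c
#include <stdbool.h>

/* Relkind codes as stored in pg_class (single-character constants). */
#define RELKIND_RELATION      'r'
#define RELKIND_FOREIGN_TABLE 'f'

/*
 * After the patch, the OIDs option (and hence default_with_oids) is
 * interpreted for plain relations and foreign tables alike; any other
 * relkind still does not get OIDs from this path.
 */
static bool
relkind_allows_oids(char relkind)
{
    return relkind == RELKIND_RELATION ||
           relkind == RELKIND_FOREIGN_TABLE;
}
```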
[HACKERS] Supporting src/test/modules in MSVC builds
Hi all,

As mentioned previously (cab7npqscphafxs2rzeb-fbccjqiknqxjhloztkggim1mf5x...@mail.gmail.com), attached are patches to add support for src/test/modules in MSVC builds; the tests of those modules have not been supported since they were moved out of contrib/:
- 0001 adds support for the build portion.
- 0002 adds support for the installation. I came to the conclusion that those modules should be installed by default: contribcheck relies on its tests being run against an already-running server, and IMO modulescheck should do the same.
- 0003 adds a new target, modulescheck, in vcregress.

Patches 0001 and 0003 are based on some previous work from Alvaro, which I modified slightly after testing. Note that this split is done so that each step can be tested on the buildfarm for safety.

Regards,
--
Michael

From 1b58eba30d0347a7718a0db18ac6ae2c02888aaf Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 15 Apr 2015 22:14:33 -0700
Subject: [PATCH 1/3] Include modules of src/test/modules in build

commit_ts, being a module used only for test purposes, is simply ignored in the process.
---
 src/tools/msvc/Mkvcbuild.pm | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index e4dbebf..986f3b3 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -28,7 +28,7 @@ my $libpgcommon;
 my $postgres;
 my $libpq;
 
-# Set of variables for contrib modules
+# Set of variables for modules in contrib/ and src/test/modules/
 my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
 my @contrib_uselibpq = ('dblink', 'oid2name', 'postgres_fdw', 'vacuumlo');
@@ -50,7 +50,7 @@ my $contrib_extraincludes =
 my $contrib_extrasource = {
 	'cube' => [ 'contrib\cube\cubescan.l', 'contrib\cube\cubeparse.y' ],
 	'seg'  => [ 'contrib\seg\segscan.l', 'contrib\seg\segparse.y' ], };
-my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
+my @contrib_excludes = ('pgcrypto', 'commit_ts', 'intagg', 'sepgsql');
 
 # Set of variables for frontend modules
 my $frontend_defines = { 'initdb' => 'FRONTEND' };
@@ -564,15 +564,18 @@ sub mkvcbuild
 	my $mf = Project::read_file('contrib/pgcrypto/Makefile');
 	GenerateContribSqlFiles('pgcrypto', $mf);
 
-	opendir($D, 'contrib') || croak "Could not opendir on contrib!\n";
-	while (my $d = readdir($D))
+	foreach my $subdir ('contrib', 'src/test/modules')
 	{
-		next if ($d =~ /^\./);
-		next unless (-f "contrib/$d/Makefile");
-		next if (grep { /^$d$/ } @contrib_excludes);
-		AddContrib($d);
+		opendir($D, $subdir) || croak "Could not opendir on $subdir!\n";
+		while (my $d = readdir($D))
+		{
+			next if ($d =~ /^\./);
+			next unless (-f "$subdir/$d/Makefile");
+			next if (grep { /^$d$/ } @contrib_excludes);
+			AddContrib($subdir, $d);
+		}
+		closedir($D);
 	}
-	closedir($D);
 
 	$mf = Project::read_file('src\backend\utils\mb\conversion_procs\Makefile');
@@ -689,14 +692,15 @@ sub AddSimpleFrontend
 # Add a simple contrib project
 sub AddContrib
 {
+	my $subdir = shift;
 	my $n  = shift;
-	my $mf = Project::read_file('contrib\\' . $n . '\Makefile');
+	my $mf = Project::read_file("$subdir/$n/Makefile");
 
 	if ($mf =~ /^MODULE_big\s*=\s*(.*)$/mg)
 	{
 		my $dn = $1;
 		my $proj =
-		  $solution->AddProject($dn, 'dll', 'contrib', 'contrib\\' . $n);
+		  $solution->AddProject($dn, 'dll', 'contrib', "$subdir/$n");
 		$proj->AddReference($postgres);
 		AdjustContribProj($proj);
 	}
@@ -705,8 +709,9 @@ sub AddContrib
 		foreach my $mod (split /\s+/, $1)
 		{
 			my $proj =
-			  $solution->AddProject($mod, 'dll', 'contrib', 'contrib\\' . $n);
-			$proj->AddFile('contrib\\' . $n . '\\' . $mod . '.c');
+			  $solution->AddProject($mod, 'dll', 'contrib', "$subdir/$n");
+			my $filename = $mod . '.c';
+			$proj->AddFile($subdir . '\\' . $n . '\\' . $mod . '.c');
 			$proj->AddReference($postgres);
 			AdjustContribProj($proj);
 		}
@@ -714,7 +719,7 @@ sub AddContrib
 	elsif ($mf =~ /^PROGRAM\s*=\s*(.*)$/mg)
 	{
 		my $proj =
-		  $solution->AddProject($1, 'exe', 'contrib', 'contrib\\' . $n);
+		  $solution->AddProject($1, 'exe', 'contrib', "$subdir/$n");
 		AdjustContribProj($proj);
 	}
 	else
-- 
2.3.5

From e901543c2fa728646dca13a66979a6f0619dd87f Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 15 Apr 2015 23:13:14 -0700
Subject: [PATCH 2/3] Support installation of test modules in MSVC

Note that those modules are needed in the installation because, like contribcheck, modulescheck does not create a cluster from scratch on which to run the tests. It also seems good to include them to be able to perform those tests using a central installation path.
---
 src/tools/msvc/Install.pm | 151 +++++++++++++++++++++++-----------------------
 1 file changed, 81 insertions(+), 70 deletions(-)

diff --git a/src/tools/msvc/Install.pm b/src/tools/msvc/Install.pm
index 93a6724..c82743d 100644
--- a/src/tools/msvc/Install.pm
+++ b/src/tools/msvc/Install.pm
@@
Re: [HACKERS] inherit support for foreign tables
Hello,

At Thu, 16 Apr 2015 12:20:47 +0900, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote in 552f2a8f.2090...@lab.ntt.co.jp:
> On 2015/04/15 3:52, Alvaro Herrera wrote:
>> On 4/14/15 5:49 AM, Etsuro Fujita wrote:
>>> postgres=# create foreign table ft1 (c1 int) server myserver options (table_name 't1');
>>> CREATE FOREIGN TABLE
>>> postgres=# create foreign table ft2 (c1 int) server myserver options (table_name 't2');
>>> CREATE FOREIGN TABLE
>>> postgres=# alter foreign table ft2 inherit ft1;
>>> ALTER FOREIGN TABLE
>>> postgres=# select * from ft1 for update;
>>> ERROR: could not find junk tableoid1 column
>>>
>>> I think this is a bug. Attached is a patch fixing this issue.
>>
>> I think the real question is whether we're now (I mean after this patch) emitting useless tableoid columns that we didn't previously have. I think the answer is yes, and if so I think we need a smaller comb to fix the problem. This one seems rather large.
>
> My answer for that would be *no*, because I think tableoid would be needed for EvalPlanQual checking in more complex SELECT FOR UPDATE on the inheritance, or UPDATE/DELETE involving the inheritance as a source table. Also, if we allow the FDW to change the behavior of SELECT FOR UPDATE so as to match the local semantics exactly, which I'm working on in another thread, I think tableoid would also be needed for the actual row locking.

Given parent foreign tables, surely they need tableoids for such usage. The patch preserves the condition rc->isParent, so for now it newly affects exactly the parent foreign tables. Before parent foreign tables were introduced, ROW_MARK_COPY and RTE_RELATION were mutually exclusive, so such an entry didn't need — and could not have — a tableoid. But now we get an RTE that is both ROW_MARK_COPY and RTE_RELATION, and there seems to be no reason for parent tables of any kind not to have a tableoid. After this consideration, I came to think that the patch is a reasonable fix, not merely a workaround.

Thoughts?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
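The situation being discussed condenses into a small decision table. A toy sketch (the enum values and the helper are simplified stand-ins invented here, not the actual planner code):

```c
#include <stdbool.h>

/* Simplified stand-ins for the planner enums discussed above. */
typedef enum { ROW_MARK_EXCLUSIVE, ROW_MARK_SHARE, ROW_MARK_REFERENCE, ROW_MARK_COPY } RowMarkType;
typedef enum { RTE_RELATION, RTE_SUBQUERY, RTE_FUNCTION } RTEKind;

/*
 * Before inheritance support for foreign tables, ROW_MARK_COPY never
 * applied to an RTE_RELATION, so copy-marked rels never got a tableoid
 * junk column.  With foreign parents, a parent rel can be RTE_RELATION
 * *and* ROW_MARK_COPY, and EvalPlanQual still needs tableoid to tell
 * the child tables apart -- hence the rc->isParent condition.
 */
static bool
needs_tableoid_junk(RTEKind rtekind, RowMarkType markType, bool isParent)
{
    if (rtekind != RTE_RELATION)
        return false;           /* no tableoid outside real relations */
    if (markType != ROW_MARK_COPY)
        return true;            /* physical row marks always carry it */
    return isParent;            /* the new case the patch addresses */
}
```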
Re: [HACKERS] show xl_prev in xlog.c errcontext
On 04/15/2015 11:35 PM, Alvaro Herrera wrote:
> I found this patch in my local repo that I wrote some weeks or months ago while debugging some XLog corruption problem: it was difficult to pinpoint what XLog record in a long sequence of WAL files was causing a problem, and displaying the prev pointer in errcontext made finding it much easier -- I could correlate it with pg_xlogdump output, I think.

Seems like a good idea, but why print the prev pointer, and not the location of the record itself?

- Heikki
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/04/15 2:27, Jim Nasby wrote:
> On 4/14/15 1:05 AM, Kyotaro HORIGUCHI wrote:
>> As an example, the following operations cause an unexpected result.
>
> Those results are indeed surprising, but since we allow it in a direct connection I don't see why we wouldn't allow it in the Postgres FDW... As for the FDW not knowing about keys, why would it need to? If you try to do something illegal it's the remote side that should throw the error, not the FDW. Of course, if you try to do a locking operation on an FDW that doesn't support it, the FDW should throw an error... but that's different.

Ah, you are right. FOR NO KEY UPDATE and FOR KEY SHARE would be useful in the Postgres FDW if we assume the user performs those properly based on information about the keys of a remote table. Sorry, my explanation was not correct, but I want to make it clear that the proposed patch also allows the FDW to change the behavior of FOR NO KEY UPDATE and/or FOR KEY SHARE row locking so as to match the local semantics exactly.

BTW, I revised the docs a bit. Attached is an updated version of the patch.

Best regards,
Etsuro Fujita

*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
***************
*** 211,216 **** BeginForeignScan (ForeignScanState *node,
--- 211,232 ----
     </para>
  
     <para>
+     If the FDW changes the handling of <command>SELECT FOR UPDATE/SHARE</command>
+     row locking, this routine should perform any initialization needed prior
+     to the actual row locking if the foreign table is referenced in a
+     <command>SELECT FOR UPDATE/SHARE</command>.  To decide whether the foreign
+     table is referenced in the command or not, it is recommended to use
+     <function>ExecFindRowMark</function> with the missing_ok argument set to
+     true, which returns an <structname>ExecRowMark</structname> struct if the
+     foreign table is referenced in the command and otherwise returns NULL.
+     Information about the row locking is accessible through the
+     <structname>ExecRowMark</structname> struct.
+     (The <structfield>fdw_state</structfield> field of
+     <structname>ExecRowMark</structname> is available for the FDW to store
+     any private state it needs for the row locking.)
+    </para>
+ 
+    <para>
      Note that when <literal>(eflags &amp; EXEC_FLAG_EXPLAIN_ONLY)</literal> is
      true, this function should not perform any externally-visible actions; it
      should only do the minimum required to make the node state valid
***************
*** 598,603 **** IsForeignRelUpdatable (Relation rel);
--- 614,701 ----
   </sect2>
  
+  <sect2 id="fdw-callbacks-row-locking">
+   <title>FDW Routines For <command>SELECT FOR UPDATE/SHARE</command> row locking</title>
+ 
+   <para>
+    If an FDW supports <command>SELECT FOR UPDATE/SHARE</command> row locking,
+    it should provide the following callback functions:
+   </para>
+ 
+   <para>
+ <programlisting>
+ RowMarkType
+ GetForeignRowMarkType (LockClauseStrength strength);
+ </programlisting>
+ 
+    Report which row-marking option to support for a lock strength associated
+    with a <command>SELECT FOR UPDATE/SHARE</command> request.  This is called
+    at the beginning of query planning.  <literal>strength</literal> is a
+    member of the <literal>LockClauseStrength</literal> enum type.  The return
+    value should be a member of the <literal>RowMarkType</literal> enum type.
+    See <filename>src/include/nodes/lockoptions.h</filename> and
+    <filename>src/include/nodes/plannodes.h</filename> for information about
+    these enum types.
+   </para>
+ 
+   <para>
+    If the <function>GetForeignRowMarkType</function> pointer is set to
+    <literal>NULL</literal>, the default option is selected for any lock
+    strength, and neither <function>LockForeignRow</function> nor
+    <function>FetchForeignRow</function> described below will be called at
+    query execution time.
+   </para>
+ 
+   <para>
+ <programlisting>
+ bool
+ LockForeignRow (EState *estate,
+                 ExecRowMark *erm,
+                 ItemPointer tupleid);
+ </programlisting>
+ 
+    Lock one tuple in the foreign table.  <literal>estate</literal> is global
+    execution state for the query.  <literal>erm</literal> is the
+    <structname>ExecRowMark</structname> struct describing the target foreign
+    table.  <literal>tupleid</literal> identifies the tuple to be locked.
+    This function should return <literal>true</literal> if the FDW locks the
+    tuple successfully, and <literal>false</literal> otherwise.
+   </para>
+ 
+   <para>
+    If the <function>LockForeignRow</function> pointer is set to
+    <literal>NULL</literal>, attempts to lock rows will fail with an error
+    message.
+   </para>
+ 
+   <para>
+ <programlisting>
+ HeapTuple
+ FetchForeignRow (EState *estate,
+                  ExecRowMark *erm,
+                  ItemPointer tupleid);
+ </programlisting>
+ 
+    Fetch one tuple from the foreign table.  <literal>estate</literal> is
+    global execution state for the query.  <literal>erm</literal> is the
+    <structname>ExecRowMark</structname> struct describing the target foreign
+    table.  <literal>tupleid</literal> identifies the tuple to be fetched.
+    This function should return the fetched
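For a concrete feel of the GetForeignRowMarkType contract described above, here is a toy callback (the enum definitions are simplified stand-ins for the ones in lockoptions.h and plannodes.h, and the function name is an example FDW's, not an API symbol):

```c
/* Simplified stand-ins for the enums referenced in the docs above. */
typedef enum { LCS_NONE, LCS_FORKEYSHARE, LCS_FORSHARE, LCS_FORNOKEYUPDATE, LCS_FORUPDATE } LockClauseStrength;
typedef enum { ROW_MARK_EXCLUSIVE, ROW_MARK_NOKEYEXCLUSIVE, ROW_MARK_SHARE, ROW_MARK_KEYSHARE, ROW_MARK_REFERENCE, ROW_MARK_COPY } RowMarkType;

/*
 * Toy callback: an FDW that keeps the old behavior of copying whole
 * rows instead of locking them remotely would simply report
 * ROW_MARK_COPY for every strength; it then needs neither
 * LockForeignRow nor FetchForeignRow.
 */
static RowMarkType
ExampleGetForeignRowMarkType(LockClauseStrength strength)
{
    (void) strength;            /* same answer for all lock strengths */
    return ROW_MARK_COPY;
}
```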
Re: [HACKERS] show xl_prev in xlog.c errcontext
On Thu, Apr 16, 2015 at 3:25 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
> On 04/15/2015 11:35 PM, Alvaro Herrera wrote:
>> I found this patch in my local repo that I wrote some weeks or months ago while debugging some XLog corruption problem: it was difficult to pinpoint what XLog record in a long sequence of WAL files was causing a problem, and displaying the prev pointer in errcontext made finding it much easier -- I could correlate it with pg_xlogdump output, I think.
>
> Seems like a good idea, but why print the prev pointer, and not the location of the record itself?

And both?
--
Michael
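Printing both would be cheap. A sketch of the errcontext formatting (the function name is invented; the high/low %X/%X notation matches what pg_xlogdump prints, so either pointer could then be grepped for in its output):

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* 64-bit WAL location, as in PostgreSQL's XLogRecPtr. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical formatter showing both the record's own location and
 * its xl_prev pointer in the usual %X/%X notation.
 */
static void
format_xlog_context(char *buf, size_t len, XLogRecPtr lsn, XLogRecPtr prev)
{
    snprintf(buf, len, "WAL record at %X/%X, prev %X/%X",
             (unsigned int) (lsn >> 32), (unsigned int) lsn,
             (unsigned int) (prev >> 32), (unsigned int) prev);
}
```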
Re: [HACKERS] Fix broken Install.bat when target directory contains a space
Hi Michael,

I spent some time looking into the patch. Good catch — I am also surprised to see that the current Windows install script doesn't support spaces in the path. Please see my findings below.

*Without the patch*

1. C:\PG\postgresql\src\tools\msvc>install C:\PG\postgresql\inst with space without patch
   with was unexpected at this time.
2. C:\PG\postgresql\src\tools\msvc>install
   Invalid command line options.
   Usage: install.bat path
3. C:\PG\postgresql\src\tools\msvc>install /?
   Installing version 9.5 for release in /?
   Copying build output files...Could not copy release\postgres\postgres.exe to /?\bin\postgres.exe at Install.pm line 40
   Install::lcopy('release\postgres\postgres.exe', '/?\bin\postgres.exe') called at Install.pm line 324
   Install::CopySolutionOutput('release', '/?') called at Install.pm line 93
   Install::Install('/?', undef) called at install.pl line 13

*With the patch*

1. C:\PG\postgresql\src\tools\msvc>install C:\PG\postgresql\inst with space without patch
   Works fine.
2. C:\PG\postgresql\src\tools\msvc>install
   Usage: install.pl targetdir [installtype]
   installtype: client
3. C:\PG\postgresql\src\tools\msvc>install /?
   Invalid command line options.
   Usage: install.bat path

The following change looks confusing to me:

-if NOT %1== GOTO RUN_INSTALL
+if NOT [%1]==[/?] GOTO RUN_INSTALL

Along with fixing spaces in the installation path, it also changes the behavior of the install script — why not just if NOT [%1]==[] GOTO RUN_INSTALL? Checking for /? seems good for a help message, but it does not seem well handled in the script: it still says "Invalid command line options.", and the /? option is not handled by any other .bat build script either. Other than this, with the patch applied, is it acceptable that (2) shows the usage message as "Usage: install.pl targetdir [installtype]" while (3) shows it as "Usage: install.bat path"?

Thanks.
Regards,
Muhammad Asif Naeem

On Mon, Mar 2, 2015 at 9:27 AM, Michael Paquier michael.paqu...@gmail.com wrote:
> Hi all,
>
> When using install.bat with a path containing spaces, I got surprised by a couple of errors.
>
> 1) First with this path:
> $ install c:\Program Files\pgsql
> I am getting the following error:
> Files\pgsql== was unexpected at this time.
> This is caused by an incorrect evaluation of the first parameter in install.bat.
>
> 2) After correcting the first error, the path is truncated to c:\Program because the first argument value does not seem to be correctly parsed when used with install.pl.
>
> Attached is a patch fixing both problems. I imagine that it would be good to get that backpatched.
>
> Regards,
> --
> Michael
[HACKERS] Minor improvement to config.sgml
Hi,

Attached is a small patch to mark up "on" with <literal> in doc/src/sgml/config.sgml.

Best regards,
Etsuro Fujita

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..0d8624a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6752,7 +6752,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
         determines whether OIDs will be included in tables created by
         <command>SELECT INTO</command>.  The parameter is <literal>off</literal>
         by default; in <productname>PostgreSQL</productname> 8.0 and earlier, it
-        was on by default.
+        was <literal>on</literal> by default.
        </para>
 
        <para>
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On 2015-04-15 17:58:54 +0300, Heikki Linnakangas wrote:
> When the speculative insertion is finished, write a new kind of a WAL record for that. The record only needs to contain the ctid of the tuple. Replaying that record will clear the flag on the heap tuple that said that it was a speculative insertion.
>
> In logical decoding, decode speculative insertions like any other insertion. To decode a super-deletion record, scan the reorder buffer for the transaction to find the corresponding speculative insertion record for the tuple, and remove it.
>
> BTW, that'd work just as well without the new WAL record to finish a speculative insertion. Am I missing something?

I'm, completely independently of logical decoding, of the *VERY* strong opinion that 'speculative insertions' should never be visible when looking with normal snapshots. For one, it allows simplifying considerations around wraparound (which has proven to be a very good idea, c.f. multixacts + vacuum causing data corruption years after it was thought to be harmless). For another, it allows reclaiming/redefining the bit after a database restart/upgrade. Given how complex this is and how scarce flags are, that seems like a really good property. And keeping those flags from being visible to the outside requires a WAL record; otherwise it won't be correct on the standby.

Andres
Re: [HACKERS] PATCH: default_index_tablespace
On 15 Apr 2015 19:12, Tom Lane t...@sss.pgh.pa.us wrote:
> I'm afraid this idea is a nonstarter, because it will break existing applications, and in particular existing pg_dump output files, which expect to be able to determine an index's tablespace by setting default_tablespace. (It is *not* adequate that the code falls back to default_tablespace if the new GUC is unset; if it is set, you've still broken pg_dump.) The incremental value, if indeed there is any, of being able to control index positioning this way seems unlikely to justify a backwards-compatibility break of such magnitude.

Just brainstorming here, but that just means default_tablespace needs to take precedence. We could have a default_table_tablespace and default_index_tablespace which default_tablespace overrides. Or we could allow a mini config language in default_tablespace like table=space1,index=space2.
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:
> It all depends upon who is being selfish. Why is a user selfish for not wanting to clean every single block they scan, when the people that made the mess do nothing and go faster 10 minutes from now? Randomly and massively penalising large SELECTs makes no sense. Some cleanup is OK, with reasonable limits, which is why that is proposed.

I don't think it's productive to think of a query as a different actor with only an interest in its own performance and no interest in overall system performance. From a holistic point of view the question is how many times a given HOT chain is going to need to be followed before it's pruned. Or to put it another way, how expensive is creating a HOT chain? Does it cause a single prune? A fixed number of chain readers followed by a prune? Does the amount of work depend on the workload, or is it consistent?

My intuition is that a fixed cutoff like five pages is dangerous, because if you update many pages there's no limit to the number of times they'll be read before they're all pruned. The steady state could easily be that every query is having to read HOT chains forever. My intuition, again, is that what we need is a percentage, such as "do 10 prunes then ignore the next 1000 clean pages with HOT chains". That guarantees that after 100 selects the HOT chains will all be pruned, but each select will only prune 1% of the clean pages it sees.
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
> Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding thread now, and I have to say that IMHO it's a lot more sane to handle this in ReorderBufferCommit() like Peter first did, than to make the main insertion path more complex like this.

I don't like Peter's way much. For one, it's just broken. For another, it's quite annoying to trigger disk reads to figure out what the actual type of a record is.

If we go that way, here is how I think it should be done: whenever we encounter a speculative record, we 'unlink' it from the changes that will be reused for spooling from disk and do nothing further. Then we just continue reading through the records. If the next thing we encounter is a super-deletion, we throw away that record. If it's another type of change (or, even better, a 'speculative insertion succeeded' record), insert it. That'll still require some ugliness, but it should not be too bad. I earlier thought that'd not be OK because there could be a new catalog snapshot in between, but I was mistaken: the locking on the source transaction prevents problems.

> Another idea is to never spill the latest record to disk, at least if it was a speculative insertion. Then you would be sure that when you see the super-deletion record, the speculative insertion it refers to is still in memory. That seems simple.

It's not guaranteed to be the last record; there can be records in between (snapshots from other backends at the very least).

Greetings,

Andres Freund
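The rule sketched above can be modeled as a tiny state machine over the change stream. A toy model (all names invented; the real code operates on ReorderBufferChange entries, not an enum array):

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented change kinds mirroring the cases discussed above. */
typedef enum { CHANGE_INSERT, CHANGE_SPEC_INSERT, CHANGE_SPEC_DELETE, CHANGE_SPEC_CONFIRM } ChangeKind;

/*
 * A speculative insertion is held back rather than emitted; a following
 * super-deletion throws it away, while a 'speculative insertion
 * succeeded' record releases it.  Returns how many rows are actually
 * emitted downstream.
 */
static int
replay_changes(const ChangeKind *changes, size_t n)
{
    int emitted = 0;
    bool pending_spec = false;   /* a held-back speculative insertion */

    for (size_t i = 0; i < n; i++)
    {
        switch (changes[i])
        {
            case CHANGE_SPEC_INSERT:
                pending_spec = true;   /* hold back, don't emit yet */
                break;
            case CHANGE_SPEC_DELETE:
                pending_spec = false;  /* insertion lost the race */
                break;
            case CHANGE_SPEC_CONFIRM:
                if (pending_spec)
                {
                    emitted++;         /* insertion succeeded */
                    pending_spec = false;
                }
                break;
            case CHANGE_INSERT:
                emitted++;             /* ordinary insertion */
                break;
        }
    }
    return emitted;
}
```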
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:
> On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:
>> It all depends upon who is being selfish. Why is a user selfish for not wanting to clean every single block they scan, when the people that made the mess do nothing and go faster 10 minutes from now? Randomly and massively penalising large SELECTs makes no sense. Some cleanup is OK, with reasonable limits, which is why that is proposed.
>
> I don't think it's productive to think of a query as a different actor with only an interest in its own performance and no interest in overall system performance. From a holistic point of view the question is how many times a given HOT chain is going to need to be followed before it's pruned. Or to put it another way, how expensive is creating a HOT chain? Does it cause a single prune? A fixed number of chain readers followed by a prune? Does the amount of work depend on the workload or is it consistent?

IMO the size or traversal of the HOT chain is not that expensive compared to the cost of pruning too frequently, which generates WAL as well as makes buffers dirty. OTOH the cost of pruning less frequently could also be very high. It can cause severe table bloat which may just stay for a very long time. Even if dead space is recovered within a page, truncating a bloated heap is not always possible. In such cases, even SELECTs would be slowed down just because they need to read/scan far more pages than they otherwise would have. IOW it's probably wrong to assume that not pruning quickly enough will have an impact only on the non-SELECT queries. I also concur with the arguments upthread that this change needs to be carefully calibrated, because it can lead to significant degradation for certain workloads.

> My intuition, again, is that what we need is a percentage such as "do 10 prunes then ignore the next 1000 clean pages with HOT chains". That guarantees that after 100 selects the HOT chains will all be pruned but each select will only prune 1% of the clean pages it sees.

I think some such proposal was made in the past. There could be a knob to control how much HOT cleanup a read-only query (or maybe a read-only transaction) should do, say as a percentage of the pages it looks at. The default can be left at 100% in the first release so that the current behaviour is not suddenly disrupted. But it will allow others to play with the percentages, and then based on field reports we can change the defaults in the next releases.

Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
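The "burst then skip" throttle being discussed can be sketched as a small per-scan budget (constants and names are illustrative only, not a proposed patch):

```c
#include <stdbool.h>

/*
 * Toy version of the "do N prunes, then ignore M prunable pages"
 * heuristic: prune in short bursts, then skip a long run of prunable
 * pages, so each scan only prunes a small fraction of what it sees.
 */
typedef struct
{
    int prunes_left;   /* prunes remaining in the current burst */
    int skips_left;    /* prunable pages to ignore before the next burst */
} PruneBudget;

#define PRUNE_BURST 10
#define PRUNE_SKIP  1000

/* Called once per prunable page seen; returns whether to prune it. */
static bool
should_prune(PruneBudget *b)
{
    if (b->prunes_left > 0)
    {
        b->prunes_left--;
        if (b->prunes_left == 0)
            b->skips_left = PRUNE_SKIP;    /* burst done, start skipping */
        return true;
    }
    if (--b->skips_left <= 0)
        b->prunes_left = PRUNE_BURST;      /* next call starts a new burst */
    return false;
}
```

With these constants a scan prunes 10 pages out of every 1010 prunable pages it visits, roughly the 1% rate suggested above.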
Re: [HACKERS] FPW compression leaks information
On Wed, Apr 15, 2015 at 9:42 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Wed, Apr 15, 2015 at 9:20 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>> On Wed, Apr 15, 2015 at 2:22 PM, Fujii Masao wrote:
>>> On Wed, Apr 15, 2015 at 11:55 AM, Michael Paquier wrote:
>>>> 1) Doc patch to mention that it is possible that compression can give hints to attackers when working on sensitive fields that have a non-fixed size.
>>> I think that this patch is enough as the first step.
>> I'll get something done for that at least; a big warning below the description of wal_compression would do it.

So here is a patch for this purpose, with the following text being used:

+       <warning>
+        <para>
+         When enabling <varname>wal_compression</varname>, there is a risk
+         to leak data similarly to the BREACH and CRIME attacks on SSL where
+         the compression ratio of a full page image gives a hint of what is
+         the existing data of this page.  Tables that contain sensitive
+         information like <structname>pg_authid</structname> with password
+         data could be potential targets to such attacks.  Note that as a
+         prerequisite a user needs to be able to insert data on the same page
+         as the data targeted and needs to be able to detect checkpoint
+         presence to find out if a compressed full page write is included in
+         WAL to calculate the compression ratio of a page using WAL positions
+         before and after inserting data on the page with data targeted.
+        </para>
+       </warning>

Comments and reformulations are welcome.

Regards,
--
Michael

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..2f61e29 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2303,6 +2303,22 @@ include_dir 'conf.d'
         but at the cost of some extra CPU spent on the compression during
         WAL logging and on the decompression during WAL replay.
        </para>
+
+       <warning>
+        <para>
+         When enabling <varname>wal_compression</varname>, there is a risk
+         to leak data similarly to the BREACH and CRIME attacks on SSL where
+         the compression ratio of a full page image gives a hint of what is
+         the existing data of this page.  Tables that contain sensitive
+         information like <structname>pg_authid</structname> with password
+         data could be potential targets to such attacks.  Note that as a
+         prerequisite a user needs to be able to insert data on the same page
+         as the data targeted and needs to be able to detect checkpoint
+         presence to find out if a compressed full page write is included in
+         WAL to calculate the compression ratio of a page using WAL positions
+         before and after inserting data on the page with data targeted.
+        </para>
+       </warning>
      </listitem>
     </varlistentry>
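The attack described in the warning is easiest to see with a toy compressor: a guess that matches data already on the page compresses better, and the size difference is the leak. This is not pglz, just a minimal LZ-style size estimate invented to illustrate the side channel:

```c
#include <stddef.h>

/*
 * Toy LZ-style size estimate: a run that repeats an earlier substring
 * of at least 4 bytes is charged a fixed 2-byte back-reference, every
 * other byte is a literal.  Real compressors differ in detail, but
 * share the property exploited by BREACH/CRIME: matches shrink output.
 */
static size_t
toy_compressed_size(const unsigned char *buf, size_t len)
{
    size_t out = 0, i = 0;

    while (i < len)
    {
        size_t best = 0;

        /* find the longest match starting at any earlier position */
        for (size_t j = 0; j + 4 <= i; j++)
        {
            size_t k = 0;
            while (i + k < len && buf[j + k] == buf[i + k])
                k++;
            if (k > best)
                best = k;
        }
        if (best >= 4)
        {
            out += 2;           /* back-reference to earlier data */
            i += best;
        }
        else
        {
            out += 1;           /* literal byte */
            i++;
        }
    }
    return out;
}
```

A page holding "password=hunter2;" followed by the attacker's correct guess compresses smaller than the same page followed by a wrong guess, which is exactly the signal the warning says a WAL-position-observing attacker could measure.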
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2015/03/05 21:08, Etsuro Fujita wrote:
> Here is an updated version. The EXPLAIN output has also been improved as discussed in [1].

I noticed that the EXPLAIN for a pushed-down update (delete) on inheritance children displays "Foreign Update" ("Foreign Delete") twice, once for the ForeignScan and once for the ModifyTable. Here is an example:

postgres=# explain verbose update parent set c1 = c1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
   Update on public.parent
   Foreign Update on public.ft1
   Foreign Update on public.ft2
   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
         Output: parent.c1, parent.ctid
   ->  Foreign Update on public.ft1  (cost=100.00..182.27 rows=2409 width=10)
         Remote SQL: UPDATE public.t1 SET c1 = c1
   ->  Foreign Update on public.ft2  (cost=100.00..182.27 rows=2409 width=10)
         Remote SQL: UPDATE public.t2 SET c1 = c1
(10 rows)

Should we do something? Suggestions are welcome.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/31942.1410534...@sss.pgh.pa.us
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 2015/04/16 19:57, Amit Langote wrote: On 16-04-2015 PM 07:50, Etsuro Fujita wrote: The EXPLAIN output has also been improved as discussed in [1]. I noticed that the EXPLAIN for a pushed-down update (delete) on inheritance childs doubly displays Foreign Update (Foreign Delete), one for ForeignScan and the other for ModifyTable. Here is an example: postgres=# explain verbose update parent set c1 = c1; QUERY PLAN -- Update on public.parent (cost=0.00..364.54 rows=4819 width=10) Update on public.parent Foreign Update on public.ft1 Foreign Update on public.ft2 - Seq Scan on public.parent (cost=0.00..0.00 rows=1 width=10) Output: parent.c1, parent.ctid - Foreign Update on public.ft1 (cost=100.00..182.27 rows=2409 width=10) Remote SQL: UPDATE public.t1 SET c1 = c1 - Foreign Update on public.ft2 (cost=100.00..182.27 rows=2409 width=10) Remote SQL: UPDATE public.t2 SET c1 = c1 (10 rows) Should we do something? Suggestions are welcome. From what I see in Tom's commit message[0] for FTI patch, this shouldn't be, right? To be specific, there should be Foreign Scan there as per the commit. Am I missing something? As shown in the below example, this patch doesn't change the EXPLAIN output for non-pushed-down update (delete) cases, but since we changed the EXPLAIN output as discussed in [1], the patch doubly displays Foreign Update (Foreign Delete) for pushed-down update (delet) cases like the above example. 
postgres=# explain verbose update parent set c1 = trunc(random() * 9 + 1)::int;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Update on public.parent  (cost=0.00..452.06 rows=5461 width=6)
   Update on public.parent
   Foreign Update on public.ft1
     Remote SQL: UPDATE public.t1 SET c1 = $2 WHERE ctid = $1
   Foreign Update on public.ft2
     Remote SQL: UPDATE public.t2 SET c1 = $2 WHERE ctid = $1
   ->  Seq Scan on public.parent  (cost=0.00..0.01 rows=1 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, parent.ctid
   ->  Foreign Scan on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft1.ctid
         Remote SQL: SELECT ctid FROM public.t1 FOR UPDATE
   ->  Foreign Scan on public.ft2  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft2.ctid
         Remote SQL: SELECT ctid FROM public.t2 FOR UPDATE
(14 rows)

Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
On 16-04-2015 PM 07:50, Etsuro Fujita wrote: The EXPLAIN output has also been improved as discussed in [1]. I noticed that the EXPLAIN for a pushed-down update (delete) on inheritance children doubly displays Foreign Update (Foreign Delete), one for the ForeignScan and the other for the ModifyTable. Here is an example:

postgres=# explain verbose update parent set c1 = c1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
   Update on public.parent
   Foreign Update on public.ft1
   Foreign Update on public.ft2
   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
         Output: parent.c1, parent.ctid
   ->  Foreign Update on public.ft1  (cost=100.00..182.27 rows=2409 width=10)
         Remote SQL: UPDATE public.t1 SET c1 = c1
   ->  Foreign Update on public.ft2  (cost=100.00..182.27 rows=2409 width=10)
         Remote SQL: UPDATE public.t2 SET c1 = c1
(10 rows)

Should we do something? Suggestions are welcome. From what I see in Tom's commit message[0] for the FTI patch, this shouldn't be, right? To be specific, there should be a Foreign Scan there as per the commit. Am I missing something? Thanks, Amit [0] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=cb1ca4d800621dcae67ca6c799006de99fa4f0a5 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming replication and WAL archive interactions
On 03/01/2015 12:36 AM, Venkata Balaji N wrote: Patch did get applied successfully to the latest master. Can you please rebase. Here you go. On 01/31/2015 03:07 PM, Andres Freund wrote: On 2014-12-19 22:56:40 +0200, Heikki Linnakangas wrote: This add two new archive_modes, 'shared' and 'always', to indicate whether the WAL archive is shared between the primary and standby, or not. In shared mode, the standby tracks which files have been archived by the primary. The standby refrains from recycling files that the primary has not yet archived, and at failover, the standby archives all those files too from the old timeline. In 'always' mode, the standby's WAL archive is taken to be separate from the primary's, and the standby independently archives all files it receives from the primary. I don't really like this approach. Sharing a archive is rather dangerous in my experience - if your old master comes up again (and writes in the last wal file) or similar, you can get into really bad situations. It doesn't have to actually be shared. The master and standby could archive to different locations, but the responsibility of archiving is shared, so that on promotion, the standby ensures that every WAL file gets archived. If the master didn't do it, then the standby will. Yes, if the master comes up again, it might try to archive a file that the standby already archived. But that's not so bad. Both copies of the file will be identical. You could put logic in archive_command to check, if the file already exists in the archive, whether the contents are identical, and return success without doing anything if they are. Oh, hang on, that's not necessarily true. On promotion, the standby archives the last, partial WAL segment from the old timeline. That's just wrong (http://www.postgresql.org/message-id/52fcd37c.3070...@vmware.com), and in fact I somehow thought I changed that already, but apparently not. So let's stop doing that. 
What I was thinking about was instead trying to detect the point up to which files were safely archived by running restore command to check for the presence of archived files. Then archive anything that has valid content and isn't yet archived. That doesn't sound particularly complicated to me. Hmm. That assumes that the standby has a valid restore_command, and can access the WAL archive. Not a too unreasonable requirement I guess, but with the scheme I proposed, it's not necessary. Seems a bit silly to copy a whole segment from the archive just to check if it exists, though. - Heikki From db5c4311baf4e3a2ae3308c4d0d9975ee3692a18 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas heikki.linnakangas@iki.fi Date: Thu, 16 Apr 2015 14:40:24 +0300 Subject: [PATCH v2 1/1] Make WAL archival behave more sensibly in standby mode. This adds two new archive_modes, 'shared' and 'always', to indicate whether the WAL archive is shared between the primary and standby, or not. In shared mode, the standby tracks which files have been archived by the primary. The standby refrains from recycling files that the primary has not yet archived, and at failover, the standby archives all those files too from the old timeline. In 'always' mode, the standby's WAL archive is taken to be separate from the primary's, and the standby independently archives all files it receives from the primary. Fujii Masao and me. 
---
 doc/src/sgml/config.sgml                      |  12 +-
 doc/src/sgml/high-availability.sgml           |  48 +++
 doc/src/sgml/protocol.sgml                    |  31 +
 src/backend/access/transam/xlog.c             |  29 -
 src/backend/postmaster/postmaster.c           |  37 --
 src/backend/replication/walreceiver.c         | 172 --
 src/backend/replication/walsender.c           |  47 +++
 src/backend/utils/misc/guc.c                  |  21 ++-
 src/backend/utils/misc/postgresql.conf.sample |   2 +-
 src/include/access/xlog.h                     |  14 ++-
 10 files changed, 351 insertions(+), 62 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..e352b8e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2521,7 +2521,7 @@ include_dir 'conf.d'
     <variablelist>
     <varlistentry id="guc-archive-mode" xreflabel="archive_mode">
-     <term><varname>archive_mode</varname> (<type>boolean</type>)
+     <term><varname>archive_mode</varname> (<type>enum</type>)
      <indexterm>
       <primary><varname>archive_mode</> configuration parameter</primary>
      </indexterm>
@@ -2530,7 +2530,15 @@ include_dir 'conf.d'
       <para>
        When <varname>archive_mode</> is enabled, completed WAL segments
        are sent to archive storage by setting
-       <xref linkend="guc-archive-command">.
+       <xref linkend="guc-archive-command">. In addition to <literal>off</>,
+       to disable, there are three modes: <literal>on</>, <literal>shared</>,
+       and <literal>always</>. During normal operation, there is no
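Heikki's suggestion above — have archive_command succeed when the target file already exists in the archive with identical contents, and fail only on a genuine conflict — could be sketched like this (a hypothetical helper; names and layout are illustrative, not from the patch):

```python
# Hypothetical archive_command helper: idempotent WAL segment archiving.
# Returns 0 (success) if the segment is new or already archived with the
# same bytes, 1 if a *different* file already occupies that name.
import filecmp
import shutil
from pathlib import Path


def archive(wal_path: str, archive_dir: str) -> int:
    src = Path(wal_path)
    dst = Path(archive_dir) / src.name
    if dst.exists():
        # Primary and standby may both archive the same segment; identical
        # contents means the work is already done -- report success.
        return 0 if filecmp.cmp(src, dst, shallow=False) else 1
    tmp = dst.parent / (dst.name + ".tmp")
    shutil.copyfile(src, tmp)   # copy to a temp name first,
    tmp.replace(dst)            # then rename so readers never see a partial file
    return 0
```

Wired up as e.g. `archive_command = 'python archive.py %p /mnt/archive'` (path invented), this makes re-archiving by the old primary harmless, which is the property the shared-archive mode relies on.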
Re: [HACKERS] Turning off HOT/Cleanup sometimes
Pavan Deolasee wrote: On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote: From a holistic point of view the question is how many times a given HOT chain is going to need to be followed before it's pruned. Or to put it another way, how expensive is creating a HOT chain? Does it cause a single prune? A fixed number of chain readers followed by a prune? Does the amount of work depend on the workload or is it consistent? IMO the size or traversal of the HOT chain is not that expensive compared to the cost of pruning too frequently, which generates WAL as well as dirties buffers. OTOH the cost of less frequent pruning could also be very high. It can cause severe table bloat which may just stay for a very long time. Even if dead space is recovered within a page, truncating a bloated heap is not always possible. I think you're failing to consider that in the patch there is a distinction between read-only page accesses and page updates. During a page update, HOT cleanup is always done even with the patch, so there won't be any additional bloat that would not be there without the patch. It's only the read-only page accesses that skip the HOT pruning. Of course, as Greg says, there will be some additional scans of the HOT chain by read-only processes. -- Álvaro Herrera                http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: default_index_tablespace
On 4/15/15 11:33 PM, Amit Kapila wrote: On Thu, Apr 16, 2015 at 8:01 AM, Bruce Momjian br...@momjian.us wrote: On Wed, Apr 15, 2015 at 07:12:11PM -0400, Tom Lane wrote: jltal...@adv-solutions.net writes: This small patch implements a new GUC (default_index_tablespace) plus supporting code. Originated from a customer request, the feature intends to make creation of indexes on SSD-backed tablespaces easy and convenient (almost transparent) for users: the DBA can just set it, and indexes will be placed in the specified tablespace --as opposed to the same tablespace where the referenced table is-- without having to specify it every time. Another way to provide a different default tablespace for indexes could be to provide it at the database level: have a new option INDEX_TABLESPACE in the CREATE DATABASE command which can be used for indexes when no tablespace is specified in CREATE INDEX. This would also need changes in pg_dump (like while dumping info about the database), but on an initial look, it seems it can be done without many changes. That's the same idea that Stephen and I have discussed in the past. Something like: CREATE DATABASE name SET TABLESPACE table_volume SET INDEX TABLESPACE index_volume; This has some real usability advantages. In the past I've written code to move tables to where they need to be once the db update is complete. The tables tend to be small or empty so this is not usually a big deal - but sometimes it is. Trying to get a tablespace clause on every index in the build scripts is a real PITA. -- - David Steele da...@pgmasters.net
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote: I think you're failing to consider that in the patch there is a distinction between read-only page accesses and page updates. During a page update, HOT cleanup is always done even with the patch, so there won't be any additional bloat that would not be there without the patch. That's not really true (and my benchmark upthread proves it). The fact that hot pruning only happens when we can get a cleanup lock means that we can end up with more pages that are full, if we prune on select less often. Especially if SELECTs are more frequent than write accesses - pretty darn common - the likelihood of SELECTs getting the lock is correspondingly higher. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
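Andres's point — that under a read-mostly workload most pruning opportunities belong to SELECTs, so skipping prune-on-select leaves chains and dead line pointers in place much longer — can be illustrated with a toy model (pure Python; all parameters are invented and this is not PostgreSQL's actual algorithm):

```python
# Toy model: pages accumulate dead HOT-chain members; UPDATEs always prune
# their target page, SELECTs prune only when prune_on_select is True.
import random


def simulate(prune_on_select, n_ops=200_000, read_fraction=0.95,
             n_pages=100, seed=42):
    """Return (chain hops done by readers, dead chain members left over)."""
    random.seed(seed)                 # same operation sequence for both runs
    dead = [0] * n_pages              # dead HOT-chain members per page
    hops = 0
    for _ in range(n_ops):
        page = random.randrange(n_pages)
        if random.random() < read_fraction:   # a SELECT touches the page:
            hops += dead[page]                # it must walk existing chains
            if prune_on_select:
                dead[page] = 0                # opportunistic prune on read
        else:                                 # an UPDATE touches the page:
            dead[page] = 1                    # page is pruned, but the
                                              # superseded tuple version
                                              # starts a new chain
    return hops, sum(dead)


hops_skip, _ = simulate(prune_on_select=False)
hops_prune, _ = simulate(prune_on_select=True)
# With 95% reads, readers do far more chain-walking when SELECTs never prune,
# because each chain survives until the next (rare) UPDATE of its page.
```

The model ignores cleanup-lock contention, WAL cost, and page fill, so it only shows the direction of the effect, not its magnitude.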
Re: [HACKERS] [COMMITTERS] pgsql: Move pg_upgrade from contrib/ to src/bin/
--On 15. April 2015 15:02:05 -0400 Andrew Dunstan and...@dunslane.net wrote: We've handled the buildfarm being red for a few days before. People are usually good about applying fixes fairly quickly. Took me some time to get to this due to my mail backlog, but I've done the hotfix for dotterel and forced a run for all branches on this box. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting src/test/modules in MSVC builds
On 04/16/2015 02:46 AM, Michael Paquier wrote: Hi all, As mentioned previously (cab7npqscphafxs2rzeb-fbccjqiknqxjhloztkggim1mf5x...@mail.gmail.com), attached are patches to add support for src/test/modules in MSVC builds, modules whose tests have not been supported since they were moved from contrib/: - 0001 adds support for the build portion. - 0002 adds support for the installation. I arrived at the conclusion that those modules should be installed by default, because contribcheck relies on the fact that tests should be run on a server already running, and modulescheck should do the same IMO. - 0003 adds a new target modulescheck in vcregress. Patches 0001 and 0003 are based on some previous work from Alvaro, which I modified slightly after testing. Note that this split is done to test each step on the buildfarm for safety. Regards, Thanks for doing this. It looks good, and if you've tested it I'm satisfied. I suggest that we apply patches 1 and 2 immediately. AIUI they don't require any changes to the buildfarm, as the MSVC build process will automatically build and install the modules with these changes. Then if all goes well we can apply the third patch, and I'll fix the buildfarm client for the forthcoming release to run the tests on MSVC builds. Nothing will break in the meantime - the tests just won't get run until the new client version is deployed. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On Thu, Apr 16, 2015 at 2:23 AM, Andres Freund and...@anarazel.de wrote: I'm, completely independent of logical decoding, of the *VERY* strong opinion that 'speculative insertions' should never be visible when looking with normal snapshots. For one it allows to simplify considerations around wraparound (which has proven to be a very good idea, c.f. multixacts + vacuum causing data corruption years after it was thought to be harmless). For another it allows to reclaim/redefine the bit after a database restart/upgrade. Given how complex this is and how scarce flags are that seems like a really good property. And avoiding those flags to be visible to the outside requires a WAL record, otherwise it won't be correct on the standby. I'm a bit distracted here, and not sure exactly what you mean. What's a normal snapshot? Do you just mean that you think that speculative insertions should be explicitly affirmed by a second record (making it not a speculative tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has no chance of seeing a tuple until it was affirmed by a second in-place modification, regardless of tuple xmin xact commit status? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: default_index_tablespace
I'm afraid this idea is a nonstarter, because it will break existing applications, and in particular existing pg_dump output files, which expect to be able to determine an index's tablespace by setting default_tablespace. (It is *not* adequate that the code falls back to default_tablespace if the new GUC is unset; if it is set, you've still broken pg_dump.) Got it. Thank you for the feedback. The incremental value, if indeed there is any, of being able to control index positioning this way seems unlikely to justify a backwards-compatibility break of such magnitude. I can see why someone would want this because random I/O, which is frequent for indexes, is much faster on SSD than magnetic disks. (Sequential I/O is more similar for the two.) The general idea is something I've brought up previously also (I believe it was even discussed at the Dev meeting in, uh, 2013?) so I'm not anxious to simply dismiss it, but it'd certainly have to be done correctly.. Any suggestions on how to do it properly? Does Greg Stark's suggestion (at CAM-w4HPOASwsQMdGZqjyFHNubbUnWrUAo8ibci-97UKU=po...@mail.gmail.com) make sense to you? This approach might suffer from the same problem as mine, though. It seems to me, IMVHO, a limitation in pg_dump ---since 8.0 when tablespace support for CREATE INDEX was implemented--- that we should fix. Keeping backwards compatibility is indeed required; I just did not think about pg_dump at all :( I don't mind reworking the patch or redoing it completely once there is a viable solution. Thanks, / J.L. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] initdb -S and tablespaces
Hi. Here's a variation of the earlier patch that follows all links in PGDATA. Does this look more like what you had in mind? -- Abhijit From d86888b0d2f5a3a57027d26ce050a3bbb58670d3 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen a...@2ndquadrant.com Date: Thu, 6 Nov 2014 00:45:56 +0530 Subject: Recursively fsync PGDATA at startup if needed It's needed if we need to perform crash recovery or if fsync was disabled at some point while the server was running earlier (and we must store that information in the control file). This is so that we don't lose older unflushed writes in the event of a power failure after crash recovery, where more recent writes are preserved. See 20140918083148.ga17...@alap3.anarazel.de for details. --- src/backend/access/transam/xlog.c | 168 src/bin/pg_controldata/pg_controldata.c | 2 + src/include/catalog/pg_control.h| 8 +- 3 files changed, 177 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2580996..263d2d0 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -835,6 +835,8 @@ static void WALInsertLockAcquireExclusive(void); static void WALInsertLockRelease(void); static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt); +static void fsync_pgdata(char *datadir); + /* * Insert an XLOG record represented by an already-constructed chain of data * chunks. 
This is a low-level routine; to construct the WAL record header
@@ -4811,6 +4813,7 @@ BootStrapXLOG(void)
 	ControlFile->checkPoint = checkPoint.redo;
 	ControlFile->checkPointCopy = checkPoint;
 	ControlFile->unloggedLSN = 1;
+	ControlFile->fsync_disabled = false;
 
 	/* Set important parameter values for use when replaying WAL */
 	ControlFile->MaxConnections = MaxConnections;
@@ -5868,6 +5871,27 @@ StartupXLOG(void)
 		ereport(FATAL,
 				(errmsg("control file contains invalid data")));
 
+	/*
+	 * If we need to perform crash recovery, we issue an fsync on the
+	 * data directory and its contents to try to ensure that any data
+	 * written before the crash are flushed to disk. Otherwise a power
+	 * failure in the near future might cause earlier unflushed writes
+	 * to be lost, even though more recent data written to disk from
+	 * here on would be persisted.
+	 *
+	 * We also do this if the control file indicates that fsync was
+	 * disabled at some point while the server was running earlier.
+	 */
+	if (enableFsync &&
+		(ControlFile->fsync_disabled ||
+		 (ControlFile->state != DB_SHUTDOWNED &&
+		  ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)))
+	{
+		fsync_pgdata(data_directory);
+		ControlFile->fsync_disabled = false;
+	}
+
 	if (ControlFile->state == DB_SHUTDOWNED)
 	{
 		/* This is the expected case, so don't be chatty in standalone mode */
@@ -8236,6 +8260,8 @@ CreateCheckPoint(int flags)
 	/* crash recovery should always recover to the end of WAL */
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
+	if (!enableFsync)
+		ControlFile->fsync_disabled = true;
 
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
@@ -11107,3 +11133,145 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(XLogCtl->info_lck);
 }
+
+/*
+ * Hint to the OS that it should get ready to fsync() this file.
+ *
+ * Adapted from pre_sync_fname in initdb.c
+ */
+static void
+pre_sync_fname(char *fname, bool isdir)
+{
+	int			fd;
+
+	fd = open(fname, O_RDONLY | PG_BINARY);
+
+	/*
+	 * Some OSs don't allow us to open directories at all (Windows returns
+	 * EACCES)
+	 */
+	if (fd < 0 && isdir && (errno == EISDIR || errno == EACCES))
+		return;
+
+	if (fd < 0)
+		ereport(FATAL,
+				(errmsg("could not open file \"%s\" before fsync",
+						fname)));
+
+	pg_flush_data(fd, 0, 0);
+
+	close(fd);
+}
+
+/*
+ * walkdir: recursively walk a directory, applying the action to each
+ * regular file and directory (including the named directory itself)
+ * and following symbolic links.
+ */
+static void
+walkdir(char *path, void (*action) (char *fname, bool isdir))
+{
+	DIR		   *dir;
+	struct dirent *de;
+
+	dir = AllocateDir(path);
+	while ((de = ReadDir(dir, path)) != NULL)
+	{
+		char		subpath[MAXPGPATH];
+		struct stat fst;
+
+		CHECK_FOR_INTERRUPTS();
+
+		if (strcmp(de->d_name, ".") == 0 ||
+			strcmp(de->d_name, "..") == 0)
+			continue;
+
+		snprintf(subpath, MAXPGPATH, "%s/%s", path, de->d_name);
+
+		if (lstat(subpath, &fst) < 0)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not stat file \"%s\": %m", subpath)));
+
+		if (S_ISREG(fst.st_mode))
+			(*action) (subpath, false);
+		else if (S_ISDIR(fst.st_mode))
+			walkdir(subpath, action);
+#ifndef WIN32
+		else if (S_ISLNK(fst.st_mode))
+#else
+		else if (pg_win32_is_junction(subpath))
+#endif
+		{
+#if defined(HAVE_READLINK) || defined(WIN32)
+			char		linkpath[MAXPGPATH];
+			int			len;
+			struct stat lst;
+
+			len =
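The behavior the patch's fsync_pgdata()/walkdir() pair implements — walk PGDATA recursively, follow symlinks such as the tablespace links under pg_tblspc, and fsync every regular file and directory — can be sketched roughly in Python (illustrative only; the function name and the error handling are simplifications of the C code):

```python
# Rough sketch of recursively fsyncing a data directory, following symlinks.
import os


def fsync_pgdata(path: str) -> int:
    """Fsync every file and directory under `path`, following symlinks
    (as walkdir does for tablespace links). Returns the number of paths
    fsynced. Caveat: followlinks=True would loop on a symlink cycle."""
    synced = 0

    def fsync_one(p: str) -> None:
        nonlocal synced
        try:
            # On POSIX a directory can be opened O_RDONLY and fsynced,
            # which flushes its entries; some OSs refuse (cf. the EACCES
            # escape hatch in pre_sync_fname), so just skip on failure.
            fd = os.open(p, os.O_RDONLY)
        except OSError:
            return
        try:
            os.fsync(fd)
            synced += 1
        finally:
            os.close(fd)

    for root, dirs, files in os.walk(path, followlinks=True):
        for name in files:
            fsync_one(os.path.join(root, name))
        fsync_one(root)   # flush the directory entry itself
    return synced
```

The point of following links rather than skipping them is exactly the tablespace case: data the server wrote may live outside PGDATA proper, and a crash-recovery fsync that ignored pg_tblspc links would miss it.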
[HACKERS] Disabling trust/ident authentication configure option
We have a customer using a patch to harden their PostgreSQL installation (see attached) that they would like to contribute. This patch adds the ability to disable trust and ident authentication at compile time via configure options, thus making it impossible to use these authentication methods for sloppy configuration purposes. This is critical to their software deployment, as stated in their use case description: snip PostgreSQL is deployed as part of a larger technical solution (e.g. a telecommunication system) and a field engineer has to install/upgrade this solution. The engineer is a specialist in the telco domain and has only little knowledge of databases and especially PostgreSQL setup. We now want to provide these kinds of users with pre-hardened packages that make it very hard to accidentally expose their database. For this purpose the patch allows one to optionally disable the trust and ident authentication methods. Especially the trust mechanism is critical, as it might actually provide useful functionality for our user. Think of an engineer who has to do a night-shift upgrade with a customer breathing down his neck to get the system back online. Circumventing all these authentication configuration issues by just enabling trust is very easy and looks well supported and documented (the documentation states the dangers, but there are no *big red flags* or anything). After finishing the upgrade the engineer forgets to restore the secure configuration, and a malicious attacker later uses this access method to gain access to the database. /snip Currently the patch adds new configure options --without-trust-auth and --without-ident-auth and makes peer authentication the default when these options are set. My testing shows that the regression tests (which normally use trust) still work. This works for Linux and friends; Windows is not addressed yet. Maybe it's worth considering making sspi the default on that platform instead.
There was a discussion some time ago ([1]), but I think the reasoning behind this patch is a little bit different from what was discussed there. [1] http://www.postgresql.org/message-id/CAN2Y=umt7cpkxzhaufw7szeckdwcwsuulmh4xphuxkqbtdu...@mail.gmail.com -- Thanks Bernd disable_trust_ident_configure.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On Thu, Apr 16, 2015 at 6:50 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pavan Deolasee wrote: On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote: From a holistic point of view the question is how many times a given HOT chain is going to need to be followed before it's pruned. Or to put it another way, how expensive is creating a HOT chain? Does it cause a single prune? A fixed number of chain readers followed by a prune? Does the amount of work depend on the workload or is it consistent? IMO the size or traversal of the HOT chain is not that expensive compared to the cost of pruning too frequently, which generates WAL as well as dirties buffers. OTOH the cost of less frequent pruning could also be very high. It can cause severe table bloat which may just stay for a very long time. Even if dead space is recovered within a page, truncating a bloated heap is not always possible. I think you're failing to consider that in the patch there is a distinction between read-only page accesses and page updates. During a page update, HOT cleanup is always done even with the patch, so there won't be any additional bloat that would not be there without the patch. Ah, OK. I'd not read the patch. But now that I do, I feel much more comfortable with the change. In fact, I wonder if it's enough to just do a full HOT prune for target relations and none at all for the other relations involved in the query. My apologies if this was decided based on discussions upthread; I haven't read the entire thread yet. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On 04/16/2015 12:18 PM, Andres Freund wrote: On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote: Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding thread now, and I have to say that IMHO it's a lot more sane to handle this in ReorderBufferCommit() like Peter first did, than to make the main insertion path more complex like this. I don't like Peter's way much. For one it's just broken. For another it's quite annoying to trigger disk reads to figure out what actual type of record something is. If we go that way that's the way I think it should be done: Whenever we encounter a speculative record we 'unlink' it from the changes that will be reused for spooling from disk and do nothing further. Then we just continue reading through the records. If the next thing we encounter is a super-deletion we throw away that record. If it's another type of change (or even bettter a 'speculative insertion succeeded' record) insert it. That'll still require some uglyness, but it should not be too bad. Sounds good to me. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On Thu, Apr 16, 2015 at 2:18 AM, Andres Freund and...@anarazel.de wrote: On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote: Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding thread now, and I have to say that IMHO it's a lot more sane to handle this in ReorderBufferCommit() like Peter first did, than to make the main insertion path more complex like this. I don't like Peter's way much. For one it's just broken. For another it's quite annoying to trigger disk reads to figure out what actual type of record something is. If we go that way that's the way I think it should be done: Whenever we encounter a speculative record we 'unlink' it from the changes that will be reused for spooling from disk and do nothing further. Then we just continue reading through the records. You mean we continue iterating through *changes* from ReorderBufferCommit()? If the next thing we encounter is a super-deletion we throw away that record. If it's another type of change (or even bettter a 'speculative insertion succeeded' record) insert it. By insert it, I gather you mean report to the the logical decoding plugin as an insert change. That'll still require some uglyness, but it should not be too bad. So, to be clear, what you have in mind is sort of a hybrid between my first and second approaches (mostly my first approach). We'd have coordination between records originally decoded into changes, maybe intercepting them during xact reassembly, like my first approach. We'd mostly do the same thing as the first approach, actually. The major difference would be that there'd be explicit speculative affirmation WAL records. But we wouldn't rely on those affirmation records within ReorderBufferCommit() - we'd rely on the *absence* of a super deletion WAL record (to report an insert change to the decoding plugin). To emphasize, like my first approach, it would be based on an *absence* of a super deletion WAL record. 
However, like my second approach, there would be a speculative affirmation WAL record. The new speculative affirmation WAL record would however be quite different to what my second approach to logical decoding (the in-place update record thing that was in V3.3) actually did. In particular, it would be far more simple, and the tuple would be built from the original insertion record within logical decoding. Right now, I'm tired, so bear with me. What I think I'm not quite getting here is how the new type of affirmation record affects visibility (or anything else, actually). Clearly dirty snapshots need to see the record to set a speculative insertion token for their caller to wait on (and HeapTupleSatisfiesVacuum() cannot see the speculatively inserted tuple as reclaimable, of course). They need this *before* the speculative insertion is affirmed, of course. Maybe you mean: If the speculative insertion xact is in progress, then the tuple is visible to several types of snapshots (dirty, vacuum, self, any). If it is not, then tuples are not visible because they are only speculative (and not *confirmed*). If they were confirmed, and the xact was committed, then those tuples are logically and physically indistinguishable from tuples that were inserted in the ordinary manner. Is that it? Basically, the affirmation records have nothing much to do with logical decoding in particular. But you still need to super delete, so that several types of snapshots ((dirty, vacuum, self, any) *stop* seeing the tuple as visible independent of the inserting xact being in progress. I earlier thought that'd not be ok because there could be a new catalog snapshot inbetween, but I was mistaken: The locking on the source transaction prevents problems. I thought this was the case. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
On 04/16/2015 01:01 PM, David G. Johnston wrote: If this is not covered adequately enough in the documentation then that should be remedied. Did you evaluate the documentation in that light while preparing your blog post? Your response seems very defensive. I'm unclear on why discussing improving an error message should get such a hostile response; it certainly doesn't invite a serious reply. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
Folks:

SELECT
  device_id,
  count(*)::INT as present,
  count(*)::INT FILTER (WHERE valid) as valid_count,
  mode()::INT WITHIN GROUP (order by val) as mode,
  percentile_disc(0.5)::INT WITHIN GROUP (order by val) as median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;

ERROR:  syntax error at or near "FILTER"
LINE 4:   count(*)::INT FILTER (WHERE valid) as valid_count,

The error is right, that's invalid syntax. I can't insert a ::INT between the aggregate() and FILTER. However, the error message is also rather confusing to the user; they're likely to look for their mistake in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I think). Is there some kind of possible HINT we could add to make this easier to debug? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
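For what it's worth, the query does parse if each aggregate expression, including its FILTER or WITHIN GROUP clause, is wrapped in parentheses before the cast, since ::INT binds to the function call alone otherwise. A sketch against the same dataflow_0913 table (assuming val is a numeric-ish column, as in the original):

```sql
-- The cast must apply to the complete aggregate expression,
-- so parenthesize the aggregate together with its FILTER / WITHIN GROUP clause:
SELECT device_id,
       count(*)::INT AS present,
       (count(*) FILTER (WHERE valid))::INT AS valid_count,
       (mode() WITHIN GROUP (ORDER BY val))::INT AS mode,
       (percentile_disc(0.5) WITHIN GROUP (ORDER BY val))::INT AS median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;
```

That is, FILTER and WITHIN GROUP are part of the aggregate call's grammar, not something that can follow an arbitrary expression, which is why the cast has to go outside.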
Re: [HACKERS] Turning off HOT/Cleanup sometimes
Andres Freund wrote: On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote: I think you're failing to consider that in the patch there is a distinction between read-only page accesses and page updates. During a page update, HOT cleanup is always done even with the patch, so there won't be any additional bloat that would not be there without the patch. That's not really true (and my benchmark upthread proves it). The fact that hot pruning only happens when we can get a cleanup lock means that we can end up with more pages that are full, if we prune on select less often. Especially if SELECTs are more frequent than write accesses - pretty darn common - the likelihood of SELECTs getting the lock is correspondingly higher. Interesting point. Of course, this code should count HOT cleanups against the total limit when they are effectively carried out, and ignore those that are skipped because of inability to acquire the cleanup lock. Not sure whether the submitted code does that. Can we keep stats on how many pages we don't clean in the updating process due to failure to acquire the cleanup lock? My intuition says that it should be similar to the number of backends running concurrently, but that might be wrong. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus j...@agliodbs.com wrote: On 04/16/2015 01:01 PM, David G. Johnston wrote: If this is not covered adequately enough in the documentation then that should be remedied. Did you evaluate the documentation in that light while preparing your blog post? Your response seems very defensive. I'm unclear on why discussing improving an error message should get such a hostile response; it certainly doesn't invite a serious reply. Mostly because I have no clue how difficult it would be... I'm all for improvement, but obviously if it were that simple you would have just proposed the desired wording and asked someone to commit it. I was trying to understand (and communicate my understanding of) why it wasn't done better in the first place. Beyond that there is still the need to teach the user the correct syntax of these functions - and not only point out when they get it wrong so they blindly fix their typo - in order to better avoid the situation where the syntax is valid but the outcome is unexpected. Two separate problems, of which the latter seems more important than the former - and probably somewhat easier to implement. David J.
Re: [HACKERS] inherit support for foreign tables
On Wed, Apr 15, 2015 at 09:35:05AM +0900, Kyotaro HORIGUCHI wrote: Hi, Before suppressing the symptom, I doubt the necessity and/or validity of giving foreign tables an ability to be a parent. Is there any reasonable usage for the ability? I think we should choose to inhibit foreign tables from becoming a parent rather than leaving it allowed then taking measures for the consequent symptom. I have a use case for having foreign tables as non-leaf nodes in a partitioning hierarchy, namely geographic. One might have a table at HQ called foo_world, then partitions under it called foo_jp, foo_us, etc., in one level, foo_us_ca, foo_us_pa, etc. in the next level, and on down, each in general in a separate data center. Is there something essential about having non-leaf nodes as foreign tables that's a problem here? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
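For concreteness, the geographic hierarchy described above could be set up along these lines. This is only a sketch: the server names (us_dc, ca_dc, pa_dc) are illustrative, and it assumes foreign tables are permitted as inheritance parents, which is the very point under discussion.

```sql
-- Top-level table at HQ.
CREATE TABLE foo_world (region text, payload text);

-- Mid-level node: a foreign table that is itself a parent,
-- living in the US data center.
CREATE FOREIGN TABLE foo_us () INHERITS (foo_world) SERVER us_dc;

-- Leaf nodes, one per state, each in its own data center.
CREATE FOREIGN TABLE foo_us_ca () INHERITS (foo_us) SERVER ca_dc;
CREATE FOREIGN TABLE foo_us_pa () INHERITS (foo_us) SERVER pa_dc;
```

Queries against foo_world would then fan out to every level, and queries against foo_us to just the US leaves.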
Re: [HACKERS] Supporting src/test/modules in MSVC builds
Andrew Dunstan wrote: Thanks for doing this. Yes, much appreciated. It looks good, and if you've tested it I'm satisfied. I suggest that we apply patches 1 and 2 immediately. AIUI they don't require any changes to the buildfarm, as the MSVC build process will automatically build and install it with these changes. Okay, I just pushed patches 1 and 2. Then if all goes well we can apply the third patch and I'll fix the buildfarm client for the forthcoming release to run the tests on MSVC builds. Nothing will break in the meantime - the tests just won't get run until the new client version is deployed. Will wait for a day or so to see what the Win buildfarm members say. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Assertion failure when streaming logical changes
On 04/07/2015 03:54 PM, Andres Freund wrote: On 2015-04-07 17:22:12 +0800, Craig Ringer wrote: It might be a good idea to apply this if nothing better is forthcoming. Logical decoding in WALsenders is broken at the moment. Yes. Committed. - Heikki
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
On 04/16/2015 01:18 PM, David G. Johnston wrote: On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus j...@agliodbs.com wrote: On 04/16/2015 01:01 PM, David G. Johnston wrote: If this is not covered adequately enough in the documentation then that should be remedied. Did you evaluate the documentation in that light while preparing your blog post? Your response seems very defensive. I'm unclear on why discussing improving an error message should get such a hostile response; it certainly doesn't invite a serious reply. Mostly because I have no clue how difficult it would be... I'm all for improvement but obviously if it was that simple you would have just proposed the desired wording and asked someone to commit it. I was trying to understand (and communicate my understanding) why it wasn't done better in the first place. Right. First, I don't know that it's possible for the parser to distinguish that error from actual mistakes in the filter clause, or just straight gibberish. Second, I'm not sure how to write a succinct HINT which gets the point across. However, the error we're giving for windowing suggests that we can improve things. In the example you quote: SQL Error: ERROR: OVER specified, but ceil is not a window function nor an aggregate function LINE 1: SELECT ceil(count(*)) OVER () ... that's actually a much clearer error. However, we still get the same error with casting: josh=# select count(*)::INT OVER (order by device_id, val) from dataflow_0913 ; ERROR: syntax error at or near OVER LINE 1: select count(*)::INT OVER (order by device_id, val) from dat... ... so maybe we can't. Beyond that there is still the need to teach the user the correct syntax of these functions - and not only point out when they get it wrong so they blindly fix their typo - in order to better avoid the situation where the syntax is valid but the outcome is unexpected.
Two separate problems, of which the latter seems more important than the former - and probably somewhat easier to implement. Absolutely. Hence my blog post; I'm hoping that people will at least get a Google hit on the error text. The problem is, in the main docs, how do we reasonably document *combining* features? In Syntax? My head hurts just thinking about it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
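For what it's worth, the window-function variant of the cast does parse once the whole window expression is parenthesized, so the cast applies to the expression as a unit rather than to the aggregate call alone (same hypothetical dataflow_0913 table as in the example above):

```sql
-- Parenthesizing makes ::int apply to the complete window expression,
-- which the grammar accepts:
SELECT (count(*) OVER (ORDER BY device_id, val))::int
FROM dataflow_0913;
```

That doesn't help the error message, of course, but it is the form users need to be steered toward.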
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
On Thu, Apr 16, 2015 at 12:22 PM, Josh Berkus j...@agliodbs.com wrote: Folks: SELECT device_id, count(*)::INT as present, count(*)::INT FILTER (WHERE valid) as valid_count, mode()::INT WITHIN GROUP (order by val) as mode, percentile_disc(0.5)::INT WITHIN GROUP (order by val) as median FROM dataflow_0913 GROUP BY device_id ORDER BY device_id; ERROR: syntax error at or near FILTER LINE 4: count(*)::INT FILTER (WHERE valid) as valid_count, The error is right, that's invalid syntax. I can't insert a ::INT between the aggregate() and FILTER. However, the error message is also rather confusing to the user; they're likely to look for their mistake in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I think). SELECT count(*)::int OVER () FROM ( VALUES (1),(2),(3) ) src; Is there some kind of possible HINT we could add to make this easier to debug? Do you have a suggested hint so that the effort to make it work can be compared to its usefulness? For kicks I ran the following - since val::type is simply another syntax for type(val)... SELECT ceil(count(*)) OVER () FROM ( VALUES (1),(2),(3) ) src SQL Error: ERROR: OVER specified, but ceil is not a window function nor an aggregate function LINE 1: SELECT ceil(count(*)) OVER () The non-hint has been around as long as window functions and hasn't really come up as an issue - not enough so to motivate a change at least. Is the idiomatic usage of FILTER and WITHIN GROUP making this error more likely? The foot-gun in your blog post is more problematic but also seemingly impossible to avoid except through education of the user. It would not be unreasonable to accept that the current error is acting like a canary and forcing the user to go read the documentation on OVER/FILTER/WITHIN GROUP and learn to write the expression as a single unit. If this is not covered adequately enough in the documentation then that should be remedied. Did you evaluate the documentation in that light while preparing your blog post? David J.
Re: [HACKERS] Turning off HOT/Cleanup sometimes
On 16 April 2015 at 15:21, Andres Freund and...@anarazel.de wrote: On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote: I think you're failing to consider that in the patch there is a distinction between read-only page accesses and page updates. During a page update, HOT cleanup is always done even with the patch, so there won't be any additional bloat that would not be there without the patch. That's not really true (and my benchmark upthread proves it). The fact that hot pruning only happens when we can get a cleanup lock means that we can end up with more pages that are full, if we prune on select less often. Especially if SELECTs are more frequent than write accesses - pretty darn common - the likelihood of SELECTs getting the lock is correspondingly higher. Your point that we *must* do *some* HOT cleanup on SELECTs is proven beyond question. Alvaro has not disputed that; ISTM you misread him. Pavan has questioned that point, but the results upthread are there; he explains he hasn't read them yet. The only question is how much cleanup to do on SELECT. Having one SELECT hit 10,000 cleanups while another hits 0 creates an unfairness and unpredictability in the way we work. Maybe some people running a backup actually like the fact that it cleans the database; others think that is a bad thing. Few people issuing large queries think it is good behaviour. Anybody running replication also knows that this causes a huge slam of WAL which can increase replication delay, which is a concern for HA. That is how we arrive at the idea of a cleanup limit, further enhanced by a limit that applies only to dirtying clean blocks, which we have 4? recent votes in favour of. I would personally be in favour of a parameter to control the limit, since whatever we chose would be right/wrong depending upon circumstances. I am however comfortable with not having a parameter if people think it would be hard to tune, which I agree it would be, hence no parameter in the patch.
[HACKERS] Performance tuning assisted by a GUI application
Hello, (Please pardon me if this is off-topic and I should send it to another mailing list instead.) I had a brief discussion on #postgresql and thought that perhaps there might be a need for a tool that would enable fine-tuning of PostgreSQL performance settings by conveniently testing them with a sample SQL query, with the aid of a simple GUI application. To illustrate this, I created this little proof of concept: https://gist.github.com/d33tah/d01f3599e55e53d00f68 A screenshot can be found here: https://imgur.com/TguH6Xq This is far from even alpha quality and would need some basic Python knowledge to set up the connection string. It only supports modifying cpu_tuple_cost right now, but I guess you can extrapolate where this would go. What do you think? Would anybody be interested in an application like this? Cheers, Jacek Wielemborek
Re: [HACKERS] Supporting src/test/modules in MSVC builds
Andrew Dunstan wrote: On 04/16/2015 07:42 PM, Michael Paquier wrote: Then if all goes well we can apply the third patch and I'll fix the buildfarm client for the forthcoming release to run the tests on MSVC builds. Nothing will break in the meantime - the tests just won't get run until the new client version is deployed. Will wait for a day or so to see what the Win buildfarm members say. currawong and mastodon seem satisfied. Yeah. I think we can push the third one - I am just about ready to release the new buildfarm client. Pushed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] pg_upgrade in 9.5 broken for adminpack
pg_upgrade was recently broken for upgrades from a system with adminpack installed. The breaking commit is: commit 30982be4e5019684e1772dd9170aaa53f5a8e894 Author: Peter Eisentraut pete...@gmx.net Integrate pg_upgrade_support module into backend From pg_upgrade_dump_12870.log: pg_restore: creating EXTENSION adminpack pg_restore: creating COMMENT EXTENSION adminpack pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 2806; 0 0 COMMENT EXTENSION adminpack pg_restore: [archiver (db)] could not execute query: ERROR: extension adminpack does not exist Command was: COMMENT ON EXTENSION adminpack IS 'administrative functions for PostgreSQL'; I get the same error whether the source database is 9.2.10 or 9.5.HEAD. Cheers, Jeff
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Hanada-san, I merged explain patch into foreign_join patch. Now v12 is the latest patch. It contains many garbage lines... Please ensure the patch is correctly based on the latest master + custom_join patch. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -Original Message- From: Shigeru HANADA [mailto:shigeru.han...@gmail.com] Sent: Thursday, April 16, 2015 5:06 PM To: Kaigai Kouhei (海外 浩平) Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown; pgsql-hackers@postgreSQL.org Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API) Kaigai-san, On 2015/04/15 22:33, Kouhei Kaigai kai...@ak.jp.nec.com wrote: Oops, that's right. Attached is the revised version. I chose the fully qualified name, schema.relname [alias], for the output. It would waste some cycles during planning if that is not for EXPLAIN, but it seems difficult to get a list of relation names in the ExplainForeignScan() phase, because the planning information has gone away at that time. I understand. The private data structure of postgres_fdw is not designed to keep tree-structured data (like a relations join tree), so this seems to me a straightforward way to implement the feature. I have a small suggestion. This patch makes deparseSelectSql initialize the StringInfoData if supplied; however, that shall usually be a task of the function caller, not the callee. In this case, I'd prefer initStringInfo(relations) next to the line initStringInfo(sql) in postgresGetForeignPlan. In my sense, it is a bit strange to pass uninitialized StringInfoData to get a text form. @@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root, */ initStringInfo(sql); deparseSelectSql(sql, root, baserel, fpinfo->attrs_used, remote_conds, -params_list, fdw_ps_tlist, retrieved_attrs); +params_list, fdw_ps_tlist, retrieved_attrs, relations); /* * Build the fdw_private list that will be available in the executor. Agreed.
If the caller passes a buffer, it should be initialized by the caller. In addition to your idea, I added a check that the RelOptInfo is a JOINREL, because a BASEREL doesn't need relations for its EXPLAIN output. Also, could you merge the EXPLAIN output feature into the main patch? I think there is no reason to split this feature. I merged the explain patch into the foreign_join patch. Now v12 is the latest patch. -- Shigeru HANADA shigeru.han...@gmail.com
Re: [HACKERS] Supporting src/test/modules in MSVC builds
On Fri, Apr 17, 2015 at 4:47 AM, Alvaro Herrera wrote: Andrew Dunstan wrote: It looks good, and if you've tested it I'm satisfied. I suggest that we apply patches 1 and 2 immediately. AIUI they don't require any changes to the buildfarm, as the MSVC build process will automatically build and install it with these changes. Okay, I just pushed patches 1 and 2. Cool, thanks. Then if all goes well we can apply the third patch and I'll fix the buildfarm client for the forthcoming release to run the tests on MSVC builds. Nothing will break in the meantime - the tests just won't get run until the new client version is deployed. Will wait for a day or so to see what the Win buildfarm members say. currawong and mastodon seem satisfied. -- Michael
Re: [HACKERS] Supporting src/test/modules in MSVC builds
On 04/16/2015 07:42 PM, Michael Paquier wrote: Then if all goes well we can apply the third patch and I'll fix the buildfarm client for the forthcoming release to run the tests on MSVC builds. Nothing will break in the meantime - the tests just won't get run until the new client version is deployed. Will wait for a day or so to see what the Win buildfarm members say. currawong and mastodon seem satisfied. Yeah. I think we can push the third one - I am just about ready to release the new buildfarm client.
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Fujita-san, On 16-04-2015 PM 08:40, Etsuro Fujita wrote: From what I see in Tom's commit message[0] for the FTI patch, this shouldn't be, right? To be specific, there should be a Foreign Scan there as per the commit. Am I missing something? As shown in the example below, this patch doesn't change the EXPLAIN output for non-pushed-down update (delete) cases, but since we changed the EXPLAIN output as discussed in [1], the patch doubly displays Foreign Update (Foreign Delete) for pushed-down update (delete) cases like the above example.

postgres=# explain verbose update parent set c1 = trunc(random() * 9 + 1)::int;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Update on public.parent  (cost=0.00..452.06 rows=5461 width=6)
   Update on public.parent
   Foreign Update on public.ft1
     Remote SQL: UPDATE public.t1 SET c1 = $2 WHERE ctid = $1
   Foreign Update on public.ft2
     Remote SQL: UPDATE public.t2 SET c1 = $2 WHERE ctid = $1
   ->  Seq Scan on public.parent  (cost=0.00..0.01 rows=1 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, parent.ctid
   ->  Foreign Scan on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft1.ctid
         Remote SQL: SELECT ctid FROM public.t1 FOR UPDATE
   ->  Foreign Scan on public.ft2  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft2.ctid
         Remote SQL: SELECT ctid FROM public.t2 FOR UPDATE
(14 rows)

I think I missed the point that you were talking about the result after the patch for foreign update pushdown (which is the topic of this thread) has been applied. Sorry about the noise. By the way, one suggestion may be to attach a (pushed down) to the ModifyTable's Foreign Update. And in that case, there would be no mention of the corresponding scan node in the list below, exactly because there would be none.
postgres=# explain verbose update parent set c1 = c1;
                            QUERY PLAN
------------------------------------------------------------------
 Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
   Update on public.parent
   Foreign Update (pushed down) on public.ft1
   Foreign Update (pushed down) on public.ft2
   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
         Output: parent.c1, parent.ctid

Thoughts? Thanks, Amit
Re: [HACKERS] pg_upgrade in 9.5 broken for adminpack
On Thu, Apr 16, 2015 at 07:33:50PM -0700, Jeff Janes wrote: pg_upgrade was recently broken for upgrades from a system with adminpack installed. The breaking commit is: commit 30982be4e5019684e1772dd9170aaa53f5a8e894 Author: Peter Eisentraut pete...@gmx.net Integrate pg_upgrade_support module into backend From pg_upgrade_dump_12870.log: pg_restore: creating EXTENSION adminpack pg_restore: creating COMMENT EXTENSION adminpack pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 2806; 0 0 COMMENT EXTENSION adminpack pg_restore: [archiver (db)] could not execute query: ERROR: extension adminpack does not exist Command was: COMMENT ON EXTENSION adminpack IS 'administrative functions for PostgreSQL'; I get the same error whether the source database is 9.2.10 or 9.5.HEAD. Uh, I am confused how moving pg_upgrade or pg_upgrade_support would break the loading of the adminpack extension. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] reparsing query
It is not difficult to output the parsed query in some tool-readable format, but it comes with a maintenance overhead: once tools rely on it, we have to conform to some schema continuously, like the xml/xmlns. Do we want to take this on? It depends on how far the tools can go with the exposed information. I think part of the problems could be resolved by using an adequate API to the parse tree. For example, pgpool-II imports PostgreSQL's raw parser to do query rewriting. The parse tree format can change from release to release, but the essential API for it (for example, tree_walker) is very stable. As a result, the rewriting code rarely needs to be modified. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
Josh Berkus j...@agliodbs.com writes: ERROR: syntax error at or near FILTER LINE 4: count(*)::INT FILTER (WHERE valid) as valid_count, The error is right, that's invalid syntax. I can't insert a ::INT between the aggregate() and FILTER. However, the error message is also rather confusing to the user; they're likely to look for their mistake in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I think). Is there some kind of possible HINT we could add to make this easier to debug? Probably not; or at least, IMO it would be a fundamental misjudgment to go in with the idea of improving this one single case. It'd be cool if we could improve reporting of syntax errors in general, though. Unfortunately Bison doesn't provide a whole heckuva lot of support for that. The only simple thing it offers is %error-verbose, which I've experimented with in the past and come away with the impression that it'd be completely useless for most people :-(. It seems to basically add information about which tokens would be valid next at the point of the error report. That isn't any help for the sort of user conceptual error you're talking about here. You could imagine teaching yyerror() to have some SQL-specific knowledge that it could apply by comparing the current lookahead token to the current parse state stack ... but neither of those things are exposed to it by Bison. We could probably get the lookahead token indirectly by inspecting the lexer's state, but it's not clear that's enough to do much. In this particular example, that would mean showing the exact same hint whenever a syntax error is reported at an OVER token; and I'm afraid that it'd be mighty hard to write a hint for that that wouldn't frequently do more harm than good. (Note in particular that since OVER, FILTER, and WITHIN are unreserved keywords, they might be used as simple column/table/function names.) 
regards, tom lane
Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?
On 04/16/2015 04:36 PM, Tom Lane wrote: You could imagine teaching yyerror() to have some SQL-specific knowledge that it could apply by comparing the current lookahead token to the current parse state stack ... but neither of those things are exposed to it by Bison. We could probably get the lookahead token indirectly by inspecting the lexer's state, but it's not clear that's enough to do much. In this particular example, that would mean showing the exact same hint whenever a syntax error is reported at an OVER token; and Yeah, and that's what we *don't* want to do. That's why I was wondering if it was even possible to figure out the extra syntax case. I'm afraid that it'd be mighty hard to write a hint for that that wouldn't frequently do more harm than good. (Note in particular that since OVER, FILTER, and WITHIN are unreserved keywords, they might be used as simple column/table/function names.) No kidding. Heck, I just spent half an hour figuring out a bug which was actually caused by the fact that a user had created a view called mode. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] [PERFORM] pushing order by + limit to union subqueries
On Sat, Feb 28, 2015 at 8:24 AM, Tom Lane t...@sss.pgh.pa.us wrote: There would be cases where that would be a win, and there would be cases where it wouldn't be, so I'd not be in favor of making the transformation blindly. Unfortunately, given the current state of the planner that's all we could do really, because the subqueries are planned at arm's length and then we just mechanically combine them. Doing it right would entail fully planning each subquery twice, which would be very expensive. Yes, after pulling up, subqueries are planned independently and we glue them together at the end. I have a longstanding desire to rewrite the upper levels of the planner to use path generation and comparison, which should make it more practical for the planner to compare alternative implementations of UNION and other top-level constructs. But I've been saying I would do that for several years now, so don't hold your breath :-( Greenplum uses the Cascades optimizer framework (also used in SQL Server and some others) to make the optimizer more modular and extensible. In our context here, it allows thorough optimization without pre-defined boundaries - no planning subqueries separately and then gluing them together. Is that something along the lines of what you have in mind? Regards, Qingqing
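Until the planner does this itself, the transformation under discussion can be applied by hand; a sketch with hypothetical tables t1 and t2, each assumed to have a sortable column x:

```sql
-- Original form: each UNION ALL arm is planned at arm's length,
-- so the ORDER BY + LIMIT is not pushed down into the arms.
SELECT x FROM (
    SELECT x FROM t1
    UNION ALL
    SELECT x FROM t2
) u
ORDER BY x LIMIT 10;

-- Hand-pushed form: each arm keeps only its own top 10, so at most
-- 20 rows reach the final sort. As Tom notes, this is only a win
-- when the arms can satisfy ORDER BY x cheaply, e.g. via an index on x.
SELECT x FROM (
    (SELECT x FROM t1 ORDER BY x LIMIT 10)
    UNION ALL
    (SELECT x FROM t2 ORDER BY x LIMIT 10)
) u
ORDER BY x LIMIT 10;
```

The two forms return the same rows, which is what would let a smarter planner choose between them on cost.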