Re: [HACKERS] Bogus sizing parameters in some AllocSetContextCreate calls
On Sat, Aug 27, 2016 at 2:08 PM, Tom Lane wrote:
> The standard calling pattern for AllocSetContextCreate is
>
>     cxt = AllocSetContextCreate(parent, name,
>                                 ALLOCSET_DEFAULT_MINSIZE,
>                                 ALLOCSET_DEFAULT_INITSIZE,
>                                 ALLOCSET_DEFAULT_MAXSIZE);
>
> or for some contexts you might s/DEFAULT/SMALL/g if you expect the context
> to never contain very much data.  I happened to notice that there are a
> few calls that get this pattern wrong.  After a bit of grepping, I found:
>
> hba.c lines 389, 2196:
>         ALLOCSET_DEFAULT_MINSIZE,
>         ALLOCSET_DEFAULT_MINSIZE,
>         ALLOCSET_DEFAULT_MAXSIZE);
>
> guc-file.l line 146:
>         ALLOCSET_DEFAULT_MINSIZE,
>         ALLOCSET_DEFAULT_MINSIZE,
>         ALLOCSET_DEFAULT_MAXSIZE);
>
> brin.c line 857:
>         ALLOCSET_SMALL_INITSIZE,
>         ALLOCSET_SMALL_MINSIZE,
>         ALLOCSET_SMALL_MAXSIZE);
>
> autovacuum.c line 2184:
>         ALLOCSET_DEFAULT_INITSIZE,
>         ALLOCSET_DEFAULT_MINSIZE,
>         ALLOCSET_DEFAULT_MAXSIZE);
>
> typcache.c lines 757, 842:
>         ALLOCSET_SMALL_INITSIZE,
>         ALLOCSET_SMALL_MINSIZE,
>         ALLOCSET_SMALL_MAXSIZE);
>
> In all of these cases, we're passing zero as the initBlockSize, which is
> invalid, but AllocSetContextCreate silently forces it up to 1K.  So there's
> no failure, but there may be some inefficiency due to small block sizes of
> early allocation blocks.  I seriously doubt that's intentional; in some of
> these cases it might be sane to use the SMALL parameters instead, but this
> isn't a good way to get that effect.  The last four cases are also passing
> a nonzero value as the minContextSize, forcing the context to allocate at
> least that much instantly and hold onto it over resets.  That's inefficient
> as well, though probably only minorly so.

I noticed this a while back while playing with my alternate allocator, but
wasn't sure how much of it was intentional.  Anyway, +1 for doing something
to clean this up.  Your proposal sounds OK, but maybe
AllocSetContextCreateDefault() and AllocSetContextCreateSmall() would be
nice and easier to remember.
Also, I think we ought to replace this code in aset.c:

    initBlockSize = MAXALIGN(initBlockSize);
    if (initBlockSize < 1024)
        initBlockSize = 1024;
    maxBlockSize = MAXALIGN(maxBlockSize);

With this:

    Assert(initBlockSize >= 1024 && initBlockSize == MAXALIGN(initBlockSize));
    Assert(maxBlockSize == MAXALIGN(maxBlockSize));

This might save a few cycles, which wouldn't be unwelcome given that
AllocSetContextCreate does show up in profiles, but more importantly I
think it's completely inappropriate for this function to paper over
whatever errors may be made by callers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksum error and VACUUM FULL
On Thu, Aug 25, 2016 at 5:45 AM, Tatsuki Kadomoto
<tatsuki.kadom...@proceranetworks.com> wrote:
> I faced an incorrect checksum on "global/pg_filenode.map" at the right
> timing "VACUUM FULL" is executed, and the session was aborted.
>
> Aug 16 20:51:19 postgres[22329]: [2-1] FATAL: relation mapping file
> "global/pg_filenode.map" contains incorrect checksum
>
> Aug 16 20:51:19 postgres[22329]: [2-2] STATEMENT: SELECT
> id,readbm,writebm,survbm,timeout FROM Users WHERE username='packetlogicd'
> AND password=md5('x')
>
> I'm reading the comment in src/backend/utils/cache/relmapper.c:
>
> ===
> Therefore mapped catalogs can only be relocated by operations such as
> VACUUM FULL and CLUSTER, which make no transactionally-significant changes:
> it must be safe for the new file to replace the old, even if the
> transaction itself aborts.
> ===
>
> Does this comment mean it's expected to get this kind of checksum error if
> the timing is really bad?

I don't think so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On Fri, Aug 26, 2016 at 12:39 AM, Andres Freund wrote:
> Maybe I'm missing something here - but why would we need to do any of
> that? The WAL already isn't compatible between versions, and we don't
> reuse the old server's WAL anyway? Isn't all that's needed relaxing some
> error check?

Yeah.  If this change is made in a new major version - and how else would
we do it? - it doesn't introduce any incompatibility that wouldn't be
present already.  pg_upgrade doesn't (and can't) migrate WAL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] increasing the default WAL segment size
On Fri, Aug 26, 2016 at 4:45 AM, Eduardo Morras wrote:
> From my ignorance, could block size affect this WAL size increase?
>
> In the past (I haven't tried with >9 versions) you could change the disk
> block size from 8KB to 16KB or 32KB (or 4KB) by modifying BLCKSZ 8192 in
> src/include/pg_config.h and recompiling. (There are some mails in
> 1999-2002 about this topic)

Yeah, I think that's still supposed to work, although I don't know whether
anyone has tried it lately.  It is a separate topic from this issue, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Bogus sizing parameters in some AllocSetContextCreate calls
Robert Haas writes:
> Also, I think we ought to replace this code in aset.c:
>     initBlockSize = MAXALIGN(initBlockSize);
>     if (initBlockSize < 1024)
>         initBlockSize = 1024;
>     maxBlockSize = MAXALIGN(maxBlockSize);
> With this:
>     Assert(initBlockSize >= 1024 && initBlockSize == MAXALIGN(initBlockSize));
>     Assert(maxBlockSize == MAXALIGN(maxBlockSize));

Good idea --- if we'd had it that way, these errors would never have gotten
committed in the first place.  I'm for doing that only in HEAD though.

regards, tom lane
Re: [HACKERS] Bogus sizing parameters in some AllocSetContextCreate calls
On 08/28/2016 04:41 PM, Tom Lane wrote:
> Robert Haas writes:
>> Also, I think we ought to replace this code in aset.c:
>>     initBlockSize = MAXALIGN(initBlockSize);
>>     if (initBlockSize < 1024)
>>         initBlockSize = 1024;
>>     maxBlockSize = MAXALIGN(maxBlockSize);
>> With this:
>>     Assert(initBlockSize >= 1024 && initBlockSize == MAXALIGN(initBlockSize));
>>     Assert(maxBlockSize == MAXALIGN(maxBlockSize));
>
> Good idea --- if we'd had it that way, these errors would never have
> gotten committed in the first place.  I'm for doing that only in HEAD
> though.

Then maybe also add

    Assert(initBlockSize <= maxBlockSize);

And perhaps also an assert making sure the minContextSize value makes sense
with respect to the min/max block sizes?

I'm however pretty sure this will have absolutely no impact on profiles.
It might save a few cycles, but this only runs when creating the memory
context, and all profiles I've seen are about palloc/pfree.  It might
matter when creating many memory contexts, but then you probably have
other things to worry about.

regards

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Bogus sizing parameters in some AllocSetContextCreate calls
Tomas Vondra writes:
> I'm however pretty sure this will have absolutely no impact on profiles.

Yeah, I do not believe that either.  Also, I think that the intent of the
existing code is to defend against bad parameters even in non-assert
builds.  Which might argue for turning these into test-and-elog rather
than asserts.

regards, tom lane
Re: [HACKERS] Bogus sizing parameters in some AllocSetContextCreate calls
On 08/28/2016 04:59 PM, Tom Lane wrote:
> Tomas Vondra writes:
>> I'm however pretty sure this will have absolutely no impact on profiles.
>
> Yeah, I do not believe that either.  Also, I think that the intent of
> the existing code is to defend against bad parameters even in non-assert
> builds.  Which might argue for turning these into test-and-elog rather
> than asserts.

I agree.  There's a fair number of contrib modules, and when they compile
just fine people are unlikely to test them with assert-enabled builds.  So
they would not notice any issues, but it'd get broken as we've removed the
code that fixes the values for them.

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] PostgreSQL Version 10, missing minor version
The routine in PostGIS to parse out the version number from pg_config is
breaking in the 10 cycle.  The issue seems to be that there is no minor
version specified, e.g.

    pg_config --version

returns:

    PostgreSQL 10devel

instead of the expected:

    PostgreSQL 10.0devel

Is this the way it's going to be, or will there be a .0 tacked on at the
end before release?

Thanks,
Regina
Re: [HACKERS] PostgreSQL Version 10, missing minor version
On 08/28/2016 09:55 AM, Regina Obe wrote:
> The routine in PostGIS to parse out the version number from pg_config is
> breaking in the 10 cycle.
>
> Issue seems to be because there is no minor specified.
>
> e.g.
>
>     pg_config --version
>
> returns:
>
>     PostgreSQL 10devel
>
> Instead of expected
>
>     PostgreSQL 10.0devel
>
> Is this the way it's going to be or will there be a .0 tacked at the end
> before release?

Given the version numbering change, postgres version 10 is equivalent to
version 9.6 (i.e. the "major" version number), and 10.0 is equivalent to
9.6.0 (i.e. the "major.minor" version).  So I suspect that given...

    pg_config --version
    PostgreSQL 9.5.3

    pg_config --version
    PostgreSQL 9.6beta4

... you will see Postgres 10 go through the stages...

    pg_config --version
    PostgreSQL 10devel

    pg_config --version
    PostgreSQL 10beta1

    pg_config --version
    PostgreSQL 10.0

HTH,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
Re: [HACKERS] PostgreSQL Version 10, missing minor version
"Regina Obe" writes:
> The routine in PostGIS to parse out the version number from pg_config is
> breaking in the 10 cycle

TBH, I wonder why you are doing that in the first place; it does not seem
like the most reliable source of version information.  If you need to do
compile-time tests, PG_CATALOG_VERSION is considered the best thing to
look at, or VERSION_NUM in Makefiles.  However, if you're dead set on
getting a version number out of a string representation, you'll need to
make changes similar to commit 69dc5ae408f68c302029a6b43912a2cc16b1256c.

regards, tom lane
Re: [HACKERS] RLS related docs
On 05/30/2016 01:56 PM, Joe Conway wrote: > On 05/26/2016 12:26 AM, Dean Rasheed wrote: >> On 25 May 2016 at 02:04, Joe Conway wrote: >>> Please see attached two proposed patches for the docs related to RLS: >>> >>> 1) Correction to pg_restore >>> 2) Additional mentions that "COPY FROM" does not allow RLS to be enabled >>> >>> Comments? >>> >> >> The pg_restore change looks good -- that was clearly wrong. >> >> Also, +1 for the new note in pg_dump. > > Great, thanks! > >> For COPY, I think perhaps it would be more logical to put the new note >> immediately after the third note which describes the privileges >> required, since it's kind of related, and then we can talk about the >> RLS policies required, e.g.: >> >> If row-level security is enabled for the table, COPY table TO is >> internally converted to COPY (SELECT * FROM table) TO, and the >> relevant security policies are applied. Currently, COPY FROM is not >> supported for tables with row-level security. > > This sounds better than what I had, so I will do it that way. Apologies for the delay, but new patch attached. Assuming no more comments, will commit this, backpatched to 9.5, in a day or two. Thanks, Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 07e2f45..af15fd1 100644 *** a/doc/src/sgml/ref/copy.sgml --- b/doc/src/sgml/ref/copy.sgml *** COPY coun *** 419,424 --- 419,434 + If row-level security is enabled for the table, COPY + table TO is + internally converted to COPY (SELECT * FROM + table) TO ..., + and the relevant security policies are applied. Currently, + COPY FROM is not supported for tables with row-level + security. Use equivalent INSERT statements instead. + + + Files named in a COPY command are read or written directly by the server, not by the client application. 
Therefore, they must reside on or be accessible to the database server machine,

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index be1b684..4fa925c 100644 *** a/doc/src/sgml/ref/pg_dump.sgml --- b/doc/src/sgml/ref/pg_dump.sgml *** PostgreSQL documentation *** 699,704 --- 699,709 to dump the parts of the contents of the table that they have access to. + + Note that if you use this option currently, you probably also want + the dump be in INSERT format, as the + COPY FROM during restore does not support row security. +

diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index c906919..ef5bab4 100644 *** a/doc/src/sgml/ref/pg_restore.sgml --- b/doc/src/sgml/ref/pg_restore.sgml *** *** 527,533 Note that this option currently also requires the dump be in INSERT ! format, as COPY TO does not support row security. --- 527,533 Note that this option currently also requires the dump be in INSERT ! format, as COPY FROM does not support row security.
Re: [HACKERS] src/include/catalog/pg_foreign_table.h still refers genbki.sh
Tomas Vondra writes:
> I've noticed the comment in src/include/catalog/pg_foreign_table.h still
> talks about genbki.sh, but AFAIK it was replaced with genbki.pl.

Hmm, somebody must have copied-and-pasted this header comment from some
old catalog include file.  Fixed, thanks.

regards, tom lane
Re: [HACKERS] distinct estimate of a hard-coded VALUES list
On Mon, Aug 22, 2016 at 10:19 AM, Robert Haas wrote:
> On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane wrote:
>> Jeff Janes writes:
>>> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane wrote:
>>>> It does know it, what it doesn't know is how many duplicates there are.
>>>
>>> Does it know whether the count comes from a parsed query-string
>>> list/array, rather than being an estimate from something else?  If it
>>> came from a join, I can see why it would be dangerous to assume they
>>> are mostly distinct.  But if someone throws 6000 things into a query
>>> string and only 200 distinct values among them, they have no one to
>>> blame but themselves when it makes bad choices off of that.
>>
>> I am not exactly sold on this assumption that applications have
>> de-duplicated the contents of a VALUES or IN list.  They haven't been
>> asked to do that in the past, so why do you think they are doing it?

I think most people wouldn't go out of their way to de-duplicate simply
because the way they get the list is inherently deduplicated in the first
place, or at least is not going to routinely have massive redundancy.  If
I thought it was likely to have massive redundancy, then I would add code
to deduplicate just because I don't want to lug redundant data around
inside my own app server code, regardless of what it does to PostgreSQL's
planner.

We can be sure that someone somewhere is sending huge lists with only a
couple hundred distinct values, but I don't think we should let that block
changes.  I think the more serious source of regressions will be cases
where the current bad distinct estimate is cancelling some other bad
estimate elsewhere in the planner, yielding an efficient plan by accident.
Improving the estimate could push the plan over to a bad choice, leading
to a large regression upon upgrading.
For that reason, I think it is better not to try to get your patch (which
I tested and looks good to me, except that it still isn't picking the best
plan for unrelated reasons) into 9.6 after beta4, but just put it into
10.0.  I don't know when most people with stringent SLAs start validating
the performance of new versions, but I suspect some are already doing so
before beta4, and it wouldn't be nice to change this on them.

> It's hard to know, but my intuition is that most people would
> deduplicate.  I mean, nobody is going to want their query generator
> to send X IN (1, 1, ) to the server if it
> could have just sent X IN (1).

Yes, I agree with that.  But note that this isn't currently relevant to
X IN (1) anyway.  'X IN (list)' gets planned as if it were
"X=ANY(ARRAY...)", which goes through different machinery than
"X=ANY(VALUES...)".

There is a 9-year-old to-do item which proposes (if I understand it
correctly) to change this so that "X=ANY(ARRAY...)" goes through the same
plan as "X=ANY(VALUES...)".  A problem with doing that is that currently
people can, in lieu of planner hints, re-write their queries between the
two ways of expressing them, in case one way gives a better plan.  If they
were always planned identically, it would take away that safety hatch.

The most conservative thing, given the current state, would be to fix
"X=ANY(ARRAY...)" so that it builds a hash table out of the ARRAY, and
then probes that table for each X, rather than iterating over the ARRAY
for each X, which is what it currently does.  That way you would get the
benefit of using a hash, without the risk of tipping over into some
unsuitable join order in the process.  It should have no user-facing
downsides, just the ugliness of having two ways to implement fundamentally
the same thing.

So basically this line of a plan:

    Filter: (aid = ANY ('{...}'))

currently glosses over a nested loop over the array.  We could make it
gloss over a hash probe into the "array" instead.

Cheers,

Jeff
[HACKERS] Reminder: 9.6rc1 wraps tomorrow
Just making sure everyone's on the same page.

regards, tom lane
[HACKERS] comment fix for CUSTOMPATH_* flags
Hello,

I noticed the source code comment around the CustomPath structure says
"see above" for the definition of the CUSTOMPATH_* flags.  It was
originally right, but the definition was moved to nodes/extensible.h
during later development, so it is no longer above.  The attached patch
corrects the comment to point at the right location.

Thanks,

--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei

pgsql-v9.6-custom-flags-comments-fixup.patch
[HACKERS] VACUUM's ancillary tasks
The attached two patches scratch two itches I've been having for a while.
I'm attaching them together because the second depends on the first.

Both deal with the fact that [auto]vacuum has taken on more roles than its
original purpose.

Patch One: autovacuum insert-heavy tables

If you have a table that mostly receives INSERTs, it will never get
vacuumed because there are no (or few) dead rows.  I have added an
"inserts_since_vacuum" field to PgStat_StatTabEntry which works exactly
the same way as "changes_since_analyze" does.

The reason such a table needs to be vacuumed is currently twofold: the
visibility map is not updated, slowing down index-only scans; and BRIN
indexes are not maintained, rendering them basically useless.

Patch Two: autovacuum after table rewrites

This patch addresses the absurdity that a standard VACUUM is required
after a VACUUM FULL because the visibility map gets blown away.  This is
also the case for CLUSTER and some versions of ALTER TABLE that rewrite
the table.

I thought about having those commands do the same work themselves, but it
seems better to have them simply trigger autovacuum than quadruplicate the
work.  I do this by having them fill in the "inserts_since_vacuum" field
added in Patch One with the number of rows rewritten.  This assumes that
autovacuum_vacuum_scale_factor is < 1.0, which hopefully is a safe
assumption.

While doing this, I noticed that ALTER TABLE should also re-analyze the
table for obvious reasons, so I have that one set "changes_since_analyze"
to the number of rows rewritten as well.

I have not included any kind of test suite here because I don't really
have any ideas how to go about it in a sane way.  Suggestions welcome.

Attention reviewer: Please note that some of the documentation in the
first patch gets removed by the second patch, in case they both don't get
committed.

I have added this to the imminent commitfest.  These patches are rebased
as of 26fa446.
--
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr  PostgreSQL : Expertise, Formation et Support

autovac_01_v01.patch
autovac_02_v01.patch
[HACKERS] Sample configuration files
We have sample configuration files for postgresql.conf and recovery.conf,
but we do not have them for contrib modules.  This patch attempts to add
them.

Although the patch puts the sample configuration files in the tree, it
doesn't try to install them.  That's partly because I think it would need
an extension version bump and that doesn't seem worth it.

Also, while writing this patch, it crossed my mind that the plpgsql
variables should probably be in the main postgresql.conf file because we
install it by default.  I will happily write that patch if desired,
independently of this patch.

Current as of 26fa446, added to next commitfest.

--
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr  PostgreSQL : Expertise, Formation et Support

sample_confs_v01.patch
[HACKERS] PassDownLimitBound for ForeignScan/CustomScan
Hello,

The attached patch adds an optional callback to support special
optimization when ForeignScan/CustomScan are located under a Limit node in
the plan tree.

Our Sort node wisely switches its behavior when we know in advance the
exact number of rows to be produced, because all the Sort node has to
return is the top-k rows when it is located under a Limit node.  That is a
much lighter workload than sorting the entire input when nrows is not
small.

In my case, this information is very useful because a GPU can complete its
sorting operations mostly in L1-grade memory if we know in advance that
the top-k value is small enough to fit in the fast memory.

Probably, it is also valuable for Fujita-san's case, because this
information allows attaching a "LIMIT k" clause to the remote query of
postgres_fdw.  It will reduce the amount of network traffic and remote CPU
consumption once we get support for sort pushdown.

One thing we need to pay attention to is cost estimation at the planner
stage.  In the existing code, only create_ordered_paths() and
create_merge_append_path() consider the limit clause for cost estimation
of sorting.  They use the 'limit_tuples' field of PlannerInfo; we can
reference that structure when an extension adds a ForeignPath/CustomPath,
so I think we don't need a special enhancement on the planner stage.

Thanks,

--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei

pgsql-v10-fdw-css-limit-bound.v1.patch
Re: [HACKERS] VACUUM's ancillary tasks
Hi,

On 2016-08-29 03:26:06 +0200, Vik Fearing wrote:
> The attached two patches scratch two itches I've been having for a
> while.  I'm attaching them together because the second depends on the
> first.
>
> Both deal with the fact that [auto]vacuum has taken on more roles than
> its original purpose.
>
> Patch One: autovacuum insert-heavy tables
>
> If you have a table that mostly receives INSERTs, it will never get
> vacuumed because there are no (or few) dead rows.  I have added an
> "inserts_since_vacuum" field to PgStat_StatTabEntry which works exactly
> the same way as "changes_since_analyze" does.
>
> The reason such a table needs to be vacuumed is currently twofold: the
> visibility map is not updated, slowing down index-only scans; and BRIN
> indexes are not maintained, rendering them basically useless.

It might be worthwhile to look at
http://archives.postgresql.org/message-id/CAMkU%3D1zGu5OshfzxKBqDmxxKcoDJu4pJux8UAo5h7k%2BGA_jS3Q%40mail.gmail.com
there's definitely some overlap.

> Patch Two: autovacuum after table rewrites
>
> This patch addresses the absurdity that a standard VACUUM is required
> after a VACUUM FULL because the visibility map gets blown away.  This is
> also the case for CLUSTER and some versions of ALTER TABLE that rewrite
> the table.

I think this should rather be fixed by maintaining the VM during cluster.
IIRC there was an attempt late in the 9.5 cycle, but Bruce (IIRC) ran out
of steam.  And nobody picked it up again ... :(

Greetings,

Andres Freund
[HACKERS] VACUUM's ancillary tasks
On Monday, 29 August 2016, Andres Freund wrote:
> On 2016-08-29 03:26:06 +0200, Vik Fearing wrote:
>> The attached two patches scratch two itches I've been having for a
>> while.  I'm attaching them together because the second depends on the
>> first.
>>
>> Both deal with the fact that [auto]vacuum has taken on more roles than
>> its original purpose.
>>
>> Patch One: autovacuum insert-heavy tables
>>
>> If you have a table that mostly receives INSERTs, it will never get
>> vacuumed because there are no (or few) dead rows.  I have added an
>> "inserts_since_vacuum" field to PgStat_StatTabEntry which works exactly
>> the same way as "changes_since_analyze" does.
>>
>> The reason such a table needs to be vacuumed is currently twofold: the
>> visibility map is not updated, slowing down index-only scans; and BRIN
>> indexes are not maintained, rendering them basically useless.
>
> It might be worthwhile to look at
> http://archives.postgresql.org/message-id/CAMkU%3D1zGu5OshfzxKBqDmxxKcoDJu4pJux8UAo5h7k%2BGA_jS3Q%40mail.gmail.com
> there's definitely some overlap.
>
>> Patch Two: autovacuum after table rewrites
>>
>> This patch addresses the absurdity that a standard VACUUM is required
>> after a VACUUM FULL because the visibility map gets blown away.  This is
>> also the case for CLUSTER and some versions of ALTER TABLE that rewrite
>> the table.
>
> I think this should rather be fixed by maintaining the VM during
> cluster.

+1

> IIRC there was an attempt late in the 9.5 cycle, but Bruce (IIRC) ran
> out of steam.  And nobody picked it up again ... :(

It may be worth looking at
https://www.postgresql.org/message-id/CAMkU%3D1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ%40mail.gmail.com
I've updated this patch to apply to current HEAD, and can propose it for
pg10.

Regards,

--
Masahiko Sawada
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On 24 August 2016 at 03:10, Robert Haas wrote:
> On Tue, Aug 23, 2016 at 12:59 PM, Craig Ringer wrote:
>> Also fine by me.  You're right, keep it simple.  It means the potential
>> set of values isn't discoverable the same way, but ... meh.  Using it
>> usefully means reading the docs anyway.
>>
>> The remaining 2 patches of interest are attached - txid_status() and
>> txid_convert_if_recent().  Thanks for committing
>> txid_current_if_assigned().
>>
>> Now I'd best stop pretending I'm in a sensible timezone.
>
> I reviewed this version some more and found some more problems.

Thanks.  It took me a few days to prep a new patch as I found another
issue in the process.  Updated patch attached.

The updated series starts (0001) with a change to slru.c to release the
control lock when throwing an exception, so that we don't deadlock with
ourselves when re-entering slru.c; explanation below.  Then there's the
txid_status (0002) patch with fixes, then txid_convert_if_recent (0003).

I omitted txid_incinerate(); I have an updated version that sets xact
status to aborted for burned xacts instead of leaving them at 0
(in-progress), but haven't had time to finish it, so it doesn't try to
blindly set the status of xacts on pages where it didn't hold XidGenLock
when the page was added to the clog.

> +uint32 xid_epoch = (uint32)(xid_with_epoch >>32);
> +TransactionId xid = (TransactionId)(xid_with_epoch);
>
> I think this is not project style.  In particular, I think that the
> first one needs a space after the cast and another space before the
> 32; and I think the second one has an unnecessary set of parentheses
> and needs a space added.

OK, no problems.  I didn't realise spacing around casts was specified.

> +/*
> + * Underlying implementation of txid_status, which is mapped to an enum in
> + * system_views.sql.
> + */
>
> Not any more...
That's why I shouldn't revise a patch at 1am ;)

> +if (TransactionIdIsCurrentTransactionId(xid) ||
> TransactionIdIsInProgress(xid))
> +status = gettext_noop("in progress");
> +else if (TransactionIdDidCommit(xid))
> +status = gettext_noop("committed");
> +else if (TransactionIdDidAbort(xid))
> +status = gettext_noop("aborted");
> +else
> +/* shouldn't happen */
> +ereport(ERROR,
> +(errmsg_internal("unable to determine commit status of
> xid "UINT64_FORMAT, xid)));
>
> Maybe I'm all wet here, but it seems like there might be a problem
> here if the XID is older than the CLOG truncation point but less than
> one epoch old. get_xid_in_recent_past only guarantees that the
> transaction is less than one epoch old, not that we still have CLOG
> data for it.

Good point. The call would then fail with something like

ERROR: could not access status of transaction 778793573
DETAIL: could not open file "pg_clog/02E6": No such file or directory

This probably didn't come up in my wraparound testing because I'm aggressively forcing wraparound by writing a lot of clog very quickly under XidGenLock, and because I'm mostly looking at xacts that are either recent or past the xid boundary. I've added better coverage for xacts around 2^30 behind the nextXid to the wraparound tests; I can't add them to txid.sql since the xid never gets that far in normal regression testing.

What I'd really like is to be able to ask transam.c to handle the xid_in_recent_past logic, treating an attempt to read an xid from beyond the clog truncation threshold as a soft error indicating unknown xact state. But that involves delving into slru.c, and I really, really don't want to touch that for what should be a simple and pretty noninvasive utility function.

A PG_TRY to trap ERRCODE_UNDEFINED_FILE seems like it'd be sufficient, except for two issues:

* I see no accepted way to access the errcode etc from within the PG_CATCH block, though. PG_RE_THROW() can do it because pg_re_throw() is in elog.c.
I couldn't find any existing code that seems to check details about an error thrown in a PG_TRY block, only SPI calls. I don't want to ignore all types of errors and potentially hide problems, so I just used geterrcode() - which is meant for errcontext callbacks - and changed the comment to say it can be used in PG_CATCH too. I don't see why it shouldn't be. We should probably have some sort of PG_CATCH_INFO(varname) that exposes the top ErrorData, but that's not needed for this patch so I left it alone. * TransactionIdGetStatus() releases the CLogControlLock taken by SimpleLruReadPage_ReadOnly() on normal exit but not after an exception thrown from SlruReportIOError(). It seems appropriate for SimpleLruReadPage() to release the LWLock before calling SlruReportIOError(), so I've done that as a separate patch (now 0001). I also removed the TransactionIdInProgress check in txid_status and just assume it's in progress if it isn't our xid, committed or aborted. TransactionIdInProgress looks like it's potentially more expensive, a
Re: [HACKERS] PostgreSQL Version 10, missing minor version
On 29 August 2016 at 02:52, Tom Lane wrote: > "Regina Obe" writes: >> The routine in PostGIS to parse out the version number from pg_config is >> breaking in the 10 cycle > > TBH, I wonder why you are doing that in the first place; it does not > seem like the most reliable source of version information. If you > need to do compile-time tests, PG_CATALOG_VERSION is considered the > best thing to look at, or VERSION_NUM in Makefiles. This is my cue to pop up and say that if you're looking at the startup message you have to use the version string, despite it not being the most reliable source of information, because we don't send server_version_num ;) Patch attached. Yes, I know PostGIS doesn't use it, but it makes no sense to tell people not to parse the server version out in some situations then force them to in others. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 9e6fce06b07ca8e272a6125c9a75ac2ba7714d03 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Mon, 29 Aug 2016 11:31:52 +0800 Subject: [PATCH] Report server_version_num alongside server_version in startup packet We expose PG_VERSION_NUM in Makefiles and headers and in pg_settings, but the equivalent server_version_num isn't sent in the startup packet so clients must rely on parsing the server_version. Make server_version_num GUC_REPORT so clients can use server_version_num if present and fall back to server_version for older PostgreSQL versions. 
--- src/backend/utils/misc/guc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index c5178f7..36e3604 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -2767,7 +2767,7 @@ static struct config_int ConfigureNamesInt[] = {"server_version_num", PGC_INTERNAL, PRESET_OPTIONS, gettext_noop("Shows the server version as an integer."), NULL, - GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_REPORT }, &server_version_num, PG_VERSION_NUM, PG_VERSION_NUM, PG_VERSION_NUM, -- 2.5.5 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
Hi, On 2016-08-29 11:25:39 +0800, Craig Ringer wrote: > ERROR: could not access status of transaction 778793573 > DETAIL: could not open file "pg_clog/02E6": No such file or directory > > What I'd really like is to be able to ask transam.c to handle the > xid_in_recent_past logic, treating an attempt to read an xid from > beyond the clog truncation threshold as a soft error indicating > unknown xact state. But that involves delving into slru.c, and I > really, really don't want to touch that for what should be a simple > and pretty noninvasive utility function. Can't you "just" check this against ShmemVariableCache->oldestXid while holding appropriate locks? > A PG_TRY to trap ERRCODE_UNDEFINED_FILE seems like it'd be sufficient, > except for two issues: It seems like a bad idea to PG_CATCH and not re-throw an error. That generally is quite error prone. At the very least locking and such gets a lot more complicated (as you noticed below). > * TransactionIdGetStatus() releases the CLogControlLock taken by > SimpleLruReadPage_ReadOnly() on normal exit but not after an exception > thrown from SlruReportIOError(). It seems appropriate for > SimpleLruReadPage() to release the LWLock before calling > SlruReportIOError(), so I've done that as a separate patch (now 0001). We normally prefer to handle this via the "bulk" releases in the error handlers. It's otherwise hard to write code that handles these situations reliably. It's different for spinlocks, but those normally protect far smaller regions of code. Andres
Re: [HACKERS] PostgreSQL Version 10, missing minor version
On 2016-08-29 11:41:00 +0800, Craig Ringer wrote: > On 29 August 2016 at 02:52, Tom Lane wrote: > > "Regina Obe" writes: > >> The routine in PostGIS to parse out the version number from pg_config is > >> breaking in the 10 cycle > > > > TBH, I wonder why you are doing that in the first place; it does not > > seem like the most reliable source of version information. If you > > need to do compile-time tests, PG_CATALOG_VERSION is considered the > > best thing to look at, or VERSION_NUM in Makefiles. > > This is my cue to pop up and say that if you're looking at the startup > message you have to use the version string, despite it not being the > most reliable source of information, because we don't send > server_version_num ;) > > Patch attached. Yes, I know PostGIS doesn't use it, but it makes no > sense to tell people not to parse the server version out in some > situations then force them to in others. If they're using pg_config atm, that seems unlikely to be related. That sounds more like a build time issue - there won't be a running server.
Re: [HACKERS] Why is a newly created index contains the invalid LSN?
On 8/26/16 4:17 PM, Andres Freund wrote:
> On 2016-08-26 18:46:42 +0300, Yury Zhuravlev wrote:
>> Thanks all. Now I understand that LSNs are strongly connected with WAL.
>> However, how difficult would it be to put the last system LSN instead of
>> 0? It's not so important, but it would make the use of LSNs more
>> consistent.
>
> Maybe explain why you're interested in page lsns, that'd perhaps allow
> us to give more meaningful feedback.

Yeah, especially since you mentioned this being for backups. I suspect you *want* those WAL records marked with 0, because that tells you that you can't rely on WAL when you back that data up.
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/26/16 4:08 PM, Andres Freund wrote: Splitting of ephemeral data seems to have a benefit, the rest seems more like rather noisy busywork to me. People accidentally blowing away pg_clog or pg_xlog is a pretty common occurrence, and I don't think there's all that many tools that reference them. I think it's well worth renaming them. I actually like the idea of re-organizing everything too. It would be easy to include a tool that puts symlinks in place to all the original locations if you need that. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] WAL consistency check facility
Thank you. I've updated it accordingly.

On Sun, Aug 28, 2016 at 11:20 AM, Peter Geoghegan wrote:
> On Sat, Aug 27, 2016 at 9:47 PM, Amit Kapila wrote:
>> Right, I think there is no need to mask all the flags. However, apart
>> from BTP_HAS_GARBAGE, it seems we should mask BTP_SPLIT_END as that is
>> just used to save some processing for vacuum and won't be set after
>> crash recovery or on standby after WAL replay.
>
> Right you are -- while BTP_INCOMPLETE_SPLIT is set during recovery,
> BTP_SPLIT_END is not. Still, most of the btpo_flags flags that are
> masked in the patch shouldn't be.
>
> --
> Peter Geoghegan

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] patch proposal
On Fri, Aug 26, 2016 at 10:58 PM, Stephen Frost wrote: > * Venkata B Nagothi (nag1...@gmail.com) wrote: > > On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frost > wrote: > > > * Venkata B Nagothi (nag1...@gmail.com) wrote: > > > > This behaviour will be similar to that of > recovery_target="immediate" and > > > > can be aliased. > > > > > > I don't believe we're really going at this the right way. Clearly, > > > there will be cases where we'd like promotion at the end of the WAL > > > stream (as we currently have) even if the recovery point is not found, > > > but if the new option's "promote" is the same as "immediate" then we > > > don't have that. > > > > My apologies for the confusion. Correction - I meant, "promote" option > > would promote the database once the consistent point is reached at the > > end-of-the-WAL. > > "consistent point" and "end-of-the-WAL" are not the same. > > > > recovery_target = immediate, other > > > > > > recovery_action_target_found = promote, pause, shutdown > > > > This is currently supported by the existing parameter called > > "recovery_target_action" > > Right, sure, we could possibly use that, or create an alias, etc. > > > > recovery_action_target_not_found = promote, pause, shutdown > > > > This is exactly what newly proposed parameter will do. > > Then it isn't the same as 'recovery_target = immediate'. > > > > One question to ask is if we need to support an option for xid and time > > > related to when we realize that we won't find the recovery target. If > > > consistency is reached at a time which is later than the recovery > target > > > for time, what then? Do we go through the rest of the WAL and perform > > > the "not found" action at the end of the WAL stream? If we use that > > > approach, then at least all of the recovery target types are handled > the > > > same, but I can definitely see cases where an administrator might > prefer > > > an "error" option. 
> > Currently, this situation is handled by the recovery_target_inclusive
> > parameter.
>
> No, that's not the same.
>
> > Administrator can choose to stop the recovery at any consistent
> > point before or after the specified recovery target. Is this what you
> > were referring to?
>
> Not quite.
>
> > I might need a bit of clarification here: the parameter I am proposing
> > would be effective only if the specified recovery target is not reached,
> > and may not be effective if the recovery goes beyond the recovery target
> > point.
>
> Ok, let's consider this scenario:
>
> Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08
> Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00
> recovery_target is not set
> recovery_target_time = 2016-08-26 08:29:30
> recovery_target_inclusive = false
>
> In such a case, we will necessarily commit transactions which happened
> between 8:29:30 and 8:30:00 and only stop (I believe..) once we've
> reached consistency. We could possibly detect that case and throw an
> error instead. The same could happen with xid.

Yes, currently PG just recovers until consistency is reached. It makes sense to throw an error instead.

> Working through more scenarios would be useful, I believe. For example,
> what if the xid or time specified happened before the backup started?
>
> Basically, what I was trying to get at is that we might be able to
> detect that we'll never find a given recovery target point without
> actually replaying all of WAL, and wondering if we should provide options
> to control what to do in such a case.

Ah, yes, I got the point and I agree. Thanks for the clarification. Yes, good idea, and it is important to ensure PG errors out and warns the administrator with an appropriate message indicating that "a much earlier backup must be used..." if the specified recovery target xid, name or time is earlier than the backup time.

Regards,
Venkata B N
Fujitsu Australia
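To make the combinations discussed above concrete, a hypothetical recovery.conf could look like this. Note that recovery_action_target_not_found is only the parameter *proposed* in this thread and exists in no released PostgreSQL; the other settings are existing ones.

```
recovery_target_time = '2016-08-26 08:29:30'
recovery_target_inclusive = false
recovery_target_action = 'pause'               # existing: action when the target is reached
recovery_action_target_not_found = 'shutdown'  # proposed: action when it never is
```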
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 29 Aug 2016 12:10 PM, "Jim Nasby" wrote:
>
> On 8/26/16 4:08 PM, Andres Freund wrote:
>>
>> Splitting of ephemeral data seems to have a benefit, the rest seems more
>> like rather noisy busywork to me.
>
> People accidentally blowing away pg_clog or pg_xlog is a pretty common occurrence, and I don't think there's all that many tools that reference them. I think it's well worth renaming them.

Yeah. I've seen it happen at BIG production users who really should have known better. People won't see a README in amongst 5000 xlog segments while freaking out about the server being down.

I don't care if it comes as part of some greater reorg or not, but I'll be really annoyed if scope creep ends up killing the original proposal to just rename these dirs. I think that a simple rename should be done first. Then, if some greater reorg is to be done, it can be done shortly after. The only people that'll be upset are folks tracking early 10.0 dev, and they'll be aware it's coming.
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On Sun, Aug 28, 2016 at 11:25 PM, Craig Ringer wrote:
> What I'd really like is to be able to ask transam.c to handle the
> xid_in_recent_past logic, treating an attempt to read an xid from
> beyond the clog truncation threshold as a soft error indicating
> unknown xact state. But that involves delving into slru.c, and I
> really, really don't want to touch that for what should be a simple
> and pretty noninvasive utility function.

I think you're going to have to bite the bullet and do that, though, because ...

> A PG_TRY to trap ERRCODE_UNDEFINED_FILE seems like it'd be sufficient,

...I don't think this has any chance of being acceptable. You can't catch errors and not re-throw them. That's bad news.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] GiST penalty functions [PoC]
Hi, hackers!

Some time ago I put a patch in the commitfest that slightly optimizes GiST inserts and updates. Its effect in general is quite modest (about a 15% performance improvement), but for sorted data inserts are 3 times quicker. I attribute this effect to mitigation of deficiencies of the old split algorithm. Alexander Korotkov advised me to look at some features of the RR*-tree. The RR*-tree differs from the combination of GiST + the cube extension in two algorithms:
1. The algorithm to choose the subtree for insertion
2. The split algorithm

This message concerns the implementation of the first one in GiST. I think that both of these algorithms will improve querying performance.

Long story short, there are two problems in the choose-subtree algorithm.
1. GiST chooses the subtree via a penalty calculation for each entry, while the RR*-tree employs complicated conditional logic: when there are MBBs (Minimum Bounding Boxes) which do not require extension, the smallest of them is chosen; in some cases, entries are chosen not by volume, but by their margin.
2. The RR*-tree algorithm jumps through entry comparisons non-linearly; I think this means that a GiST penalty computation function would have to access other entries on the page.

In this message I address only the first problem. I want to construct a penalty function that will choose:
1. The entry with zero volume and smallest margin that can accommodate the insertion tuple without extension, if one exists;
2. The entry with smallest volume that can accommodate the insertion tuple without extension, if one exists;
3. The entry with zero volume increase after insert and smallest margin increase, if one exists;
4. The entry with minimal volume increase after insert.

Current cube behavior inspects only the 4th option, returning the MBB's volume increase as the penalty (a float). To implement all 4 options, I use a hack: the order of positive floats corresponds exactly to the order of integers with the same binary representation (I'm not sure this holds on every supported platform).
So I use two bits of float as if it were integer, and all others are used as shifted bits of corresponding float: option 4 – volume increase, 3 - margin increase, 2 – MBB volume, 1 – MBB margin. You can check the reinterpretation cast function pack_float() in the patch. I’ve tested patch performance with attached test. On my machine patch slows index construction time from 60 to 76 seconds, reduces size of the index from 85Mb to 82Mb, reduces time of aggregates computation from 36 seconds to 29 seconds. I do not know: should I continue this work in cube, or it’d be better to fork cube? Maybe anyone have tried already RR*-tree implementation? Science papers show very good search performance increase. Please let me know if you have any ideas, information or interest in this topic. Best regards, Andrey Borodin, Octonica & Ural Federal University. diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c index 3feddef..9c8c63d 100644 --- a/contrib/cube/cube.c +++ b/contrib/cube/cube.c @@ -96,6 +96,7 @@ bool cube_contains_v0(NDBOX *a, NDBOX *b); bool cube_overlap_v0(NDBOX *a, NDBOX *b); NDBOX *cube_union_v0(NDBOX *a, NDBOX *b); void rt_cube_size(NDBOX *a, double *sz); +void rt_cube_edge(NDBOX *a, double *sz); NDBOX *g_cube_binary_union(NDBOX *r1, NDBOX *r2, int *sizep); bool g_cube_leaf_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy); bool g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy); @@ -420,6 +421,13 @@ g_cube_decompress(PG_FUNCTION_ARGS) PG_RETURN_POINTER(entry); } +float pack_float(float actualValue, int realm) +{ + // two bits for realm, other for value + int realmAjustment = *((int*)&actualValue)/4; + int realCode = realm * (INT32_MAX/4) + realmAjustment; // we have 4 realms + return *((float*)&realCode); +} /* ** The GiST Penalty method for boxes @@ -428,6 +436,11 @@ g_cube_decompress(PG_FUNCTION_ARGS) Datum g_cube_penalty(PG_FUNCTION_ARGS) { + // REALM 0: No extension is required, volume is zerom return edge + // REALM 
1: No extension is required, return nonzero volume + // REALM 2: Volume extension is zero, return nonzero edge extension + // REALM 3: Volume extension is nonzero, return it + GISTENTRY *origentry = (GISTENTRY *) PG_GETARG_POINTER(0); GISTENTRY *newentry = (GISTENTRY *) PG_GETARG_POINTER(1); float *result = (float *) PG_GETARG_POINTER(2); @@ -440,6 +453,32 @@ g_cube_penalty(PG_FUNCTION_ARGS) rt_cube_size(ud, &tmp1); rt_cube_size(DatumGetNDBOX(origentry->key), &tmp2); *result = (float) (tmp1 - tmp2); + if(*result == 0) + { + double tmp3 = tmp1; + rt_cube_edge(ud, &tmp1); + rt_cube_edge(DatumGetNDBOX(origentry->key), &tmp2); + *result = (float) (tmp1 - tmp2); + if(*result == 0) + { +
[HACKERS] ecpg -? option doesn't work in windows
ecpg's --help alternative -? doesn't work on Windows. On Windows, the PG-provided getopt_long function is used for reading the provided options. getopt_long returns '?' for invalid characters as well, but on Linux and other platforms it sets optopt to 0 when the option character itself is '?'; with the PG-provided implementation used on Windows, optopt is not 0, which is why the check fails there. I believe it has been broken since commit 5b88b85c.

Rather than fixing the getopt_long function to reset optopt to zero whenever it returns '?', I prefer fixing ecpg to handle the version and help options separately. Patch is attached. If anyone prefers the getopt_long fix, I can produce a patch for that instead.

Regards,
Hari Babu
Fujitsu Australia
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Mon, Aug 29, 2016 at 2:36 PM, Craig Ringer wrote: > I don't care if it comes as part of some greater reorg or not but I'll be > really annoyed if scope creep lands up killing the original proposal to just > rename these dirs. I think that a simple rename should be done first. Then > if some greater reorg is to be done it can be done shortly after. The only > people that'll upset are folks tracking early 10.0 dev and they'll be aware > it's coming. Okay, so let's do it. Attached are two patches: - 0001 renames pg_clog to pg_trans. I have left clog.c with its current name, as well as its structures. That's the mechanical patch, the only interesting part being in pg_upgrade. - 0002 renames pg_xlog to pg_wal. -- Michael From 74ff97a9520f496e0df303e777ad93d5eca054f5 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 29 Aug 2016 15:05:46 +0900 Subject: [PATCH 1/2] Rename pg_clog to pg_trans pg_upgrade is updated to handle the transfer correctly to post-10 clusters. --- doc/src/sgml/backup.sgml | 4 +- doc/src/sgml/catalogs.sgml | 4 +- doc/src/sgml/config.sgml | 2 +- doc/src/sgml/func.sgml | 2 +- doc/src/sgml/maintenance.sgml | 8 ++-- doc/src/sgml/ref/pg_resetxlog.sgml | 4 +- doc/src/sgml/ref/pg_rewind.sgml| 2 +- doc/src/sgml/storage.sgml | 10 ++--- doc/src/sgml/wal.sgml | 2 +- src/backend/access/heap/heapam.c | 4 +- src/backend/access/transam/README | 12 +++--- src/backend/access/transam/clog.c | 2 +- src/backend/access/transam/commit_ts.c | 2 +- src/backend/access/transam/multixact.c | 2 +- src/backend/access/transam/subtrans.c | 4 +- src/backend/access/transam/transam.c | 2 +- src/backend/access/transam/twophase.c | 4 +- src/backend/access/transam/xact.c | 18 src/backend/access/transam/xlog.c | 2 +- src/backend/commands/vacuum.c | 10 ++--- src/backend/postmaster/autovacuum.c| 2 +- src/backend/storage/buffer/README | 2 +- src/backend/storage/ipc/procarray.c| 4 +- src/backend/utils/time/tqual.c | 6 +-- src/bin/initdb/initdb.c| 2 +- src/bin/pg_upgrade/exec.c | 79
+- src/bin/pg_upgrade/pg_upgrade.c| 30 +++-- src/include/access/slru.h | 4 +- 28 files changed, 127 insertions(+), 102 deletions(-) diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 0f09d82..18d4bd5 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -382,10 +382,10 @@ tar -cf backup.tar /usr/local/pgsql/data directories. This will not work because the information contained in these files is not usable without the commit log files, - pg_clog/*, which contain the commit status of + pg_trans/*, which contain the commit status of all transactions. A table file is only usable with this information. Of course it is also impossible to restore only a - table and the associated pg_clog data + table and the associated pg_trans data because that would render all other tables in the database cluster useless. So file system backups only work for complete backup and restoration of an entire database cluster. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 4e09e06..783d49c 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1847,7 +1847,7 @@ All transaction IDs before this one have been replaced with a permanent (frozen) transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction - ID wraparound or to allow pg_clog to be shrunk. Zero + ID wraparound or to allow pg_trans to be shrunk. Zero (InvalidTransactionId) if the relation is not a table. @@ -2514,7 +2514,7 @@ All transaction IDs before this one have been replaced with a permanent (frozen) transaction ID in this database. This is used to track whether the database needs to be vacuumed in order to prevent - transaction ID wraparound or to allow pg_clog to be shrunk. + transaction ID wraparound or to allow pg_trans to be shrunk. It is the minimum of the per-table pg_class.relfrozenxid values. 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7c483c6..c32da94 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -5816,7 +5816,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; Vacuum also allows removal of old files from the -pg_clog subdirectory, which is why the default +pg_trans subdirectory, which is why the default is a relatively low 200 million transactions. This parameter can only be set at server start, but the setting ca
Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data
On Fri, Aug 26, 2016 at 10:03 PM, Fujii Masao wrote: > On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolasee > wrote: >> >> >> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier >> wrote: >>> >>> >>> + /* >>> +* Compute targetRecOff. It should typically be greater than short >>> +* page-header since a valid record can't , but can also be zero >>> when >>> +* caller has supplied a page-aligned address or when we are >>> skipping >>> +* multi-page continuation record. It doesn't matter though >>> because >>> +* ReadPageInternal() will read at least short page-header worth >>> of >>> +* data >>> +*/ >>> This needs some polishing, there is an unfinished sentence here. >>> >>> + targetRecOff = tmpRecPtr % XLOG_BLCKSZ; >>> targetRecOff, pageHeaderSize and targetPagePtr could be declared >>> inside directly the new while loop. >> >> >> Thanks Michael for reviewing the patch. I've fixed these issues and new >> version is attached. > > The patch looks good to me. Barring any objections, > I'll push this and back-patch to 9.3 where pg_xlogdump was added. Pushed. Thanks! Regards, -- Fujii Masao