Re: [HACKERS] Ancient bug in formatting.c/to_char()
On 05/29/2014 04:59 AM, Peter Geoghegan wrote:
> Consider this example:
>
> [local]/postgres=# SELECT to_char(1e9::float8,'9D9');
>  to_char
> ---------
>  10.0
> (1 row)
>
> [local]/postgres=# SELECT to_char(1e20::float8,'9D9');
>  to_char
> ---------
>  1
> (1 row)
>
> [local]/postgres=# SELECT to_char(1e20,'9D9');
>  to_char
> ---------
>  1.0
> (1 row)
>
> There is access to uninitialized memory here. #define DEBUG_TO_FROM_CHAR
> shows this to be the case (garbage is printed to stdout). Valgrind brought
> this to my attention. I propose the attached patch as a fix for this issue.
>
> The direct cause appears to be that float8_to_char() feels entitled to
> truncate the Number description's post-digits, while that doesn't happen
> within numeric_to_char(), say. This is very old code, from the original
> to_char() patch back in 2000, and the interactions here are not obvious.
> I think it should be okay to not do this truncation, since the value of
> MAXDOUBLEWIDTH was greatly increased in 2007 as part of a round of fixes
> to bugs of similar vintage. There is still a defensive measure against
> over-sized input (we bail), but that ought to be unnecessary, strictly
> speaking.
>
> postgres=# select to_char('0.1'::float8, '9D' || repeat('9', 1000));
> ERROR:  double precision for character conversion is too wide

Yeah, the code is pretty crufty, I'm not sure what the proper fix would be. I wonder if psprintf would be useful.

- Heikki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 29 May 2014 01:07, Bruce Momjian wrote:
> On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote:
>> On 28 May 2014 15:34, Fujii Masao wrote:
>>
>> >> Also, compress_backup_block GUC needs to be merged with full_page_writes.
>> >
>> > Basically I agree with you because I don't want to add new GUC very
>> > similar to the existing one.
>> >
>> > But could you imagine the case where full_page_writes = off. Even in this
>> > case, FPW is forcibly written only during base backup. Such FPW also should
>> > be compressed? Which compression algorithm should be used? If we want to
>> > choose the algorithm for such FPW, we would not be able to merge those two
>> > GUCs. IMO it's OK to always use the best compression algorithm for such FPW
>> > and merge them, though.
>>
>> I'd prefer a new name altogether
>>
>> torn_page_protection = 'full_page_writes'
>> torn_page_protection = 'compressed_full_page_writes'
>> torn_page_protection = 'none'
>>
>> this allows us to add new techniques later like
>>
>> torn_page_protection = 'background_FPWs'
>>
>> or
>>
>> torn_page_protection = 'double_buffering'
>>
>> when/if we add those new techniques
>
> Uh, how would that work if you want to compress the background_FPWs?
> Use compressed_background_FPWs?

We currently have one technique for torn-page protection, will soon have two, and a third is on the horizon and likely to receive effort in the next release. It seems sensible to have just one parameter to describe the various techniques, as suggested. I'm suggesting that we plan for how things will look when we have the third one as well.

Alternate suggestions welcome.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Jsonb: jbvBinary usage in the convertJsonbValue?
Hi all,

I'm a little confused by the convertJsonbValue() function in jsonb_util.c. Maybe I misunderstood something, so I need help =)

>>> if (IsAJsonbScalar(val) || val->type == jbvBinary)
>>>     convertJsonbScalar(buffer, header, val);

As I can see, the convertJsonbScalar() function is used for scalar and binary jsonb values. But this function doesn't handle the jbvBinary type:

>>> switch (scalarVal->type)
>>>     case jbvNull:
>>>         ...
>>>     case jbvString:
>>>         ...
>>>     case jbvNumeric:
>>>         ...
>>>     case jbvBool:
>>>         ...
>>>     default:
>>>         elog(ERROR, "invalid jsonb scalar type");

Does this mean that the binary type is incorrect input for convertJsonbValue()? Or have I missed something?
Re: [HACKERS] Proposing pg_hibernate
On May 29, 2014 12:12 AM, "Amit Kapila" wrote:
>
> I agree with you that there are only a few corner cases where evicting
> shared buffers by this utility would harm, but was wondering if we could
> even save those, say if it would only use available free buffers. I think
> currently there is no such interface, and inventing a new interface for this
> case doesn't seem reasonable unless we see any other use case of
> such an interface.

+1
[HACKERS] json/jsonb inconsistence
# select '"\uaBcD"'::json;
   json
----------
 "\uaBcD"

but

# select '"\uaBcD"'::jsonb;
ERROR:  invalid input syntax for type json
LINE 1: select '"\uaBcD"'::jsonb;
               ^
DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
CONTEXT:  JSON data, line 1: ...

and

# select '"\uaBcD"'::json -> 0;
ERROR:  invalid input syntax for type json
DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
CONTEXT:  JSON data, line 1: ...
Time: 0,696 ms

Moreover, json itself looks strange:

# select '["\uaBcD"]'::json;
    json
------------
 ["\uaBcD"]

but

# select '["\uaBcD"]'::json->0;
ERROR:  invalid input syntax for type json
DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
CONTEXT:  JSON data, line 1: [...

Looks like random parse rules.

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] json/jsonb inconsistence - 2
postgres=# select '["\u0000"]'::json->0;
 ?column?
----------
 "\u0000"
(1 row)

Time: 1,294 ms
postgres=# select '["\u0000"]'::jsonb->0;
 ?column?
-----------
 "\\u0000"
(1 row)

It seems to me that escape_json() is wrongly used in jsonb_put_escaped_value(); the right name for escape_json() would be escape_to_json().

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
>Thanks for extending and revising the FPW-compress patch! Could you add
>your patch into next CF?

Sure. I will make improvements and add it to the next CF.

>Isn't it worth measuring the recovery performance for each compression
>algorithm?

Yes, I will post this soon.

On Wed, May 28, 2014 at 8:04 PM, Fujii Masao wrote:
> On Tue, May 27, 2014 at 12:57 PM, Rahila Syed wrote:
> > Hello All,
> >
> > 0001-CompressBackupBlock_snappy_lz4_pglz extends the patch on compression of
> > full page writes to include LZ4 and Snappy. Changes include making the
> > "compress_backup_block" GUC from boolean to enum. Value of the GUC can be
> > OFF, pglz, snappy or lz4, which can be used to turn off compression or set
> > the desired compression algorithm.
> >
> > 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It
> > uses Andres's patch for getting Makefiles working and has a few wrappers to
> > make the function calls to LZ4 and Snappy compression functions and handle
> > varlena datatypes.
> > Patch Courtesy: Pavan Deolasee
>
> Thanks for extending and revising the FPW-compress patch! Could you add
> your patch into next CF?
>
> > Also, compress_backup_block GUC needs to be merged with full_page_writes.
>
> Basically I agree with you because I don't want to add new GUC very similar to
> the existing one.
>
> But could you imagine the case where full_page_writes = off. Even in this case,
> FPW is forcibly written only during base backup. Such FPW also should be
> compressed? Which compression algorithm should be used? If we want to
> choose the algorithm for such FPW, we would not be able to merge those two
> GUCs. IMO it's OK to always use the best compression algorithm for such FPW
> and merge them, though.
>
> > Tests use JDBC runner TPC-C benchmark to measure the amount of WAL
> > compression, tps and response time in each of the scenarios viz.
> > Compression = OFF, pglz, LZ4, snappy, FPW=off
>
> Isn't it worth measuring the recovery performance for each compression
> algorithm?
>
> Regards,
>
> --
> Fujii Masao
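For illustration only, the settings described in the 0001 patch would presumably look something like the following in postgresql.conf; the GUC name and values are taken from the patch description above and exist only in that patch, not in any released server:

    full_page_writes = on
    compress_backup_block = lz4        # one of: off, pglz, snappy, lz4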
Re: [HACKERS] json/jsonb inconsistence
On 05/29/2014 07:55 AM, Teodor Sigaev wrote:
> # select '"\uaBcD"'::json;
>    json
> ----------
>  "\uaBcD"
>
> but
>
> # select '"\uaBcD"'::jsonb;
> ERROR:  invalid input syntax for type json
> LINE 1: select '"\uaBcD"'::jsonb;
>                ^
> DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
> CONTEXT:  JSON data, line 1: ...
>
> and
>
> # select '"\uaBcD"'::json -> 0;
> ERROR:  invalid input syntax for type json
> DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
> CONTEXT:  JSON data, line 1: ...
> Time: 0,696 ms
>
> Moreover, json itself looks strange:
>
> # select '["\uaBcD"]'::json;
>     json
> ------------
>  ["\uaBcD"]
>
> but
>
> # select '["\uaBcD"]'::json->0;
> ERROR:  invalid input syntax for type json
> DETAIL:  Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8.
> CONTEXT:  JSON data, line 1: [...
>
> Looks like random parse rules.

It is documented that for json we don't check the validity of unicode escapes until we try to use them. That's because the original json parser didn't check them, and if we started doing so now, users who pg_upgraded would find themselves with invalid data in the database.

The rules for jsonb are more strict, because we actually resolve the unicode escapes when constructing the jsonb object.

There is nothing at all random about it, although I agree it's slightly inconsistent. It's been the subject of some discussion on -hackers previously, IIRC. I actually referred to this difference in my talk at pgCon last Friday.

Frankly, if you want to use json/jsonb, you are going to be best served by using a UTF8-encoded database, or not using non-ascii unicode escapes in json at all.

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] backup_label revisited
So I ran into the case again where a system crashed while a hot backup was being taken. Postgres couldn't start up automatically because the backup_label was present. This has come up before e.g. http://www.postgresql.org/message-id/caazkufap1gxcojtyzch13rvevjevwro1vvfbysqtwgud9is...@mail.gmail.com but I believe no progress was made. I was trying to think if we could somehow identify if the backup_label was from a backup in progress or a restore in progress. Obvious choices like putting the server ip address in it are obviously not going to work for several reasons. However, at least on Linux wouldn't it be sufficient to put the inode number of the backup_label file in the backup_label? If it's still the same inode then it's just restarting, not a restore since afaik there's no way for tar or the like to recreate the file with the same inode on any filesystem. That would even protect against another restore on the same host. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
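A rough sketch of the check being proposed, expressed with stat(1); the label-file field name is hypothetical, and the server itself would presumably use stat(2) during startup rather than a shell:

    # at pg_start_backup time: record the inode of the label file inside it
    # (appending does not change the file's inode)
    echo "LABEL FILE INODE: $(stat -c %i "$PGDATA/backup_label")" >> "$PGDATA/backup_label"

    # at startup: compare the recorded inode against the file's current inode;
    # equal suggests the same original file (a crashed primary restarting),
    # different suggests the file was recreated, e.g. by untarring a base backup
    recorded=$(awk '/^LABEL FILE INODE:/ {print $4}' "$PGDATA/backup_label")
    current=$(stat -c %i "$PGDATA/backup_label")
    [ "$recorded" = "$current" ] && echo "probably a restart" || echo "probably a restore"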
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 5/27/14, 10:37 PM, Tom Lane wrote: > I'm not terribly happy about pushing such a change post-beta1 either, > but it's not like this isn't something we've known was needed. Anyway, > what's the worst case if we find a bug here? Tell people not to use > uuid-ossp? Mainly some more discussion time would have been nice. Also, while the old ossp-based uuid was broken in some cases, it had a well-defined workaround. The new code introduces a whole new dimension of potential portability issues. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 5/27/14, 10:37 PM, Tom Lane wrote: > If you don't like this change, we can revert it and also revert the upgrade > to 2.69. Nobody else appears to be concerned, but I would have preferred this option. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json/jsonb inconsistence - 2
On 05/29/2014 08:00 AM, Teodor Sigaev wrote:
> postgres=# select '["\u0000"]'::json->0;
>  ?column?
> ----------
>  "\u0000"
> (1 row)
>
> Time: 1,294 ms
> postgres=# select '["\u0000"]'::jsonb->0;
>  ?column?
> -----------
>  "\\u0000"
> (1 row)
>
> It seems to me that escape_json() is wrongly used in
> jsonb_put_escaped_value(); the right name for escape_json() would be
> escape_to_json().

That's a bug. I will look into it. I think we might need to special-case \u0000 on output, just as we do on input.

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json/jsonb inconsistence
> inconsistent. It's been the subject of some discussion on -hackers
> previously, IIRC. I actually referred to this difference in my talk at
> pgCon last Friday.

I see, and I wasn't at your talk :(

I'm playing around with jsquery and am now trying to implement UTF escapes there.

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 2014-05-29 08:14:48 -0400, Peter Eisentraut wrote: > On 5/27/14, 10:37 PM, Tom Lane wrote: > > If you don't like this change, we can revert it and also revert the upgrade > > to 2.69. > > Nobody else appears to be concerned, but I would have preferred this option. I am pretty concerned actually. But I don't see downgrading to an earlier autoconf as something really helpful. There already was a huge portability problem with the current code. Avoiding the autoconf update for a while wouldn't have solved it. And in 5 years time the amount of portability problems will be much larger. Yes, it'd have been nice if this were done a month+ ago. But nobody stepped up :(. Seems like the least bad choice :/ Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
On 5/28/14, 7:02 PM, Andres Freund wrote:
> That's a good idea. What i've been thinking about is to add
> -Wno-deprecated to the bison rule in the interim. Maybe after a
> configure test for the option. All deprecation warnings so far seem to
> be pretty unhelpful.

Here is a patch.

diff --git a/config/programs.m4 b/config/programs.m4
index fd3a9a4..5570e55 100644
--- a/config/programs.m4
+++ b/config/programs.m4
@@ -23,6 +23,10 @@ if test "$BISON"; then
 *** Bison version 1.875 or later is required, but this is $pgac_bison_version.])
     BISON=""
   fi
+  if echo "$pgac_bison_version" | $AWK '{ if ([$]4 >= 3) exit 0; else exit 1;}'
+  then
+    BISONFLAGS="$BISONFLAGS -Wno-deprecated"
+  fi
 fi
 
 if test -z "$BISON"; then
diff --git a/configure b/configure
index 3663e50..5499d56 100755
--- a/configure
+++ b/configure
@@ -7075,6 +7075,10 @@
 $as_echo "$as_me: WARNING:
 *** Bison version 1.875 or later is required, but this is $pgac_bison_version." >&2;}
     BISON=""
   fi
+  if echo "$pgac_bison_version" | $AWK '{ if ($4 >= 3) exit 0; else exit 1;}'
+  then
+    BISONFLAGS="$BISONFLAGS -Wno-deprecated"
+  fi
 fi

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
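For context, the awk field $4 in the test above corresponds to the version number in bison's banner line, which configure already captures in pgac_bison_version; for example (banner format assumed from GNU Bison, exact version string varies):

    $ bison --version | sed q
    bison (GNU Bison) 3.0.2
    $ echo "bison (GNU Bison) 3.0.2" | awk '{ if ($4 >= 3) print "would add -Wno-deprecated"; }'
    would add -Wno-deprecated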
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 05/29/2014 08:21 AM, Andres Freund wrote:
> On 2014-05-29 08:14:48 -0400, Peter Eisentraut wrote:
>> On 5/27/14, 10:37 PM, Tom Lane wrote:
>>> If you don't like this change, we can revert it and also revert the
>>> upgrade to 2.69.
>>
>> Nobody else appears to be concerned, but I would have preferred this option.
>
> I am pretty concerned actually. But I don't see downgrading to an
> earlier autoconf as something really helpful. There already was a huge
> portability problem with the current code. Avoiding the autoconf update
> for a while wouldn't have solved it. And in 5 years time the amount of
> portability problems will be much larger.
>
> Yes, it'd have been nice if this were done a month+ ago. But nobody
> stepped up :(. Seems like the least bad choice :/

The most worrying thing is that we didn't find the occasioning problem when we switched to autoconf 2.69 back in December, so we end up dealing with bad choices now.

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 5/29/14, 8:21 AM, Andres Freund wrote: > But I don't see downgrading to an > earlier autoconf as something really helpful. Well, we could have just hacked up that particular header check to do what we want. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On 2014-05-29 08:49:38 -0400, Peter Eisentraut wrote: > On 5/29/14, 8:21 AM, Andres Freund wrote: > > But I don't see downgrading to an > > earlier autoconf as something really helpful. > > Well, we could have just hacked up that particular header check to do > what we want. Still wouldn't have solved that ossp already didn't work on several platforms at all anymore and is likely to work on even fewer in 5 years. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
Thanks for looking at it!

> Date: Thu, 29 May 2014 00:19:33 +0300
> From: hlinnakan...@vmware.com
> To: johnlu...@hotmail.com; pgsql-hackers@postgresql.org
> CC: klaussfre...@gmail.com
> Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
>
> On 05/28/2014 11:52 PM, John Lumby wrote:
>
> The patch seems to assume that you can put the aiocb struct in shared
> memory, initiate an asynchronous I/O request from one process, and wait
> for its completion from another process. I'm pretty surprised if that
> works on any platform.

It works on linux. Actually this ability allows the asyncio implementation to reduce complexity in one respect (yes, I know it looks complex enough): it makes waiting for completion of an in-progress IO simpler than in the existing synchronous IO case, since librt takes care of the waiting. Specifically, there is no need for extra wait-for-io control blocks such as in bufmgr's WaitIO(). This also plays very nicely with the syncscan case where the cohorts run close together.

If anyone would like to advise whether this also works on MacOS/BSD or Windows, that would be good, as I can't verify it myself.

> How portable is POSIX aio nowadays? Googling around, it still seems that
> on Linux, it's implemented using threads. Does the thread-emulation
> implementation cause problems with the rest of the backend, which
> assumes that there is only a single thread? In any case, I think we'll

Good question. I am not aware of any dependency on a backend having only a single thread. Can you please point me to such dependencies?

> want to encapsulate the AIO implementation behind some kind of an API,
> to allow other implementations to co-exist.

It is already -- it follows the smgr (storage manager) -> md (magnetic disk) -> fd (filesystem) layering used for the existing filesystem ops, including posix_fadvise.

> Benchmarks?

I will see if I can package mine up somehow. Would be great if anyone else would like to benchmark it on theirs ...

> - Heikki
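For readers unfamiliar with the librt calls under discussion, here is a minimal single-process sketch of the POSIX AIO pattern involved (this is not code from the patch; the file name and buffer size are arbitrary, and the contentious question in this thread is precisely whether the aiocb can instead live in shared memory and be waited on from a different process):

/* Minimal single-process POSIX AIO sketch -- compile with -lrt on Linux. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	static char		buf[8192];
	struct aiocb	cb;
	const struct aiocb *list[1];
	int				fd = open("/etc/hosts", O_RDONLY);	/* arbitrary test file */

	if (fd < 0)
		return 1;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = sizeof(buf);
	cb.aio_offset = 0;

	if (aio_read(&cb) != 0)			/* initiate the read, analogous to a prefetch */
		return 1;

	/* ... other useful work could happen here while the read is in flight ... */

	list[0] = &cb;
	while (aio_error(&cb) == EINPROGRESS)
		aio_suspend(list, 1, NULL);	/* block until the request completes */

	if (aio_error(&cb) == 0)
		printf("read %zd bytes\n", aio_return(&cb));
	return 0;
}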
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
I have pasted below the EXPLAIN of one of my benchmark queries (the one I reference in the README). Plenty of nested loop joins. However I think I understand your question as to how effective it would be if the outer is not sorted, and I will see if I can dig into that if I get time (and it sounds as though Claudio is on it too). The area of exactly what the best prefetch strategy should be for each particular type of scan and context is a good one to work on. John > Date: Wed, 28 May 2014 18:12:23 -0700 > Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal > and patch > From: p...@heroku.com > To: klaussfre...@gmail.com > CC: johnlu...@hotmail.com; pgsql-hackers@postgresql.org > > On Wed, May 28, 2014 at 5:59 PM, Claudio Freire > wrote: > > For nestloop, correct me if I'm wrong, but index scan nodes don't have > > visibility of the next tuple to be searched for. > > Nested loop joins are considered a particularly compelling case for > prefetching, actually. > > -- > Peter Geoghegan - QUERY PLAN Limit (cost=801294.81..801294.81 rows=2 width=532) CTE deploy_zone_down -> Recursive Union (cost=1061.25..2687.40 rows=11 width=573) -> Nested Loop (cost=1061.25..1423.74 rows=1 width=41) -> Nested Loop (cost=1061.11..1421.22 rows=14 width=49) -> Bitmap Heap Scan on entity zone_tree (cost=1060.67..1175.80 rows=29 width=40) Recheck Cond: ((name >= 'testZone-4375'::text) AND (name <= 'testZone-5499'::text) AND ((discriminator)::text = 'ZONE'::text)) -> BitmapAnd (cost=1060.67..1060.67 rows=29 width=0) -> Bitmap Index Scan on entity_name (cost=0.00..139.71 rows=5927 width=0) Index Cond: ((name >= 'testZone-4375'::text) AND (name <= 'testZone-5499'::text)) -> Bitmap Index Scan on entity_discriminatorx (cost=0.00..920.70 rows=49636 width=0) Index Cond: ((discriminator)::text = 'ZONE'::text) -> Index Scan using metadata_value_owner_id on metadata_value mv (cost=0.43..8.45 rows=1 width=17) Index Cond: (owner_id = zone_tree.id) -> Index Scan using metadata_field_pkey on metadata_field mf (cost=0.15..0.17 rows=1 width=8) Index Cond: (id = mv.field_id) Filter: ((name)::text = 'deployable'::text) -> Nested Loop (cost=0.87..126.34 rows=1 width=573) -> Nested Loop (cost=0.72..125.44 rows=5 width=581) -> Nested Loop (cost=0.29..83.42 rows=10 width=572) -> WorkTable Scan on deploy_zone_down dzd (cost=0.00..0.20 rows=10 width=540) -> Index Scan using entity_discriminator_parent_zone on entity ch (cost=0.29..8.31 rows=1 width=40) Index Cond: ((parent_id = dzd.zone_tree_id) AND ((discriminator)::text = 'ZONE'::text)) -> Index Scan using metadata_value_owner_id on metadata_value mv_1 (cost=0.43..4.19 rows=1 width=17) Index Cond: (owner_id = ch.id) -> Index Scan using metadata_field_pkey on metadata_field mf_1 (cost=0.15..0.17 rows=1 width=8) Index Cond: (id = mv_1.field_id) Filter: ((name)::text = 'deployable'::text) CTE deploy_zone_tree -> Recursive Union (cost=0.00..9336.89 rows=21 width=1105) -> CTE Scan on deploy_zone_down dzd_1 (cost=0.00..0.22 rows=11 width=1105) -> Nested Loop (cost=0.43..933.63 rows=1 width=594) -> WorkTable Scan on deploy_zone_tree dzt (cost=0.00..2.20 rows=110 width=581) -> Index Scan using entity_pkey on entity pt (cost=0.43..8.46 rows=1 width=21) Index Cond: (id = dzt.zone_tree_ancestor_parent_id) Filter: ((discriminator)::text = ANY ('{VIEW,ZONE}'::text[])) CTE forward_host_ip -> Nested Loop (cost=1.30..149.65 rows=24 width=88) -> Nested Loop (cost=0.87..121.69 rows=48 width=56) -> Nested Loop
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
On 2014-05-29 08:33:05 -0400, Peter Eisentraut wrote: > On 5/28/14, 7:02 PM, Andres Freund wrote: > > That's a good idea. What i've been thinking about is to add > > -Wno-deprecated to the bison rule in the interim. Maybe after a > > configure test for the option. All deprecation warnings so far seem to > > be pretty unhelpful. > > Here is a patch. FWIW, I vote for applying something like it. It seems better to collect the -Wno-deprecated in one place (i.e. configure) instead of having it in every developer's Makefile.custom. The latter will be harder to get rid of. I'd add a comment about why it's been added though. I won't remember why at least... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On 28/05/2014 22:12, "Peter Geoghegan" wrote:
>
> On Wed, May 28, 2014 at 5:59 PM, Claudio Freire wrote:
> > For nestloop, correct me if I'm wrong, but index scan nodes don't have
> > visibility of the next tuple to be searched for.
>
> Nested loop joins are considered a particularly compelling case for
> prefetching, actually.

Of course. I only doubt that it can be implemented without fairly large changes to all the execution nodes. I'll look into it.

> --
> Peter Geoghegan
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
Andres Freund writes: > On 2014-05-29 08:49:38 -0400, Peter Eisentraut wrote: >> Well, we could have just hacked up that particular header check to do >> what we want. > Still wouldn't have solved that ossp already didn't work on several > platforms at all anymore and is likely to work on even fewer in 5 years. Yeah. The problem is not with the header check, it's with the fact that OSSP UUID is basically broken on several modern platforms.[1] We were going to have to do something about that pretty soon anyway. I agree that this isn't the most ideal time in the dev cycle to do something about it, but fixing portability issues is one of the expected beta-time activities, no? That's really what this is. regards, tom lane [1] Quite aside from compilation problems, yesterday's testing results suggest that it can't read the system MAC address at all on Debian: http://www.postgresql.org/message-id/29063.1401333...@sss.pgh.pa.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
Andres Freund writes: > FWIW, I vote for applying something like it. It seems better to collect > the -Wno-deprecated in one place (i.e. configure) instead of having it > in every developer's Makefile.custom. The latter will be harder to get > rid of. Yeah, that's a good point. > I'd add a comment about why it's been added though. I won't remember why > at least... +1 regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
Andrew Dunstan writes: > On 05/29/2014 08:21 AM, Andres Freund wrote: >> Yes, it'd have been nice if this were done a month+ ago. But nobody >> stepped up :(. Seems like the least bad choice :/ > The most worrying thing is that we didn't find the occasioning problem > when we switched to autoconf 2.69 back in December, so we end up dealing > with bad choices now. I think the main reason for that is that none of the buildfarm animals were trying to build contrib/uuid-ossp on machines where OSSP UUID is shaky. Some of our packagers do so, though, and so once the beta got into their hands we found out about the problem. Short of adding "try to package according to Debian, Red Hat, etc packaging recipes" to the buildfarm requirements, there doesn't seem to be much that we could do to find out about such issues earlier. (And no, I'm not proposing that as a good idea. As an ex-packager, I know that those recipes are moving targets because of constantly changing external requirements. We don't want to take over that maintenance.) So IMO we just have to expect that beta is going to turn up portability issues, and we're going to have to do what it takes to resolve them, even if it's nontrivial. The good news on this front is that we have substantially more buildfarm coverage of contrib/uuid-ossp than we had two days ago. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST bug.
> I don't think that checkAllTheSame is broken, but it probably would be
> after this patch --- ISTM you're breaking the corner case described in the
> header comment for checkAllTheSame, where we need to force splitting of a
> set of existing tuples that the opclass can't distinguish.

IMHO, this patch will not break it - because it forces (in the corner case) the creation of a node for the new tuple but excludes it from the list to insert. On the next iteration we will find the needed node with an empty list.

Comment:
 * If we know that the leaf tuples wouldn't all fit on one page, then we
 * exclude the last tuple (which is the incoming new tuple that forced a split)
 * from the check to see if more than one node is used.  The reason for this
 * is that if the existing tuples are put into only one chain, then even if
 * we move them all to an empty page, there would still not be room for the
 * new tuple, so we'd get into an infinite loop of picksplit attempts.

If we reserve a node for the new tuple, then on the next iteration we will not fall into picksplit again - we will have an empty list. So the new tuple will fall onto another page.

BTW, look at the following example:

create table xxx (p point);
insert into xxx select '(1,1)'::point from generate_series(1, 226) as v;
insert into xxx values ('(3,3)'::point);
create index xxxidx on xxx using spgist (p quad_point_ops);

It is easy to see that the single node is allTheSame although the underlying leafs are different. Due to the allTheSame search logic, scans are correct but inefficient. And the patch fixes that too.

> What actually is broken, I think, is the spgist text opclass, because it's
> using a zero node label for two different situations. The scenario we're
> looking at here is:

Agree about the different situations, but disagree that it's broken :)

> It seems like we have to
> distinguish the case of zero-length string from that of a dummy label
> for a pushed-down allTheSame tuple.

Actually, we do this: if allTheSame is true then nodes have dummy labels.

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json/jsonb inconsistence - 2
On 05/29/2014 08:15 AM, Andrew Dunstan wrote:
> On 05/29/2014 08:00 AM, Teodor Sigaev wrote:
>> postgres=# select '["\u0000"]'::json->0;
>>  ?column?
>> ----------
>>  "\u0000"
>> (1 row)
>>
>> Time: 1,294 ms
>> postgres=# select '["\u0000"]'::jsonb->0;
>>  ?column?
>> -----------
>>  "\\u0000"
>> (1 row)
>>
>> It seems to me that escape_json() is wrongly used in
>> jsonb_put_escaped_value(); the right name for escape_json() would be
>> escape_to_json().
>
> That's a bug. I will look into it. I think we might need to special-case
> \u0000 on output, just as we do on input.

Actually, this is just the tip of the iceberg. Here's what 9.3 does:

andrew=# select array_to_json(array['a','\u0000','b']::text[]);
    array_to_json
---------------------
 ["a","\\u0000","b"]

I'm now wondering if we should pass through any unicode escape (presumably validated to some extent). I guess we can't change this in 9.2/9.3 because it would be a behaviour change.

These unicode escapes have given us more trouble than any other part of the JSON spec :-(

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ancient bug in formatting.c/to_char()
On Thu, May 29, 2014 at 3:09 AM, Heikki Linnakangas wrote: > Yeah, the code is pretty crufty, I'm not sure what the proper fix would be. > I wonder if psprintf would be useful. I don't know that it's worth worrying about the case you highlight. It is certainly worth worrying about the lack of a similar fix in float4_to_char(), though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
On Thu, May 29, 2014 at 08:33:05AM -0400, Peter Eisentraut wrote: > On 5/28/14, 7:02 PM, Andres Freund wrote: > > That's a good idea. What i've been thinking about is to add > > -Wno-deprecated to the bison rule in the interim. Maybe after a > > configure test for the option. All deprecation warnings so far seem to > > be pretty unhelpful. > > Here is a patch. > Does this need a comment indicating why it's needed and when it can be removed? Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
[HACKERS] recovery testing for beta
What features in 9.4 need more beta testing for recovery? I've applied my partial-write testing harness to several scenarios in 9.4. So far its found a recovery bug for gin indexes, a recovery bug for btree, a vacuum bug for btree indexes (with foreign keys, but that is not relevant to the bug), and nothing of interest for gist index, although it only tested "where text_array @@ to_tsquery(?)" queries. It also implicitly tested the xlog parallel write slots thing, as that is common code to all recovery. I also applied the foreign key test retroactively to 9.3, and it quickly re-found the multixact bugs up until commit 9a57858f1103b89a5674. The test was designed only with the knowledge that the bugs involved foreign keys and the consumption of multixacts. I had no deeper knowledge of the details of those bugs when designing the test, so I have a reasonable amount of confidence that this could have found them in real time had I bothered to try to test the feature during the previous beta cycle. So, what else in 9.4 needs testing for recovery from crashes in general or partial-writes in particular? One thing is that I want to find a way to drive multixact in fast forward, so that the freezing cycle gets a good workout. Currently I can't consume enough of them to make them wrap around within the time frame of a test. It looks like jsonb stuff only makes new operators for use by existing indexes types, so probably is not a high risk for recovery bugs. I will probably try to test it anyway as a way to become more familiar with the feature. I don't really know about the logical streaming stuff. These are the recent threads on hackers. The first one has a link to the harness variant which is set up for the foreign key testing. "9.4 btree index corruption" "9.4 checksum error in recovery with btree index" "9.4 checksum errors in recovery with gin index" Cheers, Jeff
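For anyone wanting to try something similar, the overall shape of such a test is roughly the following; this is only a sketch of the idea, not Jeff's harness, which additionally simulates partial/torn page writes and runs application-level consistency checks against the surviving data:

    pg_ctl -D "$PGDATA" -w start
    pgbench -n -c 8 -T 60 testdb &                 # any write-heavy workload
    sleep 30
    kill -9 $(head -1 "$PGDATA/postmaster.pid")    # hard crash mid-workload
    wait
    pg_ctl -D "$PGDATA" -w start                   # crash recovery replays WAL here
    psql -d testdb -c "select count(*) from pgbench_accounts"   # then verify invariants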
Re: [HACKERS] json/jsonb inconsistence - 2
> Actually, this is just the tip of the iceberg.

Funny, but:

postgres=# select '\u0000'::text;
  text
--------
 \u0000
(1 row)

postgres=# select array['\u0000']::text[];
    array
-------------
 {"\\u0000"}

postgres=# select '{"\u0000"}'::text[];
  text
---------
 {u0000}

postgres=# select '{"\\u0000"}'::text[];
    text
-------------
 {"\\u0000"}

&%) I'm crazy about that. I suppose we shouldn't worry about the array type; just make the output the same for json/jsonb.

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd uuid-ossp behavior on smew and shearwater
On Wed, May 28, 2014 at 11:23 PM, Tom Lane wrote: > Buildfarm critters smew and shearwater are reporting regression test > failures that suggest that the UUID library can't get a system MAC > address: > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=smew&dt=2014-05-28%2023%3A38%3A28 > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=shearwater&dt=2014-05-29%2000%3A24%3A32 > > I've just committed regression test adjustments to prevent that from > being a failure case, but I am confused about why it's happening. > I wouldn't be surprised at not getting a MAC address on a machine that > lacks any internet connection, but that surely can't describe the > buildfarm environment. Are you curious enough to poke into it and > see what's going on? It might be useful to strace a backend that's > trying to execute uuid_generate_v1() and see what the kernel interaction > looks like exactly. Here's the result of attaching strace to an idle backend, then running SELECT uuid_generate_v1(). AFAIR shearwater is a cheaply-hosted OpenVZ VPS under the hood. Josh josh@ease1:~$ strace -p 6818 Process 6818 attached - interrupt to quit recv(10, "Q\0\0\0\37SELECT uuid_generate_v1();\0", 8192, 0) = 32 gettimeofday({1401383296, 920282}, NULL) = 0 gettimeofday({1401383296, 920313}, NULL) = 0 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xae80 stat64("/home/josh/runtime/lib/postgresql/uuid-ossp", 0xbfdf87d0) = -1 ENOENT (No such file or directory) stat64("/home/josh/runtime/lib/postgresql/uuid-ossp.so", {st_mode=S_IFREG|0755, st_size=46685, ...}) = 0 stat64("/home/josh/runtime/lib/postgresql/uuid-ossp.so", {st_mode=S_IFREG|0755, st_size=46685, ...}) = 0 open("/home/josh/runtime/lib/postgresql/uuid-ossp.so", O_RDONLY) = 7 read(7, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p\f\0\0004\0\0\0"..., 512) = 512 fstat64(7, {st_mode=S_IFREG|0755, st_size=46685, ...}) = 0 mmap2(NULL, 16796, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 7, 0) = 0xae7fb000 mmap2(0xae7ff000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 7, 0x3) = 0xae7ff000 close(7)= 0 open("/home/josh/runtime/lib/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/i686/sse2/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/i686/sse2/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/i686/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/i686/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/sse2/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/sse2/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("tls/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("i686/sse2/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("i686/sse2/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("i686/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("i686/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("sse2/cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("sse2/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("cmov/libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("libuuid.so.1", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/josh/runtime/lib/libuuid.so.1", O_RDONLY) 
= -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 7 fstat64(7, {st_mode=S_IFREG|0644, st_size=30628, ...}) = 0 mmap2(NULL, 30628, PROT_READ, MAP_PRIVATE, 7, 0) = 0xae7f3000 close(7)= 0 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) open("/lib/i386-linux-gnu/libuuid.so.1", O_RDONLY) = 7 read(7, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260\22\0\0004\0\0\0"..., 512) = 512 fstat64(7, {st_mode=S_IFREG|0644, st_size=18000, ...}) = 0 mmap2(NULL, 20712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 7, 0) = 0xae7ed000 mmap2(0xae7f1000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 7, 0x3) = 0xae7f1000 close(7)= 0 mprotect(0xae7f1000, 4096, PROT_READ) = 0 munmap(0xae7f3000, 30628) = 0 socket(PF_FILE, SOCK_STREAM, 0) = 7 connect(7, {sa_family=AF_FILE, path="/var/run/uuidd/request"}, 110) = -1 ENOENT (No such file or directory) access("/usr/sbin/uuidd", X_OK) = -1 ENOENT (No such file or directory) close(7)= 0 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 7 ioctl(7, SIOCGIFCONF, {96, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"venet0", {AF_INET, inet_addr("127.0.0.2")}}, {"venet0:0", {AF_INET, inet_addr("198.204.250.34")) = 0 i
Re: [HACKERS] Compression of full-page-writes
On Thu, May 29, 2014 at 11:21:44AM +0100, Simon Riggs wrote: > > Uh, how would that work if you want to compress the background_FPWs? > > Use compressed_background_FPWs? > > We've currently got 1 technique for torn page protection, soon to have > 2 and with a 3rd on the horizon and likely to receive effort in next > release. > > It seems sensible to have just one parameter to describe the various > techniques, as suggested. I'm suggesting that we plan for how things > will look when we have the 3rd one as well. > > Alternate suggestions welcome. I was just pointing out that we might need compression to be a separate boolean variable from the type of page tear protection. I know I am usually anti-adding-variables, but in this case it seems trying to have one variable control several things will lead to confusion. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On Thu, May 29, 2014 at 02:21:57PM +0200, Andres Freund wrote: > On 2014-05-29 08:14:48 -0400, Peter Eisentraut wrote: > > On 5/27/14, 10:37 PM, Tom Lane wrote: > > > If you don't like this change, we can revert it and also revert the > > > upgrade to 2.69. > > > > Nobody else appears to be concerned, but I would have preferred this option. > > I am pretty concerned actually. But I don't see downgrading to an > earlier autoconf as something really helpful. There already was a huge > portability problem with the current code. Avoiding the autoconf update > for a while wouldn't have solved it. And in 5 years time the amount of > portability problems will be much larger. > Yes, it'd have been nice if this were done a month+ ago. But nobody > stepped up :(. Seems like the least bad choice :/ One thing that concerns me is that we already had the problem that users creating the uuid-ossp extension had to double-quote the name because of the dash, and we have regularly questioned the viability of the uuid-ossp codebase. Now that we know we have to support alternatives, we are changing the build API to support those alternatives, but doing nothing to decouple the extension name from uuid-ossp and the dash issue. Seems this would be the logical time to just break compatibility and get a sane API for UUID generation. We could hack pg_dump --binary-upgrade to map a dump of "uuid-ossp" to some other name, or hack the backend to accept "uuid-ossp" and map that to something more sane. I basically feel this change is making our bad API even more confusing. I think the only good thing about this change is that the _user_ API, as odd as it is, remains the same. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] recovery testing for beta
On Thu, May 29, 2014 at 09:39:56AM -0700, Jeff Janes wrote: > What features in 9.4 need more beta testing for recovery? > > I've applied my partial-write testing harness to several scenarios in 9.4. So > far its found a recovery bug for gin indexes, a recovery bug for btree, a > vacuum bug for btree indexes (with foreign keys, but that is not relevant to > the bug), and nothing of interest for gist index, although it only tested > "where text_array @@ to_tsquery(?)" queries. > > It also implicitly tested the xlog parallel write slots thing, as that is > common code to all recovery. > > I also applied the foreign key test retroactively to 9.3, and it quickly > re-found the multixact bugs up until commit 9a57858f1103b89a5674. The test > was > designed only with the knowledge that the bugs involved foreign keys and the > consumption of multixacts. I had no deeper knowledge of the details of those > bugs when designing the test, so I have a reasonable amount of confidence that > this could have found them in real time had I bothered to try to test the > feature during the previous beta cycle. Wow, that is impressive! We are looking for a ways to find bugs like the ones the appeared in 9.3.X, and it seems you might have found a way. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
Bruce Momjian writes: > One thing that concerns me is that we already had the problem that users > creating the uuid-ossp extension had to double-quote the name because of > the dash, and we have regularly questioned the viability of the > uuid-ossp codebase. > Now that we know we have to support alternatives, we are changing the > build API to support those alternatives, but doing nothing to decouple > the extension name from uuid-ossp and the dash issue. Well, if you've got a proposal for how to rename the extension without creating major compatibility problems, let's hear it. > Seems this would be the logical time to just break compatibility and get > a sane API for UUID generation. Most people think the "sane API" would be to put the functionality in core, and forget about any extension name at all. The compatibility problems with that approach aren't exactly trivial either, but I suspect that's where we'll end up in the long run. So I'm not that excited about kluge solutions for renaming the extension. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd uuid-ossp behavior on smew and shearwater
Josh Kupershmidt writes: > On Wed, May 28, 2014 at 11:23 PM, Tom Lane wrote: >> I've just committed regression test adjustments to prevent that from >> being a failure case, but I am confused about why it's happening. >> I wouldn't be surprised at not getting a MAC address on a machine that >> lacks any internet connection, but that surely can't describe the >> buildfarm environment. Are you curious enough to poke into it and >> see what's going on? It might be useful to strace a backend that's >> trying to execute uuid_generate_v1() and see what the kernel interaction >> looks like exactly. > Here's the result of attaching strace to an idle backend, then running > SELECT uuid_generate_v1(). AFAIR shearwater is a cheaply-hosted OpenVZ > VPS under the hood. Interesting. Looks like you have access only to virtual network interfaces, and they report all-zero MAC addresses, which the UUID library is smart enough to ignore. If smew is also in a virtual environment then that's probably the explanation. (There are some other buildfarm critters that are reporting MAC addresses with the local-admin bit set, which I suspect also means they've got virtual network interfaces, but with a different treatment of the what-to-report problem.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD
On Thu, May 29, 2014 at 01:56:22PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > One thing that concerns me is that we already had the problem that users > > creating the uuid-ossp extension had to double-quote the name because of > > the dash, and we have regularly questioned the viability of the > > uuid-ossp codebase. > > > Now that we know we have to support alternatives, we are changing the > > build API to support those alternatives, but doing nothing to decouple > > the extension name from uuid-ossp and the dash issue. > > Well, if you've got a proposal for how to rename the extension without > creating major compatibility problems, let's hear it. Well, the only two I could think of were the pg_dump and backend mapping changes; pretty hacky. > > Seems this would be the logical time to just break compatibility and get > > a sane API for UUID generation. > > Most people think the "sane API" would be to put the functionality in > core, and forget about any extension name at all. The compatibility > problems with that approach aren't exactly trivial either, but I suspect > that's where we'll end up in the long run. So I'm not that excited about > kluge solutions for renaming the extension. OK. I was just worried that users are now using a badly-named uuid-ossp extension that isn't even using uuid-ossp in many cases. If we have a long-term plan to fix this, then you are right that it isn't worth worrying about it now. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] recovery testing for beta
Jeff Janes wrote:
> One thing is that I want to find a way to drive multixact in fast forward,
> so that the freezing cycle gets a good workout. Currently I can't consume
> enough of them to make them wrap around within the time frame of a test.

IIRC I lobotomized it by removing the XLogInsert() call. That allows you to generate large amounts of multixacts quickly. On my laptop this was able to do four or five wraparound cycles in two-three hours or so, using the "burn multixact" utility here:
http://www.postgresql.org/message-id/20131231035913.gu22...@eldon.alvh.no-ip.org

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: Typo fixes in Solution.pm, part of MSVC scripts
On Sun, May 25, 2014 at 11:34 PM, Michael Paquier wrote: > I noticed a couple of typos in Solution.pm, fixed by the patch > attached. Those things have been introduced by commits cec8394 (visual > 2013 support) and 0a78320 (latest pgident run, not actually a problem > of pgident, because of a misuse of commas). > The tab problems should not be backpatched to 9.3, but the use of > commas instead of semicolons should be. I suspect it would be good, then, to separate this into two patches, one for back-patch and one not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] recovery testing for beta
On 05/29/2014 09:39 AM, Jeff Janes wrote: > > I've applied my partial-write testing harness to several scenarios in > 9.4. So far its found a recovery bug for gin indexes, a recovery bug > for btree, a vacuum bug for btree indexes (with foreign keys, but that > is not relevant to the bug), and nothing of interest for gist index, > although it only tested "where text_array @@ to_tsquery(?)" queries. This is awesome. Thanks for doing it! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST bug.
Teodor Sigaev writes: >> I don't think that checkAllTheSame is broken, but it probably would be >> after this patch --- ISTM you're breaking the corner case described in the >> header comment for checkAllTheSame, where we need to force splitting of a >> set of existing tuples that the opclass can't distinguish. > INHO, this patch will not break - because it forces (in corner case) to > create a > node for new tuple but exclude it from list to insert. On next iteration we > will > find a needed node with empty list. > Comment: > * If we know that the leaf tuples wouldn't all fit on one page, then we > * exclude the last tuple (which is the incoming new tuple that forced a > split) > * from the check to see if more than one node is used. The reason for this > * is that if the existing tuples are put into only one chain, then even if > * we move them all to an empty page, there would still not be room for the > * new tuple, so we'd get into an infinite loop of picksplit attempts. > If we reserve a node for new tuple then on next iteration we will not fall > into > picksplit again - we will have an empty list. So new tuple will fall into > another page. The point I'm making is that the scenario your test case exposes is not an infinite loop of picksplits, but an infinite loop of choose calls. There are certainly ways to get into that loop that don't involve this specific checkAllTheSame logic. Here's an example: regression=# create table xxx ( t text); CREATE TABLE regression=# create index xxxidx on xxx using spgist (t); CREATE INDEX regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',3996); In this example, we get a picksplit at the third insert, and here we *must* take the allTheSame path because the values are, in fact, all the same (the SPGIST_MAX_PREFIX_LENGTH constraint means we can only put 3996 characters into the prefix, and then the 3997'th character of each value is 'x'). Then the fourth insert triggers this same infinite loop, because spg_text_choose is asked how to insert the slightly-shorter value into the allTheSame tuple, and it does the wrong thing. It's certainly possible that we could/should change what checkAllTheSame is doing --- on re-reading the comment, I'm not really sure that the scenario it's concerned about can happen. However, that will not fix the bug in spgtextproc.c; it will only block one avenue for reaching the bug. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] recovery testing for beta
Alvaro Herrera writes: > Jeff Janes wrote: >> One thing is that I want to find a way to drive multixact in fast forward, >> so that the freezing cycle gets a good workout. Currently I can't consume >> enough of them to make them wrap around within the time frame of a test. > IIRC I lobotomized it up by removing the XLogInsert() call. That allows > you to generate large amounts of multixacts quickly. In my laptop this > was able to do four or five wraparound cycles in two-three hours or so, > using the "burn multixact" utility here: > http://www.postgresql.org/message-id/20131231035913.gu22...@eldon.alvh.no-ip.org Another possibility is to use pg_resetxlog to manually advance the multixact counter to a point near wraparound. I think you have to manually create appropriate slru segment files as well when doing that (someday we should hack pg_resetxlog to do that for you). Still, it might beat waiting hours to burn multixacts one by one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
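For reference, that would look something like the following, run with the server stopped; the -m syntax (next and oldest multixact ID) is per the 9.3/9.4 pg_resetxlog documentation, and the values shown are arbitrary ones chosen to sit near the 2^32 wraparound point:

    pg_ctl -D "$PGDATA" stop
    pg_resetxlog -m 4294900000,4294000000 "$PGDATA"
    # as noted above, the matching SLRU segment files under pg_multixact/
    # currently have to be created by hand before the server will start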
Re: [HACKERS] Odd uuid-ossp behavior on smew and shearwater
On 05/29/2014 02:38 PM, Tom Lane wrote: Josh Kupershmidt writes: On Wed, May 28, 2014 at 11:23 PM, Tom Lane wrote: I've just committed regression test adjustments to prevent that from being a failure case, but I am confused about why it's happening. I wouldn't be surprised at not getting a MAC address on a machine that lacks any internet connection, but that surely can't describe the buildfarm environment. Are you curious enough to poke into it and see what's going on? It might be useful to strace a backend that's trying to execute uuid_generate_v1() and see what the kernel interaction looks like exactly. Here's the result of attaching strace to an idle backend, then running SELECT uuid_generate_v1(). AFAIR shearwater is a cheaply-hosted OpenVZ VPS under the hood. Interesting. Looks like you have access only to virtual network interfaces, and they report all-zero MAC addresses, which the UUID library is smart enough to ignore. If smew is also in a virtual environment then that's probably the explanation. (There are some other buildfarm critters that are reporting MAC addresses with the local-admin bit set, which I suspect also means they've got virtual network interfaces, but with a different treatment of the what-to-report problem.) Almost all my critters run in VMs (all but jacana and bowerbird). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans for GIST
On Sun, May 25, 2014 at 6:12 AM, Anastasia Lubennikova wrote: > Hi, hackers! > There are first results of my work on GSoC project "Index-only scans for > GIST". Cool. > 1. Version of my project code is in forked repository > https://github.com/lubennikovaav/postgres/tree/indexonlygist2 > Patch is in attachments > - This version is only for one-column indexes That's probably a limitation that needs to be fixed before this can be committed. > - fetch() method is realized only for box opclass (because it's trivial) That might not need to be fixed before this can be committed. > 2. I test Index-only scans with SQL script box_test.sql > and it works really faster. (results in box_test_out) > > I'll be glad to get your feedback about this feature. Since this is a GSoC project, it would be nice if one of the people who is knowledgeable about GIST (Heikki, Alexander, etc.) could weigh in on this before too much time goes by, so that Anastasia can press forward with this work. I don't know enough to offer too many substantive comments, but I think you should remove all of the //-style comments (most of which are debugging leftovers) and add some more comments describing what you're actually doing and, more importantly, why. This comment doesn't appear to make sense: + /* +* The offset number on tuples on internal pages is unused. For historical +* reasons, it is set 0x. +*/ The reason this doesn't make sense is because the tuple in question is not on an internal page, or indeed any page at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
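For what it's worth, the reason the box case is trivial is that box leaf keys are stored in the index verbatim, so fetch() can simply hand back the stored key. A minimal sketch, assuming the fetch() support function follows the same GISTENTRY calling convention as decompress() (the signature in the actual patch may differ):

#include "postgres.h"
#include "access/gist.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(gist_box_fetch);

/*
 * Sketch of a fetch() support function for the box opclass: the leaf key
 * is the box itself, so return the entry unchanged and let the executor
 * treat it as the heap attribute value.
 */
Datum
gist_box_fetch(PG_FUNCTION_ARGS)
{
    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

    PG_RETURN_POINTER(entry);
}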
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On 05/29/2014 04:12 PM, John Lumby wrote: Thanks for looking at it! Date: Thu, 29 May 2014 00:19:33 +0300 From: hlinnakan...@vmware.com To: johnlu...@hotmail.com; pgsql-hackers@postgresql.org CC: klaussfre...@gmail.com Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch On 05/28/2014 11:52 PM, John Lumby wrote: The patch seems to assume that you can put the aiocb struct in shared memory, initiate an asynchronous I/O request from one process, and wait for its completion from another process. I'm pretty surprised if that works on any platform. It works on linux.Actually this ability allows the asyncio implementation to reduce complexity in one respect (yes I know it looks complex enough) : it makes waiting for completion of an in-progress IO simpler than for the existing synchronous IO case,. since librt takes care of the waiting. specifically, no need for extra wait-for-io control blocks such as in bufmgr's WaitIO() [checks]. No, it doesn't work. See attached test program. It kinda seems to work sometimes, because of the way it's implemented in glibc. The aiocb struct has a field for the result value and errno, and when the I/O is finished, the worker thread fills them in. aio_error() and aio_return() just return the values of those fields, so calling aio_error() or aio_return() do in fact happen to work from a different process. aio_suspend(), however, is implemented by sleeping on a process-local mutex, which does not work from a different process. Even if it worked on Linux today, it would be a bad idea to rely on it from a portability point of view. No, the only sane way to make this work is that the process that initiates an I/O request is responsible for completing it. If another process needs to wait for an async I/O to complete, we must use some other means to do the waiting. Like the io_in_progress_lock that we already have, for the same purpose. 
- Heikki

/*
 * Test program to test if POSIX aio functions work across processes
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <aio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>

char *shmem;

void
processA(void)
{
    int fd;
    struct aiocb *aiocbp = (struct aiocb *) shmem;
    char *buf = shmem + sizeof(struct aiocb);

    fd = open("aio-shmem-test-file", O_CREAT | O_WRONLY | O_SYNC, S_IRWXU);
    if (fd == -1)
    {
        fprintf(stderr, "open() failed\n");
        exit(1);
    }
    printf("processA starting AIO\n");
    strcpy(buf, "foobar");
    memset(aiocbp, 0, sizeof(struct aiocb));
    aiocbp->aio_fildes = fd;
    aiocbp->aio_offset = 0;
    aiocbp->aio_buf = buf;
    aiocbp->aio_nbytes = strlen(buf);
    aiocbp->aio_reqprio = 0;
    aiocbp->aio_sigevent.sigev_notify = SIGEV_NONE;
    if (aio_write(aiocbp) != 0)
    {
        fprintf(stderr, "aio_write() failed\n");
        exit(1);
    }
}

void
processB(void)
{
    struct aiocb *aiocbp = (struct aiocb *) shmem;
    const struct aiocb * const pl[1] = { aiocbp };
    int rv;

    printf("waiting for the write to finish in process B\n");
    if (aio_suspend(pl, 1, NULL) != 0)
    {
        fprintf(stderr, "aio_suspend() failed\n");
        exit(1);
    }
    printf("aio_suspend() returned 0\n");

    rv = aio_error(aiocbp);
    if (rv != 0)
    {
        fprintf(stderr, "aio_error returned %d: %s\n", rv, strerror(rv));
        exit(1);
    }
    rv = aio_return(aiocbp);
    printf("aio_return returned %d\n", rv);
}

int
main(int argc, char **argv)
{
    int pidB;

    shmem = mmap(NULL, sizeof(struct aiocb) + 1000, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shmem == MAP_FAILED)
    {
        fprintf(stderr, "mmap() failed\n");
        exit(1);
    }

#ifdef SINGLE_PROCESS
    /* this works */
    processA();
    processB();
#else
    /*
     * Start the I/O request in parent process, then fork and try to wait
     * for it to finish from the child process. (doesn't work, it will hang
     * forever)
     */
    processA();
    pidB = fork();
    if (pidB == -1)
    {
        fprintf(stderr, "fork() failed\n");
        exit(1);
    }
    if (pidB != 0)
    {
        /* parent */
        wait (pidB);
    }
    else
    {
        /* child */
        processB();
    }
#endif
}

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
One of the annoying things about the SPGiST bug Teodor pointed out is that once you're in the infinite loop, query cancel doesn't work to get you out of it. There are a couple of CHECK_FOR_INTERRUPTS() calls in spgdoinsert() that are meant to get you out of exactly this sort of buggy-opclass situation --- but they don't help. The reason is that when we reach that point we're holding one or more buffer locks, which means the HOLD_INTERRUPTS count is positive, so ProcessInterrupts is afraid to do anything. In point of fact, we'd be happy to give up the locks and then throw the error. So I was thinking about inventing some macro or out-of-line function that went about like this: if (InterruptPending && (QueryCancelPending || ProcDiePending)) { LWLockReleaseAll(); ProcessInterrupts(); elog(ERROR, "ProcessInterrupts failed to throw an error"); } where we don't expect to reach the final elog; it's just there to make sure we don't return control after yanking locks out from under the caller. You might object that this is risky since it might release locks other than the buffer locks spgdoinsert() is specifically concerned with. However, if spgdoinsert() were to throw an error for a reason other than query cancel, any such locks would get released anyway as the first step during error cleanup, so I think this is not a real concern. There may be other places in the index AMs or other low-level code where this'd be appropriate; I've not really looked. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Logical slots not mentioned in CREATE_REPLICATION_SLOT for replication protocol docs
On Thu, May 22, 2014 at 12:18 AM, Michael Paquier wrote: > On Thu, May 22, 2014 at 12:44 PM, Michael Paquier > wrote: >> Hi all, >> >> As written in subject, replication protocol documentation lacks >> details about logical slots in CREATE_REPLICATION_SLOT command: >> http://www.postgresql.org/docs/devel/static/protocol-replication.html >> Attached is a patch correcting that. > An additional thing I noticed: START_REPLICATION does not mention that > it is possible to specify options for the output plugin. All the fixes > are included in the patch attached. Thanks, this looks good. But shouldn't the bit about output plugin options mention say something like: ( option_name option_argument [, ...] ) ...instead of just: ( option [, ...] ) ? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
On 05/29/2014 11:25 PM, Tom Lane wrote: One of the annoying things about the SPGiST bug Teodor pointed out is that once you're in the infinite loop, query cancel doesn't work to get you out of it. There are a couple of CHECK_FOR_INTERRUPTS() calls in spgdoinsert() that are meant to get you out of exactly this sort of buggy-opclass situation --- but they don't help. The reason is that when we reach that point we're holding one or more buffer locks, which means the HOLD_INTERRUPTS count is positive, so ProcessInterrupts is afraid to do anything. In point of fact, we'd be happy to give up the locks and then throw the error. So I was thinking about inventing some macro or out-of-line function that went about like this: if (InterruptPending && (QueryCancelPending || ProcDiePending)) { LWLockReleaseAll(); ProcessInterrupts(); elog(ERROR, "ProcessInterrupts failed to throw an error"); } where we don't expect to reach the final elog; it's just there to make sure we don't return control after yanking locks out from under the caller. Also checking that CritSectionCount == 0 seems like a good idea... - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 5:23 PM, Heikki Linnakangas wrote: > On 05/29/2014 04:12 PM, John Lumby wrote: >> >> Thanks for looking at it! >> >>> Date: Thu, 29 May 2014 00:19:33 +0300 >>> From: hlinnakan...@vmware.com >>> To: johnlu...@hotmail.com; pgsql-hackers@postgresql.org >>> CC: klaussfre...@gmail.com >>> Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - >>> proposal and patch >>> >>> On 05/28/2014 11:52 PM, John Lumby wrote: >>> >>> The patch seems to assume that you can put the aiocb struct in shared >>> memory, initiate an asynchronous I/O request from one process, and wait >>> for its completion from another process. I'm pretty surprised if that >>> works on any platform. >> >> >> It works on linux.Actually this ability allows the asyncio >> implementation to >> reduce complexity in one respect (yes I know it looks complex enough) : >> it makes waiting for completion of an in-progress IO simpler than for >> the existing synchronous IO case,. since librt takes care of the >> waiting. >> specifically, no need for extra wait-for-io control blocks >> such as in bufmgr's WaitIO() > > > [checks]. No, it doesn't work. See attached test program. > > It kinda seems to work sometimes, because of the way it's implemented in > glibc. The aiocb struct has a field for the result value and errno, and when > the I/O is finished, the worker thread fills them in. aio_error() and > aio_return() just return the values of those fields, so calling aio_error() > or aio_return() do in fact happen to work from a different process. > aio_suspend(), however, is implemented by sleeping on a process-local mutex, > which does not work from a different process. > > Even if it worked on Linux today, it would be a bad idea to rely on it from > a portability point of view. No, the only sane way to make this work is that > the process that initiates an I/O request is responsible for completing it. > If another process needs to wait for an async I/O to complete, we must use > some other means to do the waiting. Like the io_in_progress_lock that we > already have, for the same purpose. But calls to it are timeouted by 10us, effectively turning the thing into polling mode. Which is odd now that I read carefully, is how come 256 waits of 10us amounts to anything? That's just 2.5ms, not enough IIUC for any normal I/O to complete. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
On Thu, May 29, 2014 at 1:34 PM, Heikki Linnakangas wrote: > Also checking that CritSectionCount == 0 seems like a good idea... What do you do if it isn't? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On 05/29/2014 11:34 PM, Claudio Freire wrote: On Thu, May 29, 2014 at 5:23 PM, Heikki Linnakangas wrote: On 05/29/2014 04:12 PM, John Lumby wrote: On 05/28/2014 11:52 PM, John Lumby wrote: The patch seems to assume that you can put the aiocb struct in shared memory, initiate an asynchronous I/O request from one process, and wait for its completion from another process. I'm pretty surprised if that works on any platform. It works on linux.Actually this ability allows the asyncio implementation to reduce complexity in one respect (yes I know it looks complex enough) : it makes waiting for completion of an in-progress IO simpler than for the existing synchronous IO case,. since librt takes care of the waiting. specifically, no need for extra wait-for-io control blocks such as in bufmgr's WaitIO() [checks]. No, it doesn't work. See attached test program. It kinda seems to work sometimes, because of the way it's implemented in glibc. The aiocb struct has a field for the result value and errno, and when the I/O is finished, the worker thread fills them in. aio_error() and aio_return() just return the values of those fields, so calling aio_error() or aio_return() do in fact happen to work from a different process. aio_suspend(), however, is implemented by sleeping on a process-local mutex, which does not work from a different process. Even if it worked on Linux today, it would be a bad idea to rely on it from a portability point of view. No, the only sane way to make this work is that the process that initiates an I/O request is responsible for completing it. If another process needs to wait for an async I/O to complete, we must use some other means to do the waiting. Like the io_in_progress_lock that we already have, for the same purpose. But calls to it are timeouted by 10us, effectively turning the thing into polling mode. We don't want polling... And even if we did, calling aio_suspend() in a way that's known to be broken, in a loop, is a pretty crappy way of polling. Which is odd now that I read carefully, is how come 256 waits of 10us amounts to anything? That's just 2.5ms, not enough IIUC for any normal I/O to complete Wild guess: the buffer you're reading is already in OS cache. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
Heikki Linnakangas writes: > On 05/29/2014 11:25 PM, Tom Lane wrote: >> In point of fact, we'd be happy to give up the locks and then throw >> the error. So I was thinking about inventing some macro or out-of-line >> function that went about like this: >> >> if (InterruptPending && (QueryCancelPending || ProcDiePending)) >> { >> LWLockReleaseAll(); >> ProcessInterrupts(); >> elog(ERROR, "ProcessInterrupts failed to throw an error"); >> } > Also checking that CritSectionCount == 0 seems like a good idea... Yeah, and there may be a couple other details like that. Right now I'm just thinking about not allowing LWLocks to block the cancel. I guess something else to consider is whether InterruptHoldoffCount could be larger than the number of held LWLocks; if so, that would prevent this from working as desired. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
Peter Geoghegan writes: > On Thu, May 29, 2014 at 1:34 PM, Heikki Linnakangas > wrote: >> Also checking that CritSectionCount == 0 seems like a good idea... > What do you do if it isn't? Not throw the error. (If we did, it'd turn into a PANIC, which doesn't seem like what we want.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 5:39 PM, Heikki Linnakangas wrote: > On 05/29/2014 11:34 PM, Claudio Freire wrote: >> >> On Thu, May 29, 2014 at 5:23 PM, Heikki Linnakangas >> wrote: >>> >>> On 05/29/2014 04:12 PM, John Lumby wrote: > On 05/28/2014 11:52 PM, John Lumby wrote: > > The patch seems to assume that you can put the aiocb struct in shared > memory, initiate an asynchronous I/O request from one process, and wait > for its completion from another process. I'm pretty surprised if that > works on any platform. It works on linux.Actually this ability allows the asyncio implementation to reduce complexity in one respect (yes I know it looks complex enough) : it makes waiting for completion of an in-progress IO simpler than for the existing synchronous IO case,. since librt takes care of the waiting. specifically, no need for extra wait-for-io control blocks such as in bufmgr's WaitIO() >>> >>> >>> [checks]. No, it doesn't work. See attached test program. >>> >>> It kinda seems to work sometimes, because of the way it's implemented in >>> glibc. The aiocb struct has a field for the result value and errno, and >>> when >>> the I/O is finished, the worker thread fills them in. aio_error() and >>> aio_return() just return the values of those fields, so calling >>> aio_error() >>> or aio_return() do in fact happen to work from a different process. >>> aio_suspend(), however, is implemented by sleeping on a process-local >>> mutex, >>> which does not work from a different process. >>> >>> Even if it worked on Linux today, it would be a bad idea to rely on it >>> from >>> a portability point of view. No, the only sane way to make this work is >>> that >>> the process that initiates an I/O request is responsible for completing >>> it. >>> If another process needs to wait for an async I/O to complete, we must >>> use >>> some other means to do the waiting. Like the io_in_progress_lock that we >>> already have, for the same purpose. >> >> >> But calls to it are timeouted by 10us, effectively turning the thing >> into polling mode. > > > We don't want polling... And even if we did, calling aio_suspend() in a way > that's known to be broken, in a loop, is a pretty crappy way of polling. Agreed. Just saying that that's why it works. The PID of the process responsible for the aio should be there somewhere, and other processes should explicitly synchronize (or initiate their own I/O and let the OS merge them - which also works). > > >> Which is odd now that I read carefully, is how come 256 waits of 10us >> amounts to anything? That's just 2.5ms, not enough IIUC for any normal >> I/O to complete > > > Wild guess: the buffer you're reading is already in OS cache. My wild guess as well. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
On Thu, May 29, 2014 at 1:42 PM, Tom Lane wrote: > Not throw the error. (If we did, it'd turn into a PANIC, which doesn't > seem like what we want.) Of course, but it isn't obvious to me that that isn't what we'd want, if the alternative is definitely that we'd be stuck in an infinite uninterruptible loop (I guess we couldn't just avoid throwing the error, thereby risking proceeding without locks -- and so I guess we can basically do nothing like this in critical sections). Now, that probably isn't the actual choice that we must face, but it also isn't obvious to me how you can do better than not throwing the error (and doing nothing extra) when in a critical section. That was the intent of my original question. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CHECK_FOR_INTERRUPTS in spgdoinsert() isn't helpful
Peter Geoghegan writes: > On Thu, May 29, 2014 at 1:42 PM, Tom Lane wrote: >> Not throw the error. (If we did, it'd turn into a PANIC, which doesn't >> seem like what we want.) > Of course, but it isn't obvious to me that that isn't what we'd want, > if the alternative is definitely that we'd be stuck in an infinite > uninterruptible loop (I guess we couldn't just avoid throwing the > error, thereby risking proceeding without locks -- and so I guess we > can basically do nothing like this in critical sections). Now, that > probably isn't the actual choice that we must face, but it also isn't > obvious to me how you can do better than not throwing the error (and > doing nothing extra) when in a critical section. That was the intent > of my original question. I don't feel a need for special behavior for this inside critical sections; a critical section that lasts long enough for query cancel response to be problematic is probably already broken by design. The other case of InterruptHoldoffCount being too high might be more relevant. I have not gone around and looked at retail uses of HOLD_INTERRUPTS to see whether there are any that might be holding counts for long periods. But I have checked that for the case in spgdoinsert, the only thing staying ProcessInterrupts' hand is the holdoff count associated with the buffer lock on the current index page. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
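Pulling the pieces of this thread together, a sketch of what the helper could look like, with Heikki's critical-section check folded in (the name is made up; only the logic is from the thread):

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lwlock.h"

/*
 * Sketch only: accept query-cancel/die interrupts even while holding
 * buffer LWLocks.  Skip it entirely inside a critical section, where
 * throwing an error would escalate to PANIC.  Assumes that
 * InterruptHoldoffCount is only elevated by the LWLocks we are about
 * to release, per the discussion above.
 */
#define CHECK_FOR_INTERRUPTS_UNLOCKING() \
    do { \
        if (InterruptPending && \
            (QueryCancelPending || ProcDiePending) && \
            CritSectionCount == 0) \
        { \
            LWLockReleaseAll(); \
            ProcessInterrupts(); \
            elog(ERROR, "ProcessInterrupts failed to throw an error"); \
        } \
    } while (0)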
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
Claudio Freire writes: > Didn't fix that, but the attached patch does fix regression tests when > scanning over index types other than btree (was invoking elog when the > index am didn't have ampeeknexttuple) "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe for non-MVCC snapshots (read about vacuum vs indexscan interlocks in nbtree/README). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd uuid-ossp behavior on smew and shearwater
On Thu, May 29, 2014 at 4:06 PM, Andrew Dunstan wrote: > On 05/29/2014 02:38 PM, Tom Lane wrote: >> Josh Kupershmidt writes: >> Interesting. Looks like you have access only to virtual network >> interfaces, and they report all-zero MAC addresses, which the UUID library >> is smart enough to ignore. If smew is also in a virtual environment >> then that's probably the explanation. (There are some other buildfarm >> critters that are reporting MAC addresses with the local-admin bit set, >> which I suspect also means they've got virtual network interfaces, but >> with a different treatment of the what-to-report problem.) > Almost all my critters run in VMs (all but jacana and bowerbird). They're not running on OpenVZ, are they? `ifconfig` on shearwater says: venet0Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 inet addr:127.0.0.2 P-t-P:127.0.0.2 Bcast:0.0.0.0 Mask:255.255.255.255 UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1 RX packets:1409294 errors:0 dropped:0 overruns:0 frame:0 TX packets:1488401 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:751149524 (716.3 MiB) TX bytes:740573200 (706.2 MiB) venet0:0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 inet addr:198.204.250.34 P-t-P:198.204.250.34 Bcast:0.0.0.0 Mask:255.255.255.255 UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1 and it seems this all-zeros MAC address is a common (mis-?)configuration on OpenVZ: https://github.com/robertdavidgraham/masscan/issues/43 http://stackoverflow.com/questions/5838225/how-do-i-get-a-default-gridgain-node-in-openvz-discover-other-nodes-on-the-same http://forum.openvz.org/index.php?t=msg&goto=8117 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 6:19 PM, Tom Lane wrote: > Claudio Freire writes: >> Didn't fix that, but the attached patch does fix regression tests when >> scanning over index types other than btree (was invoking elog when the >> index am didn't have ampeeknexttuple) > > "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe > for non-MVCC snapshots (read about vacuum vs indexscan interlocks in > nbtree/README). It's not really the tuple, just the tid -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 6:43 PM, Claudio Freire wrote: > On Thu, May 29, 2014 at 6:19 PM, Tom Lane wrote: >> Claudio Freire writes: >>> Didn't fix that, but the attached patch does fix regression tests when >>> scanning over index types other than btree (was invoking elog when the >>> index am didn't have ampeeknexttuple) >> >> "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe >> for non-MVCC snapshots (read about vacuum vs indexscan interlocks in >> nbtree/README). > > > It's not really the tuple, just the tid And, furthermore, it's used only to do prefetching, so even if the tid was invalid when the tuple needs to be accessed, it wouldn't matter, because the indexam wouldn't use the result of ampeeknexttuple to do anything at that time. Though, your comment does illustrate the need to document that on ampeeknexttuple, for future users. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
> Date: Thu, 29 May 2014 18:00:28 -0300 > Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal > and patch > From: klaussfre...@gmail.com > To: hlinnakan...@vmware.com > CC: johnlu...@hotmail.com; pgsql-hackers@postgresql.org > > On Thu, May 29, 2014 at 5:39 PM, Heikki Linnakangas > wrote: > > On 05/29/2014 11:34 PM, Claudio Freire wrote: > >> > >> On Thu, May 29, 2014 at 5:23 PM, Heikki Linnakangas > >> wrote: > >>> > >>> On 05/29/2014 04:12 PM, John Lumby wrote: > > > > On 05/28/2014 11:52 PM, John Lumby wrote: > > > > The patch seems to assume that you can put the aiocb struct in shared > > memory, initiate an asynchronous I/O request from one process, and wait > > for its completion from another process. I'm pretty surprised if that > > works on any platform. > > > It works on linux.Actually this ability allows the asyncio > implementation to > reduce complexity in one respect (yes I know it looks complex enough) : > it makes waiting for completion of an in-progress IO simpler than for > the existing synchronous IO case,. since librt takes care of the > waiting. > specifically, no need for extra wait-for-io control blocks > such as in bufmgr's WaitIO() > >>> > >>> > >>> [checks]. No, it doesn't work. See attached test program. Thanks for checkingand thanks for coming up with that test program. However, yes, it really does work -- always (on linux). Your test program is doing things in the wrong order - it calls aio_suspend *before* aio_error. However, the rule is, call aio_suspend *after* aio_error and *only* if aio_error returns EINPROGRESS. See the code changes to fd.c function FileCompleteaio() to see how we have done it. And I am attaching corrected version of your test program which runs just fine. > >>> > >>> It kinda seems to work sometimes, because of the way it's implemented in > >>> glibc. The aiocb struct has a field for the result value and errno, and > >>> when > >>> the I/O is finished, the worker thread fills them in. aio_error() and > >>> aio_return() just return the values of those fields, so calling > >>> aio_error() > >>> or aio_return() do in fact happen to work from a different process. > >>> aio_suspend(), however, is implemented by sleeping on a process-local > >>> mutex, > >>> which does not work from a different process. > >>> > >>> Even if it worked on Linux today, it would be a bad idea to rely on it > >>> from > >>> a portability point of view. No, the only sane way to make this work is > >>> that > >>> the process that initiates an I/O request is responsible for completing > >>> it. > >>> If another process needs to wait for an async I/O to complete, we must > >>> use > >>> some other means to do the waiting. Like the io_in_progress_lock that we > >>> already have, for the same purpose. > >> > >> > >> But calls to it are timeouted by 10us, effectively turning the thing > >> into polling mode. > > > > > > We don't want polling... And even if we did, calling aio_suspend() in a way > > that's known to be broken, in a loop, is a pretty crappy way of polling. Well, as mentioned earlier, it is not broken. Whether it is efficient I am not sure. I have looked at the mutex in aio_suspend that you mentioned and I am not quite convinced that, if caller is not the original aio_read process, it renders the suspend() into an instant timeout. I will see if I can verify that. Where are you (Claudio) seeing 10us? 
> > > Didn't fix that, but the attached patch does fix regression tests when > scanning over index types other than btree (was invoking elog when the > index am didn't have ampeeknexttuple) /* * Test program to test if POSIX aio functions work across processes */ #include #include #include #include #include #include #include #include #include #include char *shmem; void processA(void) { int fd; struct aiocb *aiocbp = (struct aiocb *) shmem; char *buf = shmem + sizeof(struct aiocb); fd = open("aio-shmem-test-file", O_CREAT | O_WRONLY | O_SYNC, S_IRWXU); if (fd == -1) { fprintf(stderr, "open() failed\n"); exit(1); } printf("processA starting AIO\n"); strcpy(buf, "foobar"); memset(aiocbp, 0, sizeof(struct aiocb)); aiocbp->aio_fildes = fd; aiocbp->aio_offset = 0; aiocbp->aio_buf = buf; aiocbp->aio_nbytes = strlen(buf); aiocbp->aio_reqprio = 0; aiocbp->aio_sigevent.sigev_notify = SIGEV_NONE; if (aio_write(aiocbp) != 0) { fprintf(stderr, "aio_write() failed\n"); exit(1); } } void processB(void) { struct aiocb *aiocbp = (struct aiocb *) shmem; const struct aiocb * const pl[1] = { aiocbp }; int rv; int returnCode; struc
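To spell out the ordering rule being described, an illustrative fragment (not John's attachment): check aio_error() first, and only sleep in aio_suspend() while it still reports EINPROGRESS.

#include <aio.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Illustrative wait loop: poll aio_error() first and only block in
 * aio_suspend() while the request is still in progress.  Returns the
 * byte count from aio_return(), or -1 on error.
 */
static ssize_t
wait_for_aio(struct aiocb *aiocbp)
{
    const struct aiocb *const pl[1] = {aiocbp};
    int rv;

    while ((rv = aio_error(aiocbp)) == EINPROGRESS)
    {
        /* block until something completes, or we are interrupted */
        if (aio_suspend(pl, 1, NULL) != 0 && errno != EINTR && errno != EAGAIN)
        {
            perror("aio_suspend");
            return -1;
        }
    }
    if (rv != 0)
    {
        fprintf(stderr, "aio error: %s\n", strerror(rv));
        return -1;
    }
    return aio_return(aiocbp);
}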
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
Claudio Freire writes: > On Thu, May 29, 2014 at 6:43 PM, Claudio Freire > wrote: >> On Thu, May 29, 2014 at 6:19 PM, Tom Lane wrote: >>> "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe >>> for non-MVCC snapshots (read about vacuum vs indexscan interlocks in >>> nbtree/README). >> It's not really the tuple, just the tid > And, furthermore, it's used only to do prefetching, so even if the tid > was invalid when the tuple needs to be accessed, it wouldn't matter, > because the indexam wouldn't use the result of ampeeknexttuple to do > anything at that time. Nonetheless, getting the next tid out of the index may involve stepping to the next index page, at which point you've lost your interlock guaranteeing that the *previous* tid will still mean something by the time you arrive at its heap page. I presume that the ampeeknexttuple call is issued before trying to visit the heap (otherwise you're not actually getting much I/O overlap), so I think there's a real risk here. Having said that, it's probably OK as long as this mode is only invoked for user queries (with MVCC snapshots) and not for system indexscans. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
> From: t...@sss.pgh.pa.us > To: klaussfre...@gmail.com > CC: hlinnakan...@vmware.com; johnlu...@hotmail.com; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal > and patch > Date: Thu, 29 May 2014 17:56:57 -0400 > > Claudio Freire writes: > > On Thu, May 29, 2014 at 6:43 PM, Claudio Freire > > wrote: > >> On Thu, May 29, 2014 at 6:19 PM, Tom Lane wrote: > >>> "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe > >>> for non-MVCC snapshots (read about vacuum vs indexscan interlocks in > >>> nbtree/README). > > >> It's not really the tuple, just the tid > > > And, furthermore, it's used only to do prefetching, so even if the tid > > was invalid when the tuple needs to be accessed, it wouldn't matter, > > because the indexam wouldn't use the result of ampeeknexttuple to do > > anything at that time. > > Nonetheless, getting the next tid out of the index may involve stepping > to the next index page, at which point you've lost your interlock I think we are ok as peeknexttuple (yes bad name, sorry, can change it ... never advances to another page : *btpeeknexttuple() -- peek at the next tuple different from any blocknum in pfch_list * without reading a new index page * and without causing any side-effects such as altering values in control blocks * if found, store blocknum in next element of pfch_list > guaranteeing that the *previous* tid will still mean something by the time > you arrive at its heap page. I presume that the ampeeknexttuple call is > issued before trying to visit the heap (otherwise you're not actually > getting much I/O overlap), so I think there's a real risk here. > > Having said that, it's probably OK as long as this mode is only invoked > for user queries (with MVCC snapshots) and not for system indexscans. > > regards, tom lane
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 6:56 PM, Tom Lane wrote: > Claudio Freire writes: >> On Thu, May 29, 2014 at 6:43 PM, Claudio Freire >> wrote: >>> On Thu, May 29, 2014 at 6:19 PM, Tom Lane wrote: "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe for non-MVCC snapshots (read about vacuum vs indexscan interlocks in nbtree/README). > >>> It's not really the tuple, just the tid > >> And, furthermore, it's used only to do prefetching, so even if the tid >> was invalid when the tuple needs to be accessed, it wouldn't matter, >> because the indexam wouldn't use the result of ampeeknexttuple to do >> anything at that time. > > Nonetheless, getting the next tid out of the index may involve stepping > to the next index page, at which point you've lost your interlock > guaranteeing that the *previous* tid will still mean something by the time No, no... that's exactly why a new regproc is needed, because for prefetching, we need to get the next tid that satisfies some conditions *without* walking the index. This, in nbtree, only looks through the tid array to find the suitable tid, or just return false if the array is exhausted. > Having said that, it's probably OK as long as this mode is only invoked > for user queries (with MVCC snapshots) and not for system indexscans. I think system index scans will also invoke this. There's no rule excluding that possibility. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 7:11 PM, John Lumby wrote: >> Nonetheless, getting the next tid out of the index may involve stepping >> to the next index page, at which point you've lost your interlock > > I think we are ok as peeknexttuple (yes bad name, sorry, can change it > ... > never advances to another page : > > *btpeeknexttuple() -- peek at the next tuple different from any > blocknum in pfch_list > * without reading a new index page > * and without causing any side-effects such as > altering values in control blocks > * if found, store blocknum in next element of pfch_list Yeah, I was getting to that conclusion myself too. We could call it amprefetchnextheap, since it does just prefetch, and is good for nothing *but* prefetch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
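A simplified sketch of that peek operation, ignoring the pfch_list bookkeeping described above and assuming a forward scan over nbtree's existing in-memory scan position; the point is that it only consults items already loaded from the current page:

#include "postgres.h"
#include "access/nbtree.h"
#include "access/relscan.h"

/*
 * Sketch: return the heap TID of the next item already held in the
 * current page's scan position, without stepping to another index page.
 * Returns false if the loaded items are exhausted (a real implementation
 * would simply not prefetch in that case).  Assumes a forward scan.
 */
static bool
btpeeknexttid(IndexScanDesc scan, ItemPointer tid)
{
    BTScanOpaque so = (BTScanOpaque) scan->opaque;

    if (!BTScanPosIsValid(so->currPos) ||
        so->currPos.itemIndex + 1 > so->currPos.lastItem)
        return false;

    *tid = so->currPos.items[so->currPos.itemIndex + 1].heapTid;
    return true;
}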
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
Hi, On 2014-05-29 17:53:51 -0400, John Lumby wrote: > to see how we have done it. And I am attaching corrected version > of your test program which runs just fine. It's perfectly fine to not be up to the coding style at this point, but trying to adhere to it to some degree will make code review later less painful... * comments with ** * line length * tabs vs spaces * ... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
> Date: Thu, 29 May 2014 18:00:28 -0300 > Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal > and patch > From: klaussfre...@gmail.com > To: hlinnakan...@vmware.com > CC: johnlu...@hotmail.com; pgsql-hackers@postgresql.org > > >>> > >>> Even if it worked on Linux today, it would be a bad idea to rely on it > >>> from > >>> a portability point of view. No, the only sane way to make this work is > >>> that > >>> the process that initiates an I/O request is responsible for completing > >>> it. I meant to add - it is really a significant benefit that a bkend can wait on the aio of a different bkend's original prefetching aio_read. Remember that we check completion only when the bkend decides it really wants the block in a buffer, i.e. ReadBuffer and friends, which might be a very long time after it had issued the prefetch request, or even never (see below). We don't want other bkends which want that block to have to wait for the originator to get around to reading it. *Especially* since the originator may *never* read it if it quits its scan early, leaving prefetched but unread blocks behind. (Which is also taken care of in the patch). > >>> If another process needs to wait for an async I/O to complete, we must > >>> use > >>> some other means to do the waiting. Like the io_in_progress_lock that we > >>> already have, for the same purpose.
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 6:53 PM, John Lumby wrote: > Well, as mentioned earlier, it is not broken. Whether it is efficient > I am not sure. > I have looked at the mutex in aio_suspend that you mentioned and I am not > quite convinced that, if caller is not the original aio_read process, > it renders the suspend() into an instant timeout. I will see if I can > verify that. > Where are you (Claudio) seeing 10us? fd.c, in FileCompleteaio, sets timeout to: my_timeout.tv_sec = 0; my_timeout.tv_nsec = 10000; Which is 10k ns, which is 10 us. It loops 256 times at most, so it's polling 256 times with a 10 us timeout. Sounds wasteful. I'd: 1) If it's the same process, wait for the full timeout (no looping). If you have to loop (EAGAIN or EINTR - which I just noticed you don't check for), that's ok. 2) If it's not, just fall through, don't wait, issue the I/O. The kernel will merge the requests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Thu, May 29, 2014 at 7:26 PM, Claudio Freire wrote: > 1) If it's the same process, wait for the full timeout (no looping). > If you have to loop (EAGAIN or EINTR - which I just noticed you don't > check for), that's ok. Sorry, meant to say just looping on EINTR. About the style guidelines, no, I just copy the style of surrounding code usually. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
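A sketch of that waiting policy for FileCompleteaio (names and the fallback convention are illustrative, and the patch's other bookkeeping is omitted): the issuing process sleeps in aio_suspend() with one generous timeout and retries only on EINTR, while any other process skips the wait and performs its own read, letting the kernel merge the requests.

#include <aio.h>
#include <errno.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/*
 * Sketch: wait for an asynchronous read that this process may or may not
 * have issued itself.  Returns true if the aio completed here; false means
 * the caller should fall back to a plain synchronous read of the block.
 */
static bool
complete_aio(struct aiocb *aiocbp, pid_t issuing_pid)
{
    const struct aiocb *const pl[1] = {aiocbp};
    struct timespec timeout = {10, 0};  /* one generous wait; illustrative value */

    if (issuing_pid != getpid())
        return false;           /* not ours: issue our own read and let the
                                 * kernel merge the requests */

    while (aio_error(aiocbp) == EINPROGRESS)
    {
        if (aio_suspend(pl, 1, &timeout) != 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted: retry the wait */
            return false;       /* timed out (EAGAIN) or failed: fall back */
        }
    }
    return true;
}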
Re: [HACKERS] Odd uuid-ossp behavior on smew and shearwater
On 05/29/2014 05:41 PM, Josh Kupershmidt wrote: On Thu, May 29, 2014 at 4:06 PM, Andrew Dunstan wrote: Almost all my critters run in VMs (all but jacana and bowerbird). They're not running on OpenVZ, are they? `ifconfig` on shearwater says: [...] and it seems this all-zeros MAC address is a common (mis-?)configuration on OpenVZ: https://github.com/robertdavidgraham/masscan/issues/43 http://stackoverflow.com/questions/5838225/how-do-i-get-a-default-gridgain-node-in-openvz-discover-other-nodes-on-the-same http://forum.openvz.org/index.php?t=msg&goto=8117 Yeah, that's a bit ugly. Mine are on qemu, one (sitella) is on Xen. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Logical slots not mentioned in CREATE_REPLICATION_SLOT for replication protocol docs
On Fri, May 30, 2014 at 5:31 AM, Robert Haas wrote: > Thanks, this looks good. But shouldn't the bit about output plugin > options mention say something like: > > ( option_name option_argument [, ...] ) > > ...instead of just: > > ( option [, ...] ) > ? Yes, true. Here is an updated patch. -- Michael diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 3a2421b..43861d0 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1428,10 +1428,10 @@ The commands accepted in walsender mode are: -CREATE_REPLICATION_SLOT slotname PHYSICALCREATE_REPLICATION_SLOT +CREATE_REPLICATION_SLOT slotname { PHYSICAL | LOGICAL output_plugin } CREATE_REPLICATION_SLOT - Create a physical replication + Create a physical or logical replication slot. See for more about replication slots. @@ -1445,6 +1445,16 @@ The commands accepted in walsender mode are: + + + output_plugin + + + The name of the output plugin used for logical decoding + (see ). + + + @@ -1778,7 +1788,7 @@ The commands accepted in walsender mode are: -START_REPLICATION SLOT slotname LOGICAL XXX/XXX +START_REPLICATION SLOT slotname LOGICAL XXX/XXX [ ( option_name value [, ... ] ) ] Instructs server to start streaming WAL for logical replication, starting @@ -1811,6 +1821,22 @@ The commands accepted in walsender mode are: + + option_name + + + Custom option name for logical decoding plugin. + + + + + value + + + Value associated with given option for logical decoding plugin. + + + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] json/jsonb and unicode escapes
Here is a draft patch for some of the issues to do with unicode escapes that Teodor raised the other day. I think it does the right thing, although I want to add a few more regression cases before committing it. Comments welcome. cheers andrew diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c index a7364f3..47ab9be 100644 --- a/src/backend/utils/adt/json.c +++ b/src/backend/utils/adt/json.c @@ -2274,7 +2274,27 @@ escape_json(StringInfo buf, const char *str) appendStringInfoString(buf, "\\\""); break; case '\\': -appendStringInfoString(buf, ""); +/* + * Unicode escapes are passed through as is. There is no + * requirement that they denote a valid character in the + * server encoding - indeed that is a big part of their + * usefulness. + * + * All we require is that they consist of \u where + * the Xs are hexadecimal digits. It is the responsibility + * of the caller of, say, to_json() to make sure that the + * unicode escape is valid. + * + * In the case of a jsonb string value beng escaped, the + * only unicode escape that should be present is \u, + * all the other unicode escapes will have been resolved. + * + */ +if (p[1] == 'u' && isxdigit(p[2]) && isxdigit(p[3]) + && isxdigit(p[4]) && isxdigit(p[5])) + appendStringInfoCharMacro(buf, *p); +else + appendStringInfoString(buf, ""); break; default: if ((unsigned char) *p < ' ') diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out index ae7c506..1e46939 100644 --- a/src/test/regress/expected/jsonb.out +++ b/src/test/regress/expected/jsonb.out @@ -61,9 +61,9 @@ LINE 1: SELECT '"\u000g"'::jsonb; DETAIL: "\u" must be followed by four hexadecimal digits. CONTEXT: JSON data, line 1: "\u000g... SELECT '"\u"'::jsonb; -- OK, legal escape - jsonb - "\\u" + jsonb +-- + "\u" (1 row) -- use octet_length here so we don't get an odd unicode char in the diff --git a/src/test/regress/expected/jsonb_1.out b/src/test/regress/expected/jsonb_1.out index 38a95b4..955dc42 100644 --- a/src/test/regress/expected/jsonb_1.out +++ b/src/test/regress/expected/jsonb_1.out @@ -61,9 +61,9 @@ LINE 1: SELECT '"\u000g"'::jsonb; DETAIL: "\u" must be followed by four hexadecimal digits. CONTEXT: JSON data, line 1: "\u000g... SELECT '"\u"'::jsonb; -- OK, legal escape - jsonb - "\\u" + jsonb +-- + "\u" (1 row) -- use octet_length here so we don't get an odd unicode char in the -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] recovery testing for beta
On Thu, May 29, 2014 at 10:09 PM, Jeff Janes wrote: > > What features in 9.4 need more beta testing for recovery? Another feature which has an interaction with recovery is reduced WAL for Update operation: http://www.postgresql.org/message-id/e1wnqjm-0003cz...@gemulon.postgresql.org Commit: a3115f With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] fix worker_spi to run as non-dynamic background worker
On Tue, May 27, 2014 at 12:11 AM, Michael Paquier wrote: > The feature itself is already mentioned in the release notes in a very > general manner, by telling that bgworkers can be dynamically > "registered" and "started". Users can still refer to the documentation > to see more precisely what kind of infrastructure is in place in 9.4. > > So... As I'm on it... What about something like the attached patch? Hearing no comment from Bruce, committed. Bruce, feel free to modify or revert if you don't think this is appropriate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spreading full-page writes
On Tue, May 27, 2014 at 8:15 AM, Heikki Linnakangas wrote: > Since you will be flushing the buffers one "redo partition" at a time, you > would want to allow the OS to do merge the writes within a partition as much > as possible. So my even-odd split would in fact be pretty bad. Some sort of > striping, e.g. mapping each contiguous 1 MB chunk to the same partition, > would be better. I suspect you'd actually want to stripe by segment (1GB partition). If you striped by 1MB partitions, there might still be writes to the parts of the file you weren't checkpointing that would be flushed by the fsync(). That would lead to more physical I/O overall, if those pages were written again before you did the next half-checkpoint. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
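To illustrate the segment-level striping suggested here (the constant and function are invented for the example), the mapping could simply key on the 1GB segment number, so an fsync issued while flushing one redo partition never touches segments owned by the other:

#include "postgres.h"
#include "storage/block.h"

/* illustrative only; not a real GUC or constant from the patch under discussion */
#define NUM_REDO_PARTITIONS 2

/*
 * Map a block to a redo partition by its 1GB segment (RELSEG_SIZE blocks),
 * so all blocks that share a physical segment file, and hence a single
 * fsync(), land in the same partition.
 */
static int
redo_partition_for_block(BlockNumber blkno)
{
    return (int) ((blkno / ((BlockNumber) RELSEG_SIZE)) % NUM_REDO_PARTITIONS);
}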
Re: [HACKERS] postgres_fdw and connection management
On Mon, May 26, 2014 at 10:47 PM, Shigeru Hanada wrote: > 2014-05-24 0:09 GMT+09:00 Sandro Santilli : >> Indeed I tried "DISCARD ALL" in hope it would have helped, so I find >> good your idea of allowing extensions to register an hook there. >> >> Still, I'd like the FDW handler itself to possibly be configured >> to disable the pool completely as a server-specific configuration. > > Connection management seems FDW-specific feature to me. How about to > add FDW option, say pool_connection=true|false, to postgres_fdw which > allows per-server configuration? Right... or you could have an option to close the connection at end-of-statement, end-of-transaction, or end-of-session. But quite apart from that, it seems like there ought to be a way to tell an FDW to flush its state. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.
On Wed, May 28, 2014 at 8:53 PM, Andres Freund wrote: > Hi, > > Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses > WaitLatch() to wait. That's fine in itself. But > procsignal_sigusr1_handler, which is used e.g. when resolving recovery > conflicts, doesn't unconditionally do a SetLatch(). > That means that we'll currently not be able to cancel conflicting > backends during recovery for 10min. Now, I don't think that'll happen > too often in practice, but it's still annoying. How will such a situation occur? Aren't we using pg_usleep during the RecoveryConflict functions (e.g. in ResolveRecoveryConflictWithVirtualXIDs)? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
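For reference, the direction Andres' report points at would be for the SIGUSR1 procsignal handler to set the process latch unconditionally, so a WaitLatch()-based pg_sleep() wakes as soon as a recovery-conflict signal arrives. A rough sketch (the real handler's CheckProcSignal() dispatch is elided):

#include "postgres.h"
#include "storage/latch.h"
#include "storage/proc.h"

/*
 * Sketch only: a SIGUSR1 handler that always sets the process latch, in
 * addition to whatever reason-specific flags it sets.  This is what makes
 * a latch-based sleep interruptible by recovery-conflict signals.
 */
static void
sketch_sigusr1_handler(SIGNAL_ARGS)
{
    int save_errno = errno;

    /* ... per-reason flag handling (CheckProcSignal) would go here ... */

    if (MyProc != NULL)
        SetLatch(&MyProc->procLatch);

    errno = save_errno;
}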