Re: [HACKERS] pg_rewind failure by file deletion in source server
On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
>> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> Shouldn't pg_rewind ignore that failure of operation? If the file is
>>> not found in the source server, the file obviously doesn't need to be
>>> copied to the destination server. So ISTM that pg_rewind can safely
>>> skip copying that file. Thought?
>>
>> I think that you should fail. Let's imagine that the master to be
>> rewound has removed a relation file before being stopped cleanly after
>> its standby has been promoted that was here at the last checkpoint
>> before forking, and that the standby still has the relation file after
>> promotion. You should be able to copy it to be able to replay WAL on
>> it. If the standby has removed a file in the file map after taking the
>> file map, I guess that the best thing to do is fail because the file
>> that should be here for the rewound node cannot be fetched.
>
> In this case, why do you think that the file should exist in the old
> master? Even if it doesn't exist, ISTM that the old master can safely
> replay the WAL records related to the file when it restarts. So what's
> the problem if the file doesn't exist in the old master?

Well, some user may want to rewind the master down to the point where WAL
forked, and then recover it immediately when a consistent point is reached
just at restart instead of replugging it into the cluster. In this case I
think that you need the relation file of the dropped relation to get a
consistent state. That's still cheaper than recreating a node from a fresh
base backup in some cases, particularly if the last base backup taken is
far in the past for this cluster. Documentation should be made clearer
about that with a better error message...

> I'm wondering how we can recover (or rewind again) the old master from
> that error.

This also would need to be documented if we decide not to fix any code
regarding this problem...

FWIW, here is a scenario able to trigger the error with 1 master (port
5432, data at ~/data/5432) and 1 standby (port 5433, data at ~/data/5433):

$ psql -c 'create table aa as select generate_series(1,100)'
# Promote standby
$ pg_ctl promote -D ~/data/5433/
# Drop table on master
$ psql -c 'drop table aa'
DROP TABLE
$ pg_ctl stop -D ~/data/5432/

At this point there is no more relation file on the master for 'aa'; it is
still present on the standby. Running pg_rewind at this point will work:
the relation file would be copied from the promoted standby to the master.

$ lldb -- pg_rewind -D 5432 --source-server="port=5433 dbname=postgres"

Breakpoint pg_rewind after fetchSourceFileList() and before replaying the
changes from the block map, drop table 'aa' on the standby and checkpoint
it; then the source file list is inconsistent and pg_rewind will fail.
This can only happen with --source-server, not with --source-pgdata.
Adding a sleep() of a couple of seconds in pg_rewind may be a better way
to trigger the error directly ;), with DROP DATABASE for example.

Regards,
--
Michael
Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)
At 2015-06-11 14:38:03 +0900, langote_amit...@lab.ntt.co.jp wrote:
>> On the other hand, I don't like the idea of doing (3) by adding command
>> line arguments to pg_basebackup and adding a new option to the command.
>> I don't think that level of flexibility is justified; it would also
>> make it easier to end up with a broken base backup (by inadvertently
>> excluding more than you meant to).
>
> Maybe a combination of (2) and part of (3). In absence of any command
> line argument, the behavior is (2), to exclude. Provide an option to
> *include* it (-S/--serverlog).

I don't like that idea any more than having the command-line argument to
exclude pg_log. (And people who store torrented files in PGDATA may like
it even less.)

-- Abhijit
Re: [HACKERS] Comfortably check BackendPID with psql
>> Not a big fan of that abbreviation itself. What I'd wondered about
>> instead - and actually had patched into my psql at some point - is
>> adding an appropriate escape to psql's PROMPT. I think that'd serve
>> your purpose as well?
>
> +3.14159; that would be hugely helpful when using gdb. You can get that
> today. In ~/.psqlrc:
>
>   SELECT pg_catalog.pg_backend_pid() AS backend_pid \gset
>   \set PROMPT1 '%m %:backend_pid: %/%R%# '
>
> It doesn't update after \connect, but the overlap between my use of
> \connect and my use of debuggers is tiny.

Thank you all! My hack is going to be much smoother.

Regards,
Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Innovators, Ltd.
E-Mail: nao-an...@xc.jp.nec.com
---
Re: [HACKERS] pg_rewind failure by file deletion in source server
On Fri, Jun 12, 2015 at 3:17 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>> Shouldn't pg_rewind ignore that failure of operation? If the file is
>>>> not found in the source server, the file obviously doesn't need to be
>>>> copied to the destination server. So ISTM that pg_rewind can safely
>>>> skip copying that file. Thought?
>>>
>>> I think that you should fail. Let's imagine that the master to be
>>> rewound has removed a relation file before being stopped cleanly after
>>> its standby has been promoted that was here at the last checkpoint
>>> before forking, and that the standby still has the relation file after
>>> promotion. You should be able to copy it to be able to replay WAL on
>>> it. If the standby has removed a file in the file map after taking the
>>> file map, I guess that the best thing to do is fail because the file
>>> that should be here for the rewound node cannot be fetched.
>>
>> In this case, why do you think that the file should exist in the old
>> master? Even if it doesn't exist, ISTM that the old master can safely
>> replay the WAL records related to the file when it restarts. So what's
>> the problem if the file doesn't exist in the old master?
>
> Well, some user may want to rewind the master down to the point where
> WAL forked, and then recover it immediately when a consistent point is
> reached just at restart instead of replugging it into the cluster. In
> this case I think that you need the relation file of the dropped
> relation to get a consistent state. That's still cheaper than recreating
> a node from a fresh base backup in some cases, particularly if the last
> base backup taken is far in the past for this cluster.

So it's the case where a user wants to recover the old master up to a
point BEFORE the file in question is deleted in the new master. At that
point, since the file must exist, pg_rewind should fail if the file
cannot be copied from the new master. Is my understanding right?

As far as I read the code of pg_rewind, ISTM that your scenario never
happens. Because pg_rewind sets the minimum recovery point to the latest
WAL location in the new master, i.e., AFTER the file is deleted, the old
master cannot stop recovering before the file is deleted in the new
master. If the recovery stops at that point, it fails because the minimum
recovery point has not been reached yet.

IOW, after pg_rewind runs, the old master has to replay the WAL records
which were generated by the deletion of the file in the new master. So
it's okay if the old master doesn't have the file after pg_rewind runs,
I think.

Regards,

--
Fujii Masao
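For reference, the control file update Fujii describes is applied at the
end of a rewind. A paraphrase of the relevant pg_rewind code follows;
variable and function names here are approximate, not the exact source:

    /*
     * After copying files over, pg_rewind stamps the target's control
     * file so that recovery must replay at least up to the source's
     * current write position.  "endrec" and "endtli" stand for the
     * insert location and timeline just read from the source server.
     */
    ControlFile_new.minRecoveryPoint = endrec;
    ControlFile_new.minRecoveryPointTLI = endtli;
    ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
    updateControlFile(&ControlFile_new);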
Re: [HACKERS] The Future of Aggregation
On 11 June 2015 at 01:39, Kevin Grittner <kgri...@ymail.com> wrote:
> David Rowley <david.row...@2ndquadrant.com> wrote:
>
>> /* setup */
>> create table millionrowtable as
>>     select generate_series(1,1000000)::numeric as x;
>> /* test 1 */
>> SELECT sum(x) / count(x) from millionrowtable;
>> /* test 2 */
>> SELECT avg(x) from millionrowtable;
>>
>> Test 1: 274.979 ms, 272.104 ms, 269.915 ms
>> Test 2: 229.619 ms, 220.703 ms, 234.743 ms (about 19% slower)
>
> Of course, with Tom's approach you would see the benefit; the two
> statements should run at about the same speed.
>
> I am a little curious what sort of machine you're running on, because my
> i7 is much slower. I ran a few other tests with your table for
> perspective.

Assert enabled build? My hardware is very unimpressive... an i5 from Q1
2010. Due to be replaced very soon.

> One question that arose in my mind running this was whether we might be
> able to combine sum(x) with count(*) if x was NOT NULL, even though the
> arguments don't match. It might not be worth the gymnastics of
> recognizing the special case, and I certainly wouldn't recommend looking
> at that optimization in a first pass; but it might be worth jotting down
> on a list somewhere.

I think it's worth looking into that at some stage. I think I might have
some of the code that would be required for the NULL checking over here:
http://www.postgresql.org/message-id/CAApHDvqRB-iFBy68=dcgqs46arep7aun2pou4ktwl8kx9yo...@mail.gmail.com

I'm just not so sure what the logic would be to decide when we could
apply this. The only properties I can see that may be along the right
lines are pg_proc.pronargs for int8inc and int8inc_any.

Regards

David Rowley

--
David Rowley   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Collection of memory leaks for ECPG driver
On Mon, Jun 08, 2015 at 10:50:25PM +0900, Michael Paquier wrote:
> And the caller needs to be sure that str actually allocates enough
> space... That's not directly ECPG's business, still it feels
> uncomfortable.

But there is no way for us to fix this as we want to implement the API as
defined in Informix.

> I fixed it with the attached.

Thanks, committed.

> The world is full of surprises, and even if free(NULL) is a noop on
> modern platforms, I would rather take it defensively and check for a
> NULL pointer as Postgres supports many platforms. Other code paths tend
> to do so already, see for example ECPGconnect.

Right, that's because they were developed before free received the
safeguard, or the programmer simply didn't know at that point in time.
Unless I'm mistaken we have other code that only calls free() without an
additional safeguard, so why should ecpglib be special? If you don't like
it using both approaches, feel free to remove the check in the other
case. :)

More seriously, though, does anyone know of any platform where free(NULL)
is *not* a noop? I don't like adding stuff just because of the world
being full of surprises. There may also be other functions not working as
advertised. Wouldn't that then mean we cannot use any function not
provided by ourselves?

> Perhaps I am overdoing it. I would have them for correctness (some other
> code paths do so) but if you think that the impact is minimal there then
> I am fine to not modify descriptor.c.

Sorry, I wasn't referring to descriptor.c but to the whole preproc/
subtree. But as I already said, since we went through the effort we can
of course apply it. Will be in my next push. Thanks.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: [HACKERS] Aggregate Supporting Functions
On 10 June 2015 at 02:26, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgri...@ymail.com> writes:
>> David Rowley <david.row...@2ndquadrant.com> wrote:
>>> [ avoid duplicate calculations for related aggregates ]
>>
>> From the information you have proposed storing, with cost factors
>> associated with the functions, it seems technically possible to infer
>> that you could run (for example) the avg() aggregate to accumulate both
>> but only run the final functions of the aggregates referenced by the
>> query. That seems like an optimization to try hard to forget about
>> until you have at least one real-world use case where it would yield a
>> significant benefit. It seems premature to optimize for that before
>> having the rest working.
>
> Actually, I would suggest that you forget about all the other aspects
> and *just* do that, because it could be made to work today on existing
> aggregate functions, and it would not require hundreds-to-thousands of
> lines of boilerplate support code in the grammar, catalog support,
> pg_dump, yadda yadda. That is, look to see which aggregates use the same
> transition function and run that just once. We already have the rule
> that the final function can't destroy the transition output, so running
> two different final functions on the same transition result should Just
> Work.

Good idea. I believe the only extra check, besides "do they use the same
transfn?", would be the initvalue of the aggregate.

I'll write a patch and post it in the next couple of days.

Regards

David Rowley

--
David Rowley   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
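A minimal sketch of the compatibility test being described; the helper
name and structure are illustrative only, and a real patch would also
have to compare agginitval (a possibly-null text column) and the
aggregates' input expressions:

    #include "postgres.h"
    #include "access/htup_details.h"
    #include "catalog/pg_aggregate.h"
    #include "utils/syscache.h"

    /* Could two aggregates share one transition state? (sketch) */
    static bool
    aggs_share_transfn(Oid aggfnoid1, Oid aggfnoid2)
    {
        HeapTuple   tup1,
                    tup2;
        Form_pg_aggregate agg1,
                    agg2;
        bool        compatible;

        tup1 = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(aggfnoid1));
        tup2 = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(aggfnoid2));
        if (!HeapTupleIsValid(tup1) || !HeapTupleIsValid(tup2))
            elog(ERROR, "cache lookup failed for aggregate");

        agg1 = (Form_pg_aggregate) GETSTRUCT(tup1);
        agg2 = (Form_pg_aggregate) GETSTRUCT(tup2);

        /* same transition function and same transition type */
        compatible = (agg1->aggtransfn == agg2->aggtransfn &&
                      agg1->aggtranstype == agg2->aggtranstype);

        ReleaseSysCache(tup1);
        ReleaseSysCache(tup2);

        return compatible;
    }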
[HACKERS] [Proposal] Progress bar for pg_dump/pg_restore
Hi, all.

I am a newbie among hackers. Speaking as one user, I would like to
propose the following.

Progress bar for pg_dump / pg_restore
=====================================

Motivation
----------
pg_dump and pg_restore show nothing if users don't specify the verbose
(-v) option. For a table too large to finish in a few minutes, this
behavior leaves some users worrying whether the situation (nothing
showing up) is all right. I propose this feature to free these users
from worrying.

Design / API
------------
When pg_dump / pg_restore is running, a progress bar and the estimated
time to finish are shown on screen, like the following:

    =============================>           (50%) 15:50

The bar ("=" in the above) and percentage value (50% in the above) show
the percentage of progress, and the time (15:50 in the above) shows the
estimated time to finish. (This percentage is the ratio for the whole
processing.) Percentage and time are calculated and shown every 1 second.

In pg_dump, the information required for calculating the percentage and
time comes from pg_class. In pg_restore, to calculate the same things, I
want to record the total number of command lines in the pg_dump file, so
I would like to add a new element to the Archive structure. (This means
that the version number of the archive format is changed.)

Usage
-----
To use this feature, the user must specify the -P option on the command
line. (This definition is also temporary, so this is changeable if it
leads to problems.)

    $ pg_dump -Fc -P -f foo.pgdump foo

I also think it's better that this feature is enabled by default and
does not force users to specify any options, but that means changing the
default behavior, and it could break programs expecting no output on
stdout.

I will implement this feature if this proposal is accepted by hackers.
(Maybe I will not use ncurses for implementing this feature, because
ncurses cannot be used with the standard printf family of functions.)

Any comments are welcome.

Best Regards,
--
Taiki Kondo
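As a rough illustration of the display described above (not code from the
actual proposal), redrawing the line in place once per second needs
nothing more than a carriage return, which is one reason ncurses can be
avoided; the function name and layout here are assumptions:

    #include <stdio.h>

    /* Redraw a progress line in place, once per tick. */
    static void
    draw_progress(double fraction, int eta_min, int eta_sec)
    {
        const int   width = 40;
        int         filled = (int) (fraction * width);
        int         i;

        fprintf(stderr, "\r");
        for (i = 0; i < width; i++)
            fputc(i < filled ? '=' : ' ', stderr);
        fprintf(stderr, " (%3d%%) %02d:%02d",
                (int) (fraction * 100), eta_min, eta_sec);
        fflush(stderr);
    }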
Re: [HACKERS] 9.5 release notes
Hi,

> +     <listitem>
> +      <para>
> +       Add <type>JSONB</> functions <function>jsonb_set()</> and
> +       <function>jsonb_pretty</> (Dmitry Dolgov, Andrew Dunstan)
> +      </para>
> +     </listitem>

I believe I should be 3rd author for this one as I rewrote large parts of
this functionality as part of the review.

--
Petr Jelinek   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why does replication need the old history file?
On Fri, Jun 12, 2015 at 5:18 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Fri, Jun 12, 2015 at 4:56 AM, Josh Berkus <j...@agliodbs.com> wrote:
>> Hackers,
>>
>> Sequence of events:
>>
>> 1. PITR backup of server on timeline 2.
>> 2. Restored the backup to a new server, new-master.
>> 3. Restored the backup to another new server, new-replica.
>> 4. Started and promoted new-master (now on Timeline 3).
>> 5. Started new-replica, connecting over streaming to new-master.
>> 6. Get error message:
>>
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,1,,2015-06-11
>> 12:24:14 PDT,,0,LOG,00000,"fetching timeline history file for timeline
>> 2 from primary server",,,,,,,,,""
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,2,,2015-06-11
>> 12:24:14 PDT,,0,FATAL,XX000,"could not receive timeline history file
>> from the primary server: ERROR:  could not open file
>> ""pg_xlog/00000002.history"": No such file or directory",,,,,,,,,""
>>
>> Questions:
>>
>> A. Why does the replica need 00000002.history? Shouldn't it only need
>> 00000003.history?
>
> From where is the base backup taken in case of the node started at 5?
>
> The related source code comment says:
>
>     /*
>      * Get any missing history files. We do this always, even when we're
>      * not interested in that timeline, so that if we're promoted to
>      * become the master later on, we don't select the same timeline
>      * that was already used in the current master. This isn't
>      * bullet-proof - you'll need some external software to manage your
>      * cluster if you need to ensure that a unique timeline id is chosen
>      * in every case, but let's avoid the confusion of timeline id
>      * collisions where we can.
>      */
>     WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
>
>> B. Did something change in this regard in 9.3.6, 9.3.7 or 9.3.8? It was
>> working in our previous setup, on 9.3.5, although that could have just
>> been that the history file hadn't been removed from the backups yet.
>
> At quick glance, I can see nothing in xlog.c between those releases.

Yep, I could reproduce the trouble even in 9.3.5 on my laptop.

Regards,

--
Fujii Masao
Re: [HACKERS] On columnar storage
Merlin Moncure wrote:
> On Thu, Jun 11, 2015 at 6:03 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>> We hope to have a chance to discuss this during the upcoming developer
>> unconference in Ottawa. Here are some preliminary ideas to shed some
>> light on what we're trying to do.
>
> Quick thought. We already support out of line columnar storage in a
> fashion with 'toast'. Obviously that's a long way from where you want to
> go, but have you ruled out extending that system?

TOAST uses pointers in the heap row. A toasted column is still present in
the heap -- there's no way to get across the 1600-column limitation if we
rely on anything stored in the heap. Hence the design of the feature at
hand is that the columnar storage somehow points to the heap, rather than
the other way around.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Hash index creation warning
On 18 October 2014 at 15:36, Bruce Momjian <br...@momjian.us> wrote:
> On Fri, Oct 17, 2014 at 02:36:55PM -0400, Bruce Momjian wrote:
>> On Fri, Oct 17, 2014 at 12:56:52PM -0400, Tom Lane wrote:
>>> David G Johnston <david.g.johns...@gmail.com> writes:
>>>> The question is whether we explain the implications of not being
>>>> WAL-logged in an error message or simply state the fact and let the
>>>> documentation explain the hazards - basically just output:
>>>> "hash indexes are not WAL-logged and their use is discouraged"
>>>
>>> +1. The warning message is not the place to be trying to explain all
>>> the details.
>>
>> OK, updated patch attached.
>
> Patch applied.

I only just noticed this item when I read the release notes. Should we
bother warning when used on an unlogged table?

--
Thom
Re: [HACKERS] Further issues with jsonb semantics, documentation
On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan <p...@heroku.com> wrote:
> BTW, there is a bug here -- strtol() needs additional defenses [1]
> (before casting to int):
>
> postgres=# select jsonb_set('[1, 2, 3, 4, 5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
>                             '{9223372036854775806}'::text[],
>                             '"Input unsanitized"', false);
>                                     jsonb_set
> -----------------------------------------------------------------------------------
>  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input unsanitized", 18]
> (1 row)
>
> [1] https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.

--
Peter Geoghegan

From 2f2042d93d00f85e52612bd7d7499c3238579d4d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghega...@gmail.com>
Date: Fri, 12 Jun 2015 14:55:32 -0700
Subject: [PATCH 1/2] Fix path infrastructure bug affecting jsonb_set()

jsonb_set() and other clients of the setPathArray() utility function
could get spurious results when an array integer subscript is provided
that is not within the range of int.

To fix, ensure that the value returned by strtol() within setPathArray()
is within the range of int; when it isn't, assume an invalid input in
line with existing, similar cases.

The path-orientated operators that appeared in PostgreSQL 9.3 and 9.4 do
not call setPathArray(), and already independently take this precaution,
so no change there.
---
 src/backend/utils/adt/jsonfuncs.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..13d5b7a 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3814,11 +3814,14 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 	if (level < path_len && !path_nulls[level])
 	{
 		char	   *c = VARDATA_ANY(path_elems[level]);
+		long		lindex;
 
 		errno = 0;
-		idx = (int) strtol(c, &badp, 10);
-		if (errno != 0 || badp == c)
+		lindex = strtol(c, &badp, 10);
+		if (errno != 0 || badp == c || lindex > INT_MAX || lindex < INT_MIN)
 			idx = nelems;
+		else
+			idx = lindex;
 	}
 	else
 		idx = nelems;
-- 
1.9.1
Re: [HACKERS] Further issues with jsonb semantics, documentation
On Fri, Jun 12, 2015 at 12:26 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> Here's some code for the count piece of that.

Thanks. I'll look into integrating this with what I have.

BTW, on reflection I'm not so sure about my decision to not touch the
logic within jsonb_delete_idx() (commit b81c7b409). I probably should
have changed it in line with the attached patch as part of that commit.
What do you think?

--
Peter Geoghegan

From 8232f360a0696eb9279c29dfa7464edde726c5ae Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghega...@gmail.com>
Date: Fri, 5 Jun 2015 13:55:48 -0700
Subject: [PATCH 1/2] Remove dead code for jsonb subscript deletion
---
 src/backend/utils/adt/jsonfuncs.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..3fb8327 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3411,10 +3411,8 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 	it = JsonbIteratorInit(&in->root);
 
 	r = JsonbIteratorNext(&it, &v, false);
-	if (r == WJB_BEGIN_ARRAY)
-		n = v.val.array.nElems;
-	else
-		n = v.val.object.nPairs;
+	Assert(r == WJB_BEGIN_ARRAY);
+	n = v.val.array.nElems;
 
 	if (idx < 0)
 	{
@@ -3431,14 +3429,10 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 
 	while ((r = JsonbIteratorNext(&it, &v, true)) != 0)
 	{
-		if (r == WJB_ELEM || r == WJB_KEY)
+		if (r == WJB_ELEM)
 		{
 			if (i++ == idx)
-			{
-				if (r == WJB_KEY)
-					JsonbIteratorNext(&it, &v, true);	/* skip value */
 				continue;
-			}
 		}
 
 		res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL);
-- 
1.9.1
Re: [HACKERS] On columnar storage
On Thu, Jun 11, 2015 at 7:03 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> We hope to have a chance to discuss this during the upcoming developer
> unconference in Ottawa. Here are some preliminary ideas to shed some
> light on what we're trying to do.
>
> I've been trying to figure out a plan to enable native column stores
> (CS or "colstore") for Postgres. Motivations:
>
> * avoid the 32 TB limit for tables
> * avoid the 1600 column limit for tables
> * increased performance

Are you looking to avoid all hardware-based limits, or would using a
64-bit row pointer be possible? That would give you 2^64 (about 1.8E19)
unique rows over whatever granularity/uniqueness you use (per table, per
database, etc.)

--
Mike Nolan
Re: [HACKERS] The Future of Aggregation
David Rowley <david.row...@2ndquadrant.com> wrote:
>> I am a little curious what sort of machine you're running on, because
>> my i7 is much slower. I ran a few other tests with your table for
>> perspective.
>
> Assert enabled build?

Mystery solved. Too often I forget to reconfigure with optimization and
without cassert for quick tests like that. Hopefully the results are not
skewed *too* badly by that in this case.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Entities created in one query not available in another in extended protocol
The JDBC driver tries to handle this by estimating how much data has been
buffered. It mainly comes up when executing batch INSERTs, as a large
number of statements may be sent to the backend prior to reading back any
results. There's a nice write-up of the potential deadlock and the
driver's logic to avoid it here:

https://github.com/pgjdbc/pgjdbc/blob/7c0655b3683efa38cbe0d029385d8889f6392f98/org/postgresql/core/v3/QueryExecutorImpl.java#L300

Regards,
--
Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
Re: [HACKERS] Time to fully remove heap_formtuple() and friends?
Peter Geoghegan <p...@heroku.com> writes:
> Attached patch actually removes heap_formtuple() and friends. Does this
> seem worthwhile?

Seems reasonable, but at this point I would say this is 9.6 material;
third-party-module authors have enough to do with the API breaks we've
already created for 9.5. Please enter this in commitfest-next.

			regards, tom lane
Re: [HACKERS] Time to fully remove heap_formtuple() and friends?
On Fri, Jun 12, 2015 at 8:47 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@heroku.com> writes:
>> Attached patch actually removes heap_formtuple() and friends. Does this
>> seem worthwhile?
>
> Seems reasonable, but at this point I would say this is 9.6 material;
> third-party-module authors have enough to do with the API breaks we've
> already created for 9.5. Please enter this in commitfest-next.

Fair enough. I have done so.

--
Peter Geoghegan
Re: [HACKERS] Collection of memory leaks for ECPG driver
On Fri, Jun 12, 2015 at 10:01 PM, Michael Meskes <mes...@postgresql.org> wrote:
> On Mon, Jun 08, 2015 at 10:50:25PM +0900, Michael Paquier wrote:
> Right, that's because they were developed before free received the
> safeguard, or the programmer simply didn't know at that point in time.
> Unless I'm mistaken we have other code that only calls free() without an
> additional safeguard, so why should ecpglib be special? If you don't
> like it using both approaches, feel free to remove the check in the
> other case. :)

Well, I can send patches...

> More seriously, though, does anyone know of any platform where
> free(NULL) is *not* a noop?

I recall reading that some past versions of SunOS crashed, but it is
rather old...

--
Michael
Re: [HACKERS] Collection of memory leaks for ECPG driver
Michael Paquier <michael.paqu...@gmail.com> writes:
> On Fri, Jun 12, 2015 at 10:01 PM, Michael Meskes <mes...@postgresql.org> wrote:
>> More seriously, though, does anyone know of any platform where
>> free(NULL) is *not* a noop?
>
> I recall reading that some past versions of SunOS crashed, but it is
> rather old...

Yeah, SunOS 4.x had issues here, but it's long dead. More to the point,
both C89 and Single Unix Spec v2 clearly say that free(NULL) is a no-op;
and it's been many years since we agreed that we had no interest in
supporting platforms that didn't meet at least those minimum standards.
So there is no need to worry about any code of ours that does free(NULL).

But having said that, I would not be in a hurry to remove any existing
if-guards for the case. For one thing, it makes the code look more
similar to backend code that uses palloc/pfree, where we do *not* allow
pfree(NULL). There's also the point that changing longstanding code
creates back-patching hazards, so unless there's a clear gain it's best
not to.

			regards, tom lane
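For illustration, the two idioms under discussion, in a trivial
standalone sketch:

    #include <stdlib.h>

    void
    cleanup(char *ptr)
    {
        /*
         * Defensive style found in some existing code paths.  On any
         * platform conforming to C89 or SUSv2, where free(NULL) is
         * defined to be a no-op, the guard is redundant and a plain
         * free(ptr) would be equivalent.
         */
        if (ptr != NULL)
            free(ptr);
    }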
Re: [HACKERS] Incompatible trig error handling
On Wed, Apr 29, 2015 at 05:11:48PM -0700, Tom Lane wrote:
> John Gorman <johngorm...@gmail.com> writes:
>> Two of the trigonometry functions have differing error condition
>> behavior between Linux and OSX. The Linux behavior follows the standard
>> set by the other trig functions.
>
> We have never considered it part of Postgres' charter to try to hide
> platform-specific variations in floating-point behavior. If we did, we'd
> spend all our time doing that rather than more productive stuff. In
> particular, it appears to me that both of these behaviors are allowed
> per the POSIX standard, which makes it very questionable why we should
> insist that one is correct and the other is not.
>
> In addition, the proposed patch turns *all* cases that return NaN into
> errors, which is wrong at least for the case where the input is NaN.

OS X is a MATH_ERREXCEPT, !MATH_ERRNO platform. PostgreSQL wrongly
assumes that all platforms are MATH_ERRNO platforms. The correct fix is
to use fetestexcept() on !MATH_ERRNO platforms.
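A minimal standalone sketch of detecting a domain error through the C99
<fenv.h> interface rather than errno (illustrative only, not a proposed
patch):

    #include <fenv.h>
    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        double      result;

        feclearexcept(FE_ALL_EXCEPT);
        result = asin(2.0);     /* domain error: argument outside [-1, 1] */

        if ((math_errhandling & MATH_ERREXCEPT) && fetestexcept(FE_INVALID))
            printf("domain error raised FE_INVALID (result = %g)\n",
                   result);

        return 0;
    }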
Re: [HACKERS] 9.5 release notes
On Thu, Jun 11, 2015 at 01:31:01PM -0400, Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> On Thu, Jun 11, 2015 at 11:32 AM, Bruce Momjian <br...@momjian.us> wrote:
>>> Improve hash creation and lookup performance (Tomas Vondra, Teodor
>>> Sigaev, Tom Lane, Robert Haas)
>>
>> I suggest having two separate items. One of those is about the Hash
>> executor node and the other is about our dynahash stuff, so they're
>> completely different code bases.
>
> As far as 4a14f13a0 goes, I would think that ought to be mentioned under
> Source Code since it's a change in a rather widely used API. I doubt
> that the performance aspect of it is really all that exciting to end
> users, but third-party modules calling the dynahash code would care. The
> hash join changes are a completely different thing.

Applied patch attached.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

commit 8bf51ad0cc26e80cbd082c111f45428db7a2f73b
Author: Bruce Momjian <br...@momjian.us>
Date:   Fri Jun 12 22:16:08 2015 -0400

    release notes:  split apart hash items

    Report by Tom Lane, Robert Haas

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 17301a4..283d061
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
***************
*** 221,228 ****
      <listitem>
       <para>
!       Improve hash creation and lookup performance (Tomas Vondra,
!       Teodor Sigaev, Tom Lane, Robert Haas)
       </para>
      </listitem>
--- 221,227 ----
      <listitem>
       <para>
!       Improve in-memory hash performance (Tomas Vondra, Robert Haas)
       </para>
      </listitem>
***************
*** 1795,1800 ****
--- 1794,1805 ----
       </para>
      </listitem>

+     <listitem>
+      <para>
+       Improve dynahash capabilities (Teodor Sigaev, Tom Lane)
+      </para>
+     </listitem>
+
      <listitem>
       <para>
        Improve parallel execution infrastructure (Robert Haas, Amit
Re: [HACKERS] 9.5 release notes
On Thu, Jun 11, 2015 at 09:02:35PM +0200, Magnus Hagander wrote:
> On Thu, Jun 11, 2015 at 8:56 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> On 06/10/2015 09:50 PM, Amit Kapila wrote:
>>> Also shall we mention about below in Migrations to 9.5 section:
>>> pg_basebackup will not work in tar mode against 9.4 and older servers,
>>> as we have introduced a new protocol option in that mode.
>>
>> AFAIK, pg_basebackup has never worked across versions. So there's no
>> reason for this note.
>
> It has. The resulting backup has not been usable cross version, but
> pg_basebackup itself has. Though not always, and I'm not sure we've ever
> claimed it was supported, but it has worked.

So we should still mention it in the release notes?

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
On Thu, Jun 11, 2015 at 01:27:23PM -0400, Tom Lane wrote:
> Bruce Momjian <br...@momjian.us> writes:
>> On Thu, Jun 11, 2015 at 05:16:07PM +1200, David Rowley wrote:
>>> Would you also be able to mention something about f15821e and d222585?
>>
>> I am going to defer to Tom on that. I have added optimizer changes in
>> the past but he didn't feel it made sense unless there was some
>> user-visible change.
>
> I'd be inclined to document both of those. We mentioned outer join
> removal when it was first added, in 9.0, so making a significant
> improvement in it seems worthy of mention also. Both of these things are
> user visible to the extent that they affect EXPLAIN output. I'm not sure
> whether we need to document the semantic hazard that the second commit
> message worries about.

OK, I have added two items; applied patch attached.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 98f2107..17301a4
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
***************
*** 242,247 ****
--- 242,262 ----
      <listitem>
       <para>
+       Allow the optimizer to remove unnecessary references to left
+       outer join subqueries (David Rowley)
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       Allow pushdown of query restrictions into <link
+       linkend="functions-window">window functions</>, where appropriate
+       (David Rowley)
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
        Speed up <acronym>CRC</> (cyclic redundancy check) computations
        (Abhijit Menon-Sen, Heikki Linnakangas)
       </para>
Re: [HACKERS] could not truncate directory pg_subtrans: apparent wraparound
Hi

Since the multixact equivalent of this problem [1] fell through the
cracks on the multixact mega-thread, here is an updated patch that
addresses this problem for both pg_subtrans and pg_multixact/offsets
using the same approach: always step back one multixact/xid (rather than
doing so only if oldest == next, which seemed like an unnecessary
complication, and a bit futile since the result of such a test is only an
instantaneous snapshot). I've added this to the commitfest [2].

I am also attaching a new set of repro scripts, including a pair to test
the case where next multixact/xid == first valid ID (the scripts with
'wraparound' in the name, which use dirty pg_resetxlog tricks to get into
that situation). In my previous patch I naively subtracted one, which
didn't work for those (even rarer!) cases. The new patch steps over the
special ID values.

This is a low priority bug: it just produces low probability bogus (but
perhaps alarming) LOG messages and skips truncation during checkpoints on
low activity systems. There have been occasional reports of these
pg_subtrans messages going back as far as 2007 (and Alvaro was barking up
the correct tree [3] back in 2010), so I figured it was worth following
up.

I also took a look at the pg_clog and pg_commit_ts truncation functions.
You could argue that they have the same problem in theory (they pass a
page number derived from the oldest xid to SimpleLruTruncate, and maybe
there is a way for that to be an xid that hasn't been issued yet), but in
practice I don't think it's a reachable condition. They use the frozen
xid that is updated by vacuuming, but vacuum itself advances the next xid
counter in the process. Is there a path through the vacuum code that ever
exposes frozen xid == next xid? In contrast, for pg_subtrans we use
GetOldestXmin(), which is equal to the next xid if there are no running
transactions, and for pg_multixact we use the oldest multixact, which can
be equal to the next multixact ID after a wraparound vacuum because
vacuum itself doesn't always consume multixacts.

[1] http://www.postgresql.org/message-id/CAEepm=0DqAtnM=23oq44bbnwvn3g6+dxx+s5g4jrbp-vy8g...@mail.gmail.com
[2] https://commitfest.postgresql.org/5/265/
[3] http://www.postgresql.org/message-id/1274373980-sup-3...@alvh.no-ip.org

--
Thomas Munro
http://www.enterprisedb.com

Attachments:
  repro-bogus-multixact-error.sh (Bourne shell script)
  repro-bogus-subtrans-error.sh (Bourne shell script)
  repro-bogus-multixact-error-wraparound.sh (Bourne shell script)
  repro-bogus-subtrans-error-wraparound.sh (Bourne shell script)
  fix-bogus-truncation-errors.patch (binary data)
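The "step back one, skipping the special IDs" logic is the same shape as
the backend's existing TransactionIdRetreat() macro in access/transam.h.
A standalone sketch of the idea, with types and names adapted for
illustration (the multixact case is analogous, using FirstMultiXactId as
the boundary):

    #include <stdint.h>

    typedef uint32_t TransactionId;     /* the backend uses uint32 */

    #define FirstNormalTransactionId ((TransactionId) 3)

    /*
     * Step back one XID, skipping the reserved values 0..2 (Invalid,
     * Bootstrap, Frozen).  Unsigned wraparound past zero is intentional.
     */
    static TransactionId
    step_back_one(TransactionId xid)
    {
        do
        {
            xid--;
        } while (xid < FirstNormalTransactionId);

        return xid;
    }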
Re: [HACKERS] 9.5 release notes
On Thu, Jun 11, 2015 at 05:17:54PM -0400, Robert Haas wrote:
> On Thu, Jun 11, 2015 at 4:23 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> Secondly, Robert didn't credit himself as an author in his commit
>> message for the abbreviated keys infrastructure + text opclass support
>> *at all*. However, I think that Robert should be listed as a secondary
>> author of the abbreviated keys infrastructure, and that he would agree
>> that I am clearly the primary author. Andrew Gierth did work on the
>> datum case for sortsupport + abbreviation, so I agree he should be
>> listed as a secondary author of the infrastructure too, after Robert.
>
> I'd probably say Peter, Andrew, me.

Order changed, thanks. This entry was a consolidation of several commits,
so the proper order wasn't clear to me.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
On Fri, Jun 12, 2015 at 02:47:22PM +0200, Petr Jelinek wrote:
> Hi,
>
>> +     <listitem>
>> +      <para>
>> +       Add <type>JSONB</> functions <function>jsonb_set()</> and
>> +       <function>jsonb_pretty</> (Dmitry Dolgov, Andrew Dunstan)
>> +      </para>
>> +     </listitem>
>
> I believe I should be 3rd author for this one as I rewrote large parts
> of this functionality as part of the review.

Added. Thanks.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] [GENERAL] 9.4.1 - 9.4.2 problem: could not access status of transaction 1
Just wanted to report that I rolled back my VM to where it was with 9.4.2
installed and it wouldn't start. I installed 9.4.4 and now it starts up
just fine:

2015-06-12 16:05:58 PDT [6453]: [1-1] LOG:  database system was shut down at 2015-05-27 13:12:55 PDT
2015-06-12 16:05:58 PDT [6453]: [2-1] LOG:  MultiXact member wraparound protections are disabled because oldest checkpointed MultiXact 1 does not exist on disk
2015-06-12 16:05:58 PDT [6457]: [1-1] LOG:  autovacuum launcher started
2015-06-12 16:05:58 PDT [6452]: [1-1] LOG:  database system is ready to accept connections
 done
server started

And this is showing up in my serverlog periodically as the emergency
autovacuums are running:

2015-06-12 16:13:44 PDT [6454]: [1-1] LOG:  MultiXact member wraparound protections are disabled because oldest checkpointed MultiXact 1 does not exist on disk

**Thank you Robert and all involved for the resolution to this.**

> With the fixes introduced in this release, such a situation will result
> in immediate emergency autovacuuming until a correct oldestMultiXid
> value can be determined

Okay, I notice these vacuums are of the "to prevent wraparound" type
(like VACUUM FREEZE), which do hold locks preventing ALTER TABLEs and
such. Good to know; we'll plan our software updates accordingly.

Is there any risk until these autovacuums finish?
Re: [HACKERS] Further issues with jsonb semantics, documentation
On Fri, Jun 12, 2015 at 4:31 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> OK, pushed, although you'd have to be trying really hard to break this.
> Still, it's reasonable to defend against.

I was trying really hard. :-)

Thanks

--
Peter Geoghegan
[HACKERS] Time to fully remove heap_formtuple() and friends?
Commit 902d1cb3, made in 2008, established that the functions
heap_formtuple(), heap_modifytuple(), and heap_deformtuple() were
deprecated. The commit also actually removed those routines, replacing
them with simple wrappers around their real replacements, which are
spelled slightly differently and have a slightly different interface
(their real replacements are heap_deform_tuple() and friends). The
wrappers fulfilled the old, legacy interface, and were added for
compatibility with third party code -- the old char 'n'/' ' convention
for indicating nulls was bolted on top of calls to the new variants.
Bolting that on to the new variants involves a new palloc() + pfree(),
which isn't cheap.

Attached patch actually removes heap_formtuple() and friends. Does this
seem worthwhile? Does this seem like something that there should be a
compatibility release note item for, if we proceed? I have not added one
yet.

I think that enough time has passed that these wrappers are clearly 100%
deadwood. If any external extensions are still using heap_formtuple(),
they ought to be updated to work with Postgres 9.5 by using the new
variants -- a straight switch-over of callers in the style of 902d1cb3
should now work against all supported versions of PostgreSQL, and without
macro hacks. Affected external code building against the removed legacy
interface will reliably fail to build, forcing the third party code to
conform in a non-surprising way. Removing the code seems very low risk.

--
Peter Geoghegan

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 09aea79..045e0a7 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -777,39 +777,6 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 }
 
 /*
- * heap_formtuple
- *
- *		construct a tuple from the given values[] and nulls[] arrays
- *
- *		Null attributes are indicated by a 'n' in the appropriate byte
- *		of nulls[].  Non-null attributes are indicated by a ' ' (space).
- *
- * OLD API with char 'n'/' ' convention for indicating nulls.
- * This is deprecated and should not be used in new code, but we keep it
- * around for use by old add-on modules.
- */
-HeapTuple
-heap_formtuple(TupleDesc tupleDescriptor,
-			   Datum *values,
-			   char *nulls)
-{
-	HeapTuple	tuple;			/* return tuple */
-	int			numberOfAttributes = tupleDescriptor->natts;
-	bool	   *boolNulls = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	int			i;
-
-	for (i = 0; i < numberOfAttributes; i++)
-		boolNulls[i] = (nulls[i] == 'n');
-
-	tuple = heap_form_tuple(tupleDescriptor, values, boolNulls);
-
-	pfree(boolNulls);
-
-	return tuple;
-}
-
-
-/*
  * heap_modify_tuple
  *		form a new tuple from an old tuple and a set of replacement values.
  *
@@ -880,44 +847,6 @@ heap_modify_tuple(HeapTuple tuple,
 }
 
 /*
- * heap_modifytuple
- *
- *		forms a new tuple from an old tuple and a set of replacement values.
- *		returns a new palloc'ed tuple.
- *
- * OLD API with char 'n'/' ' convention for indicating nulls, and
- * char 'r'/' ' convention for indicating whether to replace columns.
- * This is deprecated and should not be used in new code, but we keep it
- * around for use by old add-on modules.
- */
-HeapTuple
-heap_modifytuple(HeapTuple tuple,
-				 TupleDesc tupleDesc,
-				 Datum *replValues,
-				 char *replNulls,
-				 char *replActions)
-{
-	HeapTuple	result;
-	int			numberOfAttributes = tupleDesc->natts;
-	bool	   *boolNulls = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	bool	   *boolActions = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	int			attnum;
-
-	for (attnum = 0; attnum < numberOfAttributes; attnum++)
-	{
-		boolNulls[attnum] = (replNulls[attnum] == 'n');
-		boolActions[attnum] = (replActions[attnum] == 'r');
-	}
-
-	result = heap_modify_tuple(tuple, tupleDesc, replValues, boolNulls, boolActions);
-
-	pfree(boolNulls);
-	pfree(boolActions);
-
-	return result;
-}
-
-/*
  * heap_deform_tuple
  *		Given a tuple, extract data into values/isnull arrays; this is
  *		the inverse of heap_form_tuple.
@@ -1025,46 +954,6 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 }
 
 /*
- * heap_deformtuple
- *
- *		Given a tuple, extract data into values/nulls arrays; this is
- *		the inverse of heap_formtuple.
- *
- *		Storage for the values/nulls arrays is provided by the caller;
- *		it should be sized according to tupleDesc->natts not
- *		HeapTupleHeaderGetNatts(tuple->t_data).
- *
- *		Note that for pass-by-reference datatypes, the pointer placed
- *		in the Datum will point into the given tuple.
- *
- *		When all or most of a tuple's fields need to be extracted,
- *		this routine will be significantly quicker than a loop around
- *		heap_getattr; the loop will become O(N^2) as soon as any
- *		noncacheable attribute offsets are involved.
- *
- * OLD API with char 'n'/' ' convention for indicating nulls.
- * This is deprecated and should not be used in new code, but we keep it
- *
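For third-party authors, the mechanical conversion is small. A hedged
before/after sketch, assuming a tuple descriptor "tupdesc" and an
already-filled "values" array:

    /* Before (removed API): nulls flagged with the char 'n' */
    char       *nulls = (char *) palloc(tupdesc->natts);

    memset(nulls, ' ', tupdesc->natts);
    nulls[2] = 'n';             /* third column is null */
    tuple = heap_formtuple(tupdesc, values, nulls);

    /* After: nulls flagged with a bool array */
    bool       *isnull = (bool *) palloc0(tupdesc->natts * sizeof(bool));

    isnull[2] = true;           /* third column is null */
    tuple = heap_form_tuple(tupdesc, values, isnull);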
Re: [HACKERS] 9.5 release notes
On Sat, Jun 13, 2015 at 7:47 AM, Bruce Momjian <br...@momjian.us> wrote:
> On Thu, Jun 11, 2015 at 09:02:35PM +0200, Magnus Hagander wrote:
>> On Thu, Jun 11, 2015 at 8:56 PM, Josh Berkus <j...@agliodbs.com> wrote:
>>> On 06/10/2015 09:50 PM, Amit Kapila wrote:
>>>> Also shall we mention about below in Migrations to 9.5 section:
>>>> pg_basebackup will not work in tar mode against 9.4 and older
>>>> servers, as we have introduced a new protocol option in that mode.
>>>
>>> AFAIK, pg_basebackup has never worked across versions. So there's no
>>> reason for this note.
>>
>> It has. The resulting backup has not been usable cross version, but
>> pg_basebackup itself has. Though not always, and I'm not sure we've
>> ever claimed it was supported, but it has worked.
>
> So we should still mention it in the release notes?

If it has never led to a usable backup cross-version, then I think there
is no pressing need to mention it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] 9.5 release notes
On Fri, Jun 12, 2015 at 12:49:11PM +0900, Fujii Masao wrote:
> On Thu, Jun 11, 2015 at 1:15 PM, Bruce Momjian <br...@momjian.us> wrote:
>> I have committed the first draft of the 9.5 release notes. You can view
>> the output here:
>>
>>     http://momjian.us/pgsql_docs/release-9-5.html
>>
>> and it will eventually appear here:
>>
>>     http://www.postgresql.org/docs/devel/static/release.html
>
> I found some minor issues.
>
> e.g. <literal>IDENTIFY_COMMAND</>, are not logged, even when
> <varname>log_statements</> is set to <literal>all</>.
>
> Typos.
> s/IDENTIFY_COMMAND/IDENTIFY_SYSTEM
> s/log_statements/log_statement

Updated.

> RETURN WHERE
> /para
>
> Looks like garbage.

It is actually a question; I have reworded it.

> Add <literal>VERBOSE</> option to <command>REINDEX</> (Fujii Masao)
>
> Could you change the author name to Sawada Masahiko? He is the author
> of the feature.

Thanks, done.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] On columnar storage
On Thu, Jun 11, 2015 at 6:03 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> We hope to have a chance to discuss this during the upcoming developer
> unconference in Ottawa. Here are some preliminary ideas to shed some
> light on what we're trying to do.

Quick thought. We already support out of line columnar storage in a
fashion with 'toast'. Obviously that's a long way from where you want to
go, but have you ruled out extending that system?

merlin
Re: [HACKERS] git push hook to check for outdated timestamps
On 06/12/2015 09:31 AM, Robert Haas wrote:
> Could we update our git hook to refuse a push of a new commit whose
> timestamp is more than, say, 24 hours in the past? Our commit history
> has some timestamps in it now that are over a month off, and it's
> really easy to do, because when you rebase a commit, it keeps the old
> timestamp. If you then merge or cherry-pick that into an official
> branch rather than patch + commit, you end up with this problem unless
> you are careful to fix it by hand. It would be nice to prevent further
> mistakes of this sort, as they create confusion.

+1. I think 24 hours is probably fairly generous, but in principle this
seems right - the timestamp would ideally be pretty close to the time it
hits the public tree.

cheers

andrew
Re: [HACKERS] Further issues with jsonb semantics, documentation
On 06/10/2015 04:02 PM, Peter Geoghegan wrote:
> On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <and...@dunslane.net> wrote:
>> Sorry for the delay on this. I've been mostly off the grid, having an
>> all too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood
>> what you were suggesting.
>
> Thank you for working with me to address this. I've been busy with
> other things the past few days too.
>
>> Please submit a patch to adjust the treatment of negative integers in
>> the old functions to be consistent with their treatment in the new
>> functions. i.e. in the range [-n,-1] they should refer to the
>> corresponding element counting from the right.
>
> I've almost finished that patch. I'm currently blocked on deciding what
> to do about the old path-orientated operators (#> and #>> for json and
> jsonb types). It's rather painful to support pushing down negative
> subscripting there -- maybe we should just not do so for those
> variants, especially given that they're already notationally
> inconsistent with the other operators that I'll be updating. What do
> you think? Maybe I'll come up with a simpler way of making that work by
> taking a fresh look at it, but haven't done that yet.
>
> My current, draft approach to making subscripting work with the json
> variants (not the jsonb variants) is to use a second get_worker() call
> in the event of a negative subscript, while making the first such call
> (the existing get_worker() call) establish the number of top-level
> array elements. That isn't beautiful, and involves some amount of
> redundant work, but it's probably better than messing with get_worker()
> in a more invasive way. Besides, this second get_worker() call only
> needs to happen in the event of a negative subscript, and I'm only
> really updating this (that is, updating routines like
> json_array_element()) to preserve consistency with jsonb. What do you
> think of that idea?

Just took a quick look.

My impression is that the jsonb case should be fairly easy. If the index
is negative, add JB_ROOT_COUNT(container) to it and use that as the
argument to getIthJsonbValueFromContainer().

I agree that the json case looks a bit nasty. Maybe a better approach
would be to provide a function that, given a JsonLexContext, returns the
number of array elements of the current array. In get_array_start we
could call that if the relevant path element is negative and adjust it
accordingly.

cheers

andrew
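A sketch of that jsonb-side suggestion, as it might look inside a
function like jsonb_array_element(); variable names are assumed from the
surrounding code, not quoted from a patch:

    /* "jb" is the input Jsonb; "idx" the int32 subscript */
    if (idx < 0)
        idx += JB_ROOT_COUNT(jb);       /* count from the right */

    if (idx < 0 || idx >= JB_ROOT_COUNT(jb))
        PG_RETURN_NULL();               /* still out of range */

    v = getIthJsonbValueFromContainer(&jb->root, idx);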
Re: [HACKERS] [Proposal] Progress bar for pg_dump/pg_restore
Hi,

On 2015-06-12 12:45:50 +0000, Taiki Kondo wrote:
> Design / API
>
> When pg_dump / pg_restore is running, a progress bar and the estimated
> time to finish are shown on screen, like the following:
>
>     =============================>           (50%) 15:50
>
> The bar ("=" in the above) and percentage value (50% in the above) show
> the percentage of progress, and the time (15:50 in the above) shows the
> estimated time to finish. (This percentage is the ratio for the whole
> processing.) Percentage and time are calculated and shown every 1
> second.
>
> In pg_dump, the information required for calculating the percentage and
> time comes from pg_class. In pg_restore, to calculate the same things,
> I want to record the total number of command lines in the pg_dump file,
> so I would like to add a new element to the Archive structure. (This
> means that the version number of the archive format is changed.)

The question is how to actually get useful estimates. As there's no
progress report for individual COPY and CREATE INDEX commands, you'll in
many cases have very irregular progress updates. In many, many cases
most of the time is spent on a very small subset of the total objects.

Greetings,

Andres Freund
Re: [HACKERS] pg_rewind failure by file deletion in source server
On Fri, Jun 12, 2015 at 4:29 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Jun 12, 2015 at 3:50 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Jun 12, 2015 at 3:17 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao masao.fu...@gmail.com wrote: Shouldn't pg_rewind ignore that failure of operation? If the file is not found in source server, the file doesn't need to be copied to destination server obviously. So ISTM that pg_rewind safely can skip copying that file. Thought? I think that you should fail. Let's imagine that the master to be rewound has removed a relation file before being stopped cleanly after its standby has been promoted that was here at the last checkpoint before forking, and that the standby still has the relation file after promotion. You should be able to copy it to be able to replay WAL on it. If the standby has removed a file in the file map after taking the file map, I guess that the best thing to do is fail because the file that should be here for the rewound node cannot be fetched. In this case, why do you think that the file should exist in the old master? Even if it doesn't exist, ISTM that the old master can safely replay the WAL records related to the file when it restarts. So what's the problem if the file doesn't exist in the old master? Well, some user may want to rewind the master down to the point where WAL forked, and then recover it immediately when a consistent point is reached just at restart instead of replugging it into the cluster. In this case I think that you need the relation file of the dropped relation to get a consistent state. That's still cheaper than recreating a node from a fresh base backup in some cases, particularly if the last base backup taken is far in the past for this cluster. So it's the case where a user wants to recover old master up to the point BEFORE the file in question is deleted in new master. At that point, since the file must exist, pg_rewind should fail if the file cannot be copied from new master. Is my understanding right? Yep. We are on the same line. As far as I read the code of pg_rewind, ISTM that your scenario never happens. Because pg_rewind sets the minimum recovery point to the latest WAL location in new master, i.e., AFTER the file is deleted. So old master cannot stop recovering before the file is deleted in new master. If the recovery stops at that point, it fails because the minimum recovery point is not reached yet. IOW, after pg_rewind runs, the old master has to replay the WAL records which were generated by the deletion of the file in the new master. So it's okay if the old master doesn't have the file after pg_rewind runs, I think. Ah, right. I withdraw, indeed what I thought can not happen: /* * Update control file of target. Make it ready to perform archive * recovery when restarting. * * minRecoveryPoint is set to the current WAL insert location in the * source server. Like in an online backup, it's important that we recover * all the WAL that was generated while we copied the files over. */ So a rewound node will replay WAL up to the current insert location of the source, and will fail at recovery if recovery target is older than this insert location.. You want to draft a patch? Should I? Please feel free to try that! 
:) I think that we should have a test case as well in pg_rewind/t/. Maybe. Regards, -- Fujii Masao
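For reference, the control-file update that the quoted comment sits above amounts to roughly the following (a simplified sketch of pg_rewind's logic, with declarations added and error handling elided; names are taken from the pg_rewind source of that era):

    /*
     * Sketch: make the rewound target perform archive recovery on restart.
     * minRecoveryPoint is the source's *current* WAL insert location, so the
     * rewound node cannot validly stop recovery before replaying everything
     * generated while files were copied -- including any file drops that
     * happened on the source in the meantime.
     */
    ControlFileData ControlFile_new;
    XLogRecPtr      endrec;
    TimeLineID      endtli;

    memcpy(&ControlFile_new, &ControlFile_source, sizeof(ControlFileData));

    /* current WAL insert location of the source server */
    endrec = libpqGetCurrentXlogInsertLocation();
    endtli = ControlFile_source.checkPointCopy.ThisTimeLineID;

    ControlFile_new.minRecoveryPoint = endrec;
    ControlFile_new.minRecoveryPointTLI = endtli;
    ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
    updateControlFile(&ControlFile_new);

This is why a recovery target older than the source's insert location fails: the control file itself forbids declaring consistency any earlier.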
Re: [HACKERS] git push hook to check for outdated timestamps
Andrew Dunstan and...@dunslane.net writes: On 06/12/2015 09:31 AM, Robert Haas wrote: Could we update our git hook to refuse a push of a new commit whose timestamp is more than, say, 24 hours in the past? Our commit history has some timestamps in it now that are over a month off, and it's really easy to do, because when you rebase a commit, it keeps the old timestamp. If you then merge or cherry-pick that into an official branch rather than patch + commit, you end up with this problem unless you are careful to fix it by hand. It would be nice to prevent further mistakes of this sort, as they create confusion. I think 24 hours is probably fairly generous, Yeah ... if we're going to try to enforce a linear-looking history, ISTM the requirement ought to be newer than the latest commit on the same branch. Perhaps that would be unduly hard to implement though. FWIW, our git_changelog script tries to avoid this problem by paying attention to CommitDate not Date. But I agree that it's confusing when those fields are far apart. regards, tom lane
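Purely as an illustration of the proposed check -- the real hook would more plausibly be a small shell or Perl script, and nothing here reflects the actual postgresql.org hook -- a pre-receive filter could compare each pushed tip's CommitDate (which is what libgit2's git_commit_time() returns) against a 24-hour window. Note this sketch checks only each ref's new tip, not every new commit:

    /* Hypothetical pre-receive hook: reject tips whose CommitDate is >24h old. */
    #include <stdio.h>
    #include <time.h>
    #include <git2.h>

    int
    main(void)
    {
        git_repository *repo;
        char        oldsha[64], newsha[64], refname[256];
        int         rc = 0;

        git_libgit2_init();
        if (git_repository_open(&repo, ".") != 0)
            return 1;

        /* pre-receive hooks read "<old-sha> <new-sha> <refname>" lines on stdin */
        while (scanf("%63s %63s %255s", oldsha, newsha, refname) == 3)
        {
            git_oid     oid;
            git_commit *commit;

            if (git_oid_fromstr(&oid, newsha) != 0 ||
                git_commit_lookup(&commit, repo, &oid) != 0)
                continue;       /* e.g. a ref deletion: new sha is all zeros */

            if (git_commit_time(commit) < time(NULL) - 24 * 60 * 60)
            {
                fprintf(stderr, "rejected: tip of %s has a CommitDate more than 24 hours in the past\n",
                        refname);
                rc = 1;
            }
            git_commit_free(commit);
        }
        git_repository_free(repo);
        git_libgit2_shutdown();
        return rc;
    }

Tom's stricter rule -- newer than the latest commit already on the branch -- would replace the time(NULL) comparison with a lookup of the old tip's CommitDate.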
Re: [HACKERS] The purpose of the core team
On Fri, Jun 12, 2015 at 1:21 AM, Simon Riggs si...@2ndquadrant.com wrote: Deciding "WHAT goes in the next release?" is what Committers do, by definition. It seems strange to have a different mailing list for "WHEN is the next release needed?", so those two things should be combined. Core team members have sometimes taken the position that this is already so; that releases should be discussed on pgsql-hackers and pgsql-security as appropriate to the driver. In theory, this may be fine, but in practice, it doesn't seem to be working very well right now. Packagers should be about "HOW do we make the next release?", which is separate from the above. Ultimately, How affects When, but "When is it needed?" is an earlier thought. +1. Completely agreed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Why does replication need the old history file?
Questions: A. Why does the replica need 0002.history? Shouldn't it only need 0003.history? From where is the base backup taken in case of the node started at 5? It is the same backup used to restore the master, restored to a point in time 5 minutes earlier just to make sure the replica isn't ahead of the master. The related source code comment says /* * Get any missing history files. We do this always, even when we're * not interested in that timeline, so that if we're promoted to * become the master later on, we don't select the same timeline that * was already used in the current master. This isn't bullet-proof - * you'll need some external software to manage your cluster if you * need to ensure that a unique timeline id is chosen in every case, * but let's avoid the confusion of timeline id collisions where we * can. */ WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI); So this seems to be something we're doing just in case, and it is preventing a useful way to spin up large master/replica clusters from a PITR backup. Might this be something we want to change, and simply raise an ERROR that we can't find the history file instead of a FATAL? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
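For reference, the function named in that comment is essentially a loop over every timeline between the start point and the primary's current timeline, which is exactly why 0002.history gets requested even though the replica only intends to follow timeline 3. Roughly (a simplified sketch; logging, WAL-segment bookkeeping, and error paths elided):

    static void
    WalRcvFetchTimeLineHistoryFiles(TimeLineID first, TimeLineID last)
    {
        TimeLineID  tli;

        for (tli = first; tli <= last; tli++)
        {
            /* there is no history file for timeline 1 */
            if (tli == 1)
                continue;

            if (!existsTimeLineHistory(tli))
            {
                char       *fname;
                char       *content;
                int         len;

                /* fetch the file over the replication connection ... */
                walrcv_readtimelinehistoryfile(tli, &fname, &content, &len);
                /* ... and write it out so a later promotion can see it */
                writeTimeLineHistoryFile(tli, content, len);
            }
        }
    }

Downgrading the missing-file case from FATAL to a logged error would mean accepting the timeline-collision risk the comment describes.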
Re: [HACKERS] On columnar storage
Amit Kapila wrote: On Fri, Jun 12, 2015 at 4:33 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: There are several parts to this: 1. the CSM API 2. Cataloguing column stores 3. Query processing: rewriter, optimizer, executor I think another important point is about the format of column stores: is it the page format used by index/heap, and how are they organised? Not really. That stuff is part of the column store implementation itself; we're not tackling that part just yet. Eventually there might be an implementation using ORC or other formats. That doesn't matter at this point -- we only need something that implements the specified API. One critical detail is what will be used to identify a heap row when talking to a CS implementation. There are two main possibilities: 1. use CTIDs 2. use some logical tuple identifier Using CTIDs is simpler. One disadvantage is that every UPDATE of a row needs to let the CS know about the new location of the tuple, so that the value is known to be associated with the new tuple location as well as the old. This needs to happen even if the value of the column itself is not changed. Isn't this somewhat similar to an index segment? Not sure what you mean by index segment. A column store is not an index -- it is the primary storage for the column in question. The heap does not have a copy of the data. Will the column store obey the snapshot model similarly to current heap tuples, and if so, will it derive the transaction information from the heap tuple? Yes, visibility will be tied to the heap tuple -- a value is accessed only when its corresponding heap row has already been determined to be visible. One interesting point that arises from this is about vacuum: when are we able to remove a value from the store? I have some not-completely-formed ideas about this. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
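The thread doesn't pin down what the CSM API looks like, but purely as a hypothetical sketch (every name below is invented for illustration, not taken from any patch), one could imagine a table of callbacks in the style of FdwRoutine, keyed on the heap row's CTID per option 1:

    /* Hypothetical column-store callback table; all names invented. */
    #include "postgres.h"
    #include "storage/itemptr.h"
    #include "utils/rel.h"

    typedef struct ColumnStoreRoutine
    {
        /* fetch the column value belonging to a (visible) heap row */
        Datum       (*csm_getvalue) (Relation store, ItemPointer heaptid,
                                     bool *isnull);

        /* store the value for a newly inserted heap row */
        void        (*csm_insert) (Relation store, ItemPointer heaptid,
                                   Datum value, bool isnull);

        /*
         * On UPDATE the heap row moves, so the store must learn the new CTID
         * even when the column's value is unchanged -- the disadvantage of
         * CTIDs noted above.
         */
        void        (*csm_relocate) (Relation store, ItemPointer oldtid,
                                     ItemPointer newtid);

        /* reclaim values for dead heap rows (the open vacuum question) */
        void        (*csm_vacuum) (Relation store);
    } ColumnStoreRoutine;

With a logical tuple identifier (option 2) the csm_relocate callback would disappear, at the cost of maintaining the identifier-to-CTID mapping somewhere.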
Re: [HACKERS] Reconsidering the behavior of ALTER COLUMN TYPE
Noah Misch n...@leadboat.com writes: On Thu, Jun 11, 2015 at 03:41:49PM -0500, Merlin Moncure wrote: On Thu, Jun 11, 2015 at 3:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: In any case, we oughta use two different error messages for the two cases, as per my comment in the above thread. That seems like a back-patchable bug fix, though of course any semantics change should only be in HEAD. I have a slight preference to keep it to tightening up the wording on both the hint and the error (for example, Perhaps you meant USING foo::type?) but leaving the behavior alone. +1. The HINT could certainly provide situation-specific help. Fair enough, I'll go fix that but leave the semantics alone. regards, tom lane
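For context, the message being discussed comes out of an ereport() call in tablecmds.c; a situation-specific hint along the lines Merlin suggests might look roughly like this (illustrative wording only, not necessarily the committed text):

    ereport(ERROR,
            (errcode(ERRCODE_DATATYPE_MISMATCH),
             errmsg("column \"%s\" cannot be cast automatically to type %s",
                    colName, format_type_be(targettype)),
             errhint("You might need to specify \"USING %s::%s\".",
                     quote_identifier(colName),
                     format_type_be(targettype))));

with a second, differently worded message for the other case -- for instance the analogous failure on the column's DEFAULT expression, where a USING hint would be misleading.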
Re: [HACKERS] reaper should restart archiver even on standby
On Thu, Jun 11, 2015 at 1:39 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: Agreed. The attached patch defines the macro to check whether the archiver is allowed to start up or not, and uses it everywhere except sigusr1_handler. I made sigusr1_handler use a different condition because it is the only place that tries to start the archiver in the PM_STARTUP postmaster state, and it looks a bit messy to add the check of that state into the centralized check condition. WFM, but do these macros in xlog.h need a one-line comment to state their purpose? Yes, I added the comments and just pushed the patch. Thanks! Regards, -- Fujii Masao
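The centralized check reads roughly like the following (a sketch of the shape of the committed macro, which may differ in detail; XLogArchivingActive()/XLogArchivingAlways() are the xlog.h macros mentioned above, and pmState is the postmaster's state variable):

    /*
     * Sketch: the archiver may run in normal operation with archive_mode=on,
     * and also during recovery -- i.e. on a standby -- with
     * archive_mode=always.
     */
    #define PgArchStartupAllowed() \
        ((XLogArchivingActive() && pmState == PM_RUN) || \
         (XLogArchivingAlways() && \
          (pmState == PM_RECOVERY || pmState == PM_HOT_STANDBY)))

sigusr1_handler additionally has to cope with PM_STARTUP, which is why it keeps its own condition.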
[HACKERS] git push hook to check for outdated timestamps
Could we update our git hook to refuse a push of a new commit whose timestamp is more than, say, 24 hours in the past? Our commit history has some timestamps in it now that are over a month off, and it's really easy to do, because when you rebase a commit, it keeps the old timestamp. If you then merge or cherry-pick that into an official branch rather than patch + commit, you end up with this problem unless you are careful to fix it by hand. It would be nice to prevent further mistakes of this sort, as they create confusion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Entities created in one query not available in another in extended protocol
On 11 June 2015 at 22:12, Shay Rojansky r...@roji.org wrote: Thanks everyone for your time (or rather sorry for having wasted it). Just in case it's interesting to you... The reason we implemented things this way is in order to avoid a deadlock situation: if we send two queries as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset, PostgreSQL may block while writing that resultset, since Npgsql isn't reading it at that point. Npgsql, for its part, may get stuck writing the second query (if it's big enough) since PostgreSQL isn't reading on its end (thanks to Emil Lenngren for pointing this out originally). That part does sound like a problem that we have no good answer to. Sounds worth starting a new thread on that. Of course this isn't an excuse for anything; we're looking into ways of solving this problem differently in our driver implementation. Shay On Thu, Jun 11, 2015 at 6:17 PM, Simon Riggs si...@2ndquadrant.com wrote: On 11 June 2015 at 16:56, Shay Rojansky r...@roji.org wrote: Npgsql (currently) sends Parse for the second command before sending Execute for the first one. Look no further than that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
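The pattern is the classic pipelining deadlock: both TCP buffers fill and each side blocks in a write. A generic way out -- not Npgsql's actual code (Npgsql is C#); this is just an illustrative sketch in C -- is to multiplex reads and writes on the socket instead of writing the whole second query before reading anything:

    /*
     * Sketch: while sending a large pipelined query, also drain whatever the
     * server has started to send, so neither side blocks on a full buffer.
     * A real driver would grow inbuf as needed and handle EINTR.
     */
    #include <poll.h>
    #include <unistd.h>

    static int
    send_while_draining(int sock, const char *out, size_t outlen,
                        char *inbuf, size_t incap, size_t *inlen)
    {
        size_t      sent = 0;

        while (sent < outlen)
        {
            struct pollfd pfd = {sock, POLLIN | POLLOUT, 0};

            if (poll(&pfd, 1, -1) < 0)
                return -1;

            if ((pfd.revents & POLLIN) && *inlen < incap)
            {
                /* the server is pushing the first resultset: consume it */
                ssize_t n = read(sock, inbuf + *inlen, incap - *inlen);

                if (n < 0)
                    return -1;
                *inlen += (size_t) n;
            }
            if (pfd.revents & POLLOUT)
            {
                ssize_t n = write(sock, out + sent, outlen - sent);

                if (n < 0)
                    return -1;
                sent += (size_t) n;
            }
        }
        return 0;
    }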
Re: [HACKERS] On columnar storage
Alvaro Herrera alvhe...@2ndquadrant.com writes: Amit Kapila wrote: Will the column store obey the snapshot model similarly to current heap tuples, and if so, will it derive the transaction information from the heap tuple? Yes, visibility will be tied to the heap tuple -- a value is accessed only when its corresponding heap row has already been determined to be visible. One interesting point that arises from this is about vacuum: when are we able to remove a value from the store? I have some not-completely-formed ideas about this. Hm. This seems not terribly ambitious --- mightn't a column store extension wish to store tables *entirely* in the column store, rather than tying them to a perhaps-vestigial heap table? Perhaps that's a necessary restriction to get to something implementable, but considering that the proposal mentions pluggable column stores I should think you'd not want to tie it down that much. I can't help thinking that this could tie in with the storage-level API that I was waving my arms about last year. Or maybe not --- the goals are substantially different --- but I think we ought to reflect on that rather than just doing a narrow hack for column stores used in the particular way you're describing here. regards, tom lane