Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 13:04, Michael Paquier wrote: > On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote: >> Another consequence of this behavior that initdb -S is never reliable, >> so pg_rewind's use of it doesn't actually fix the problem it was >> intended to solve.

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Andrey Borodin
> 9 апр. 2018 г., в 23:04, Heikki Linnakangas написал(а): > > On 09/04/18 18:21, Andrey Borodin wrote: >>> 9 апр. 2018 г., в 19:50, Teodor Sigaev >>> написал(а): 3. Why do we *not* lock the entry leaf page, if there is no match? We still need a

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote: > Another consequence of this behavior that initdb -S is never reliable, > so pg_rewind's use of it doesn't actually fix the problem it was > intended to solve. It also means that initdb itself isn't crash-safe, > since the data file

Re: [sqlsmith] Failed assertion on pfree() via perform_pruning_combine_step

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 10:59:48AM -0300, Alvaro Herrera wrote: > Amit Langote wrote: >> I have reproduced this and found that the problem is that >> perform_pruning_combine_step forgets to *copy* the bitmapset of the first >> step in the handling of an COMBINE_INTERSECT step. > > Pushed, thanks

Re: lazy detoasting

2018-04-09 Thread Andrew Gierth
> "Chapman" == Chapman Flack writes: Chapman> AFAICS, that is *all* that comment block has to say about why Chapman> there's an active snapshot stack. I believe you are saying it Chapman> has another important function, namely that its top element is Chapman> what

Re: [HACKERS] path toward faster partition pruning

2018-04-09 Thread Ashutosh Bapat
On Mon, Apr 9, 2018 at 8:56 PM, Robert Haas wrote: > On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane wrote: >> David Rowley writes: >>> Sounds like you're saying that if we have too many alternative files >>> then there's a

Re: pruning disabled for array, enum, record, range type partition keys

2018-04-09 Thread Amit Langote
Thanks for the comment. On 2018/04/09 23:22, Tom Lane wrote: > Amit Langote writes: >> I noticed that the newly added pruning does not work if the partition key >> is of one of the types that have a corresponding pseudo-type. > > While I don't recall the details

Gotchas about pg_verify_checksums

2018-04-09 Thread Michael Paquier
Hi all, I have not been giving much attention to the thread about enabling checksums online, which has resulted in the revert of the feature, but there is still pg_verify_checksums around. So I looked at it a bit. I have a couple of questions/gotchas about it: 1) The documentation states that

Re: [sqlsmith] Failed assertion on pfree() via perform_pruning_combine_step

2018-04-09 Thread Amit Langote
On 2018/04/09 22:59, Alvaro Herrera wrote: > Hello, > > Amit Langote wrote: > >> I have reproduced this and found that the problem is that >> perform_pruning_combine_step forgets to *copy* the bitmapset of the first >> step in the handling of an COMBINE_INTERSECT step. > > Pushed, thanks Amit

Re: Warnings and uninitialized variables in TAP tests

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 09:46:29PM +0200, Magnus Hagander wrote: > Applied, thanks. Thanks for the commit. -- Michael signature.asc Description: PGP signature

Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 04:46:34PM -0400, Tom Lane wrote: > After further contemplation I decided that that was, in fact, the only > reasonable way to improve matters. If we have multiple subdirectories > independently firing the "make generated-headers" action, then we have > parallel make

Re: ON CONFLICT DO UPDATE for partitioned tables

2018-04-09 Thread Amit Langote
On 2018/03/27 13:27, Amit Langote wrote: > On 2018/03/26 23:20, Alvaro Herrera wrote: >> The one thing I wasn't terribly in love with is the four calls to >> map_partition_varattnos(), creating the attribute map four times ... but >> we already have it in the TupleConversionMap, no? Looks like we

Re: WIP: Covering + unique indexes.

2018-04-09 Thread Peter Geoghegan
On Sun, Apr 8, 2018 at 11:19 PM, Teodor Sigaev wrote: > Thank you, pushed. I noticed a few more issues following another pass-through of the patch: * There is no pfree() within _bt_buildadd() for truncated tuples, even though that's a context where it's clearly not okay. * It

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 08:41, Andreas Karlsson wrote: > On 04/09/2018 02:16 PM, Craig Ringer wrote: >> >> I'd like a middle ground where the kernel lets us register our interest >> and tells us if it lost something, without us having to keep eight million >> FDs open for some long

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On April 9, 2018 6:59:03 PM PDT, Craig Ringer wrote: >On 10 April 2018 at 04:37, Andres Freund wrote: >> Hi, >> >> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: >>> Maybe. I'd certainly prefer automated recovery from an temporary I/O >>> issues

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:57:23 PM PDT, Alvaro Herrera wrote: >Andres Freund wrote: >> >> On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera > wrote: > >> >Would it work to use this second pipe, to which each child writes a >> >byte that postmaster never

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 04:37, Andres Freund wrote: > Hi, > > On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: >> Maybe. I'd certainly prefer automated recovery from an temporary I/O >> issues (like full disk on thin-provisioning) without the database >> crashing and restarting.

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Alvaro Herrera
Andres Freund wrote: > > On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera > wrote: > >Would it work to use this second pipe, to which each child writes a > >byte that postmaster never reads, and then rely on SIGPIPE when > >postmaster dies? Then we never need to do a

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 04:25, Mark Dilger wrote: > I was reading this thread up until now as meaning that the standby could > receive corrupt WAL data and become corrupted. Yes, it can, but not directly through the first error. What can happen is that we think a block got

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer wrote: > On 10 April 2018 at 03:59, Andres Freund wrote: >> I don't think that's as hard as some people argued in this thread. We >> could very well open a pipe in postmaster with the write end open in >>

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Craig Ringer
On 10 April 2018 at 03:59, Andres Freund wrote: > On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote: >> On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote: >> > You could make the argument that it's OK to forget if the entire file >> > system goes away. But actually,

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:36:19 PM PDT, Thomas Munro wrote: >On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund >wrote: >> I coincidentally got pinged about our current approach causing >> performance problems on FreeBSD and started writing a patch.

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 12:53 PM, Andres Freund wrote: > I coincidentally got pinged about our current approach causing > performance problems on FreeBSD and started writing a patch. The > problem there appears to be that constantly attaching events to the read > pipe end,

Re: Boolean partitions syntax

2018-04-09 Thread Kyotaro HORIGUCHI
Hello, I returned to this. I'd like to insisnt on prposing to use existing parser element. At Mon, 9 Apr 2018 10:11:08 -0400, "Jonathan S. Katz" wrote in <27021281-2ed7-4cde-9d82-366af10b3...@excoventures.com> > > On Apr 9, 2018, at 10:06 AM, Tom Lane

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
On April 9, 2018 6:31:07 PM PDT, Alvaro Herrera wrote: >Andres Freund wrote: > >> Another approach, that's simpler to implement, is to simply have a >> second selfpipe, just for WL_POSTMASTER_DEATH. > >Would it work to use this second pipe, to which each child writes a

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Alvaro Herrera
Andres Freund wrote: > Another approach, that's simpler to implement, is to simply have a > second selfpipe, just for WL_POSTMASTER_DEATH. Would it work to use this second pipe, to which each child writes a byte that postmaster never reads, and then rely on SIGPIPE when postmaster dies? Then we

Re: Excessive PostmasterIsAlive calls slow down WAL redo

2018-04-09 Thread Andres Freund
Hi, On 2018-04-05 12:20:38 -0700, Andres Freund wrote: > > While it's not POSIX, at least some platforms are capable of delivering > > a separate signal on parent process death. Perhaps using that where > > available would be enough of an answer. > > Yea, that'd work on linux. Which is probably

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andreas Karlsson
On 04/09/2018 02:16 PM, Craig Ringer wrote: I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. "Tell us about anything that happens under pgdata/" or an inotify-style

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro wrote: > I wonder if anyone can tell us what Windows, AIX and HPUX do here. I created a wiki page to track what we know (or think we know) about fsync() on various operating systems:

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Thomas Munro
On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos wrote: > On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote: >> Well, there seem to be kernels that seem to do exactly that already. At >> least that's how I understand what this thread says about FreeBSD and >>

Re: [HACKERS] Optional message to user when terminating/cancelling backend

2018-04-09 Thread Andres Freund
Hi, On 2017-06-20 13:01:35 -0700, Andres Freund wrote: > For extensions it'd also be useful if it'd be possible to overwrite the > error code. E.g. for citus there's a distributed deadlock detector, > running out of process because there's no way to interrupt lock waits > locally, and we've to

Re: [HACKERS] Optional message to user when terminating/cancelling backend

2018-04-09 Thread Daniel Gustafsson
> On 09 Apr 2018, at 02:47, Michael Paquier wrote: > > On Fri, Apr 06, 2018 at 11:18:34AM +0200, Daniel Gustafsson wrote: >> Yep, I completely agree. Attached are patches with the quotes removed and >> rebased since Oids were taken etc. > > I still find this idea

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 2:25 PM, Tomas Vondra wrote: > > > > On 04/09/2018 11:08 PM, Andres Freund wrote: >> Hi, >> >> On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: >>> I can also imagine a master and standby that are similarly provisioned, >>> and thus hit an

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Michael Paquier
On Mon, Apr 09, 2018 at 09:36:40PM +0200, Magnus Hagander wrote: > Applied, and pushed this way. OK, thanks for the commit. -- Michael signature.asc Description: PGP signature

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 11:08 PM, Andres Freund wrote: > Hi, > > On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: >> I can also imagine a master and standby that are similarly provisioned, >> and thus hit an out of disk error at around the same time, resulting in >> corruption on both, even if not the

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 13:55:29 -0700, Mark Dilger wrote: > I can also imagine a master and standby that are similarly provisioned, > and thus hit an out of disk error at around the same time, resulting in > corruption on both, even if not the same corruption. I think it's a grave mistake conflating

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Tom Lane
Stephen Frost writes: > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: >> On 4/5/18 02:04, Pavel Raiskup wrote: >>> As a followup thought; there are probably two major obstacles ATM >>> - the DSOs' symbols are not yet versioned, and >>> - the build-system doesn't

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 1:43 PM, Tomas Vondra wrote: > > > > On 04/09/2018 10:25 PM, Mark Dilger wrote: >> >>> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: >>> >>> Hi, >>> >>> On 2018-04-09 15:02:11 -0400, Robert Haas wrote: I think

Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

2018-04-09 Thread Tom Lane
I wrote: > Michael Paquier writes: >> That takes care of the problem from the root of the directory, but when >> doing the same from src/bin/ then the same issue shows up even if >> src/Makefile is patched to handle install targets. > Hm. Not sure how far we want to go in

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 10:25 PM, Mark Dilger wrote: > >> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: >> >> Hi, >> >> On 2018-04-09 15:02:11 -0400, Robert Haas wrote: >>> I think the simplest technological solution to this problem is to >>> rewrite the entire backend and all

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote: > Maybe. I'd certainly prefer automated recovery from an temporary I/O > issues (like full disk on thin-provisioning) without the database > crashing and restarting. But I'm not sure it's worth the effort. Oh, I agree on that one. But that's

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 13:25:54 -0700, Mark Dilger wrote: > I was reading this thread up until now as meaning that the standby could > receive corrupt WAL data and become corrupted. I don't see that as a real problem here. For one the problematic scenarios shouldn't readily apply, for another WAL is

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 10:04 PM, Andres Freund wrote: > Hi, > > On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote: >> Isn't the expectation that when a fsync call fails, the next one will >> retry writing the pages in the hope that it succeeds? > > Some people expect that, I personally don't think it's

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 12:13 PM, Andres Freund wrote: > > Hi, > > On 2018-04-09 15:02:11 -0400, Robert Haas wrote: >> I think the simplest technological solution to this problem is to >> rewrite the entire backend and all supporting processes to use >> O_DIRECT everywhere.

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Stephen Frost
Greetings, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 4/5/18 02:04, Pavel Raiskup wrote: > > Hello, for the support of multiple versions of PostgreSQL RPM packages on > > one system, we are thinking about having only one libpq.so.5 > > (libecpg.so.6, libpgtype.so.3

Re: Shared PostgreSQL libraries and symbol versioning

2018-04-09 Thread Peter Eisentraut
On 4/5/18 02:04, Pavel Raiskup wrote: > Hello, for the support of multiple versions of PostgreSQL RPM packages on > one system, we are thinking about having only one libpq.so.5 > (libecpg.so.6, libpgtype.so.3 respectively) supported and about building > (linking) all the PostgreSQL package

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote: > Isn't the expectation that when a fsync call fails, the next one will > retry writing the pages in the hope that it succeeds? Some people expect that, I personally don't think it's a useful expectation. We should just deal with this by

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote: > On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote: > > You could make the argument that it's OK to forget if the entire file > > system goes away. But actually, why is that ok? > > I was going to say that it'd be okay to clear error

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 09:37 PM, Andres Freund wrote: > > > On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos > wrote: > >> I honestly do not expect that keeping around the failed pages will >> be an acceptable change for most kernels, and as such the >> recommendation >> will

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Anthony Iliopoulos
On Mon, Apr 09, 2018 at 12:37:03PM -0700, Andres Freund wrote: > > > On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos > wrote: > > >I honestly do not expect that keeping around the failed pages will > >be an acceptable change for most kernels, and as such the >

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 04:22 PM, Anthony Iliopoulos wrote: > On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote: >> >> We already have dirty_bytes and dirty_background_bytes, for example. I >> don't see why there couldn't be another limit defining how much dirty >> data to allow before blocking

Re: Warnings and uninitialized variables in TAP tests

2018-04-09 Thread Magnus Hagander
On Mon, Apr 9, 2018 at 3:15 AM, Michael Paquier wrote: > Hi all, > > While looking at the output of the TAP tests, I have seen warnings like > the following: > Use of uninitialized value $target_lsn in concatenation (.) or string at >

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Anthony Iliopoulos
On Mon, Apr 09, 2018 at 12:29:16PM -0700, Andres Freund wrote: > On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote: > > What about having buffered IO with implied fsync() atomicity via > > O_SYNC? > > You're kidding, right? We could also just add sleep(30)'s all over the > tree, and hope

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Justin Pryzby
On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote: > You could make the argument that it's OK to forget if the entire file > system goes away. But actually, why is that ok? I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote: >I honestly do not expect that keeping around the failed pages will >be an acceptable change for most kernels, and as such the >recommendation >will probably be to coordinate in userspace for the fsync(). Why is

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Magnus Hagander
On Mon, Apr 9, 2018 at 9:31 PM, Tom Lane wrote: > Magnus Hagander writes: > > Seems simple enough and the right hting to do, but I wonder if we should > > really backpatch it. Yes, the behaviour is not great now, but there is > also > > a non-zero risk

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Tom Lane
Magnus Hagander writes: > Seems simple enough and the right hting to do, but I wonder if we should > really backpatch it. Yes, the behaviour is not great now, but there is also > a non-zero risk of breaking peoples automated failover scripts of we > backpatch it, isn't it?

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Peter Geoghegan
On Mon, Apr 9, 2018 at 12:23 PM, Magnus Hagander wrote: > Seems simple enough and the right hting to do, but I wonder if we should > really backpatch it. Yes, the behaviour is not great now, but there is also > a non-zero risk of breaking peoples automated failover scripts of

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote: > What about having buffered IO with implied fsync() atomicity via > O_SYNC? You're kidding, right? We could also just add sleep(30)'s all over the tree, and hope that that'll solve the problem. There's a reason we don't permanently fsync

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Anthony Iliopoulos
On Mon, Apr 09, 2018 at 04:29:36PM +0100, Greg Stark wrote: > Honestly I don't think there's *any* way to use the current interface > to implement reliable operation. Even that embedded database using a > single process and keeping every file open all the time (which means > file descriptor limits

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Peter Geoghegan
On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote: > Let's lower the pitchforks a bit here. Obviously a grand rewrite is > absurd, as is some of the proposed ways this is all supposed to > work. But I think the case we're discussing is much closer to a near > irresolvable

Re: Fix pg_rewind which can be run as root user

2018-04-09 Thread Magnus Hagander
On Mon, Apr 9, 2018 at 7:11 AM, Michael Paquier wrote: > Hi all, > > I was just going through pg_rewind's code, and noticed the following > pearl: > /* > * Don't allow pg_rewind to be run as root, to avoid overwriting the > * ownership of files in the data

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Tomas Vondra
On 04/09/2018 08:29 PM, Mark Dilger wrote: > >> On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote: > >> We have plenty of YEARS of people not noticing this issue > > I disagree. I have noticed this problem, but blamed it on other things. > For over five years now, I

Re: using index or check in ALTER TABLE SET NOT NULL

2018-04-09 Thread Sergei Kornilov
Hello I notice my patch does not apply again. Here is update to current HEAD regards, Sergeidiff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index bd22627..db98a98 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -215,8

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Andres Freund
Hi, On 2018-04-09 15:02:11 -0400, Robert Haas wrote: > I think the simplest technological solution to this problem is to > rewrite the entire backend and all supporting processes to use > O_DIRECT everywhere. To maintain adequate performance, we'll have to > write a complete I/O scheduling

Re: Verbosity of genbki.pl

2018-04-09 Thread Tom Lane
Andres Freund writes: > On 2018-04-08 13:33:42 -0400, Tom Lane wrote: >> Traditionally genbki.pl has printed "Writing foo" for every file >> it writes out. >> 2. Print just one message like "Generating postgres.bki and related >> files", and I guess a second one for

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Robert Haas
On Mon, Apr 9, 2018 at 12:45 PM, Robert Haas wrote: > Ouch. If a process exits -- say, because the user typed \q into psql > -- then you're talking about potentially calling fsync() on a really > large number of file descriptor flushing many gigabytes of data to > disk.

Re: Documentation for bootstrap data conversion

2018-04-09 Thread Tom Lane
John Naylor writes: > On 4/9/18, Tom Lane wrote: >> I did note that in some internal comments, but forgot it when writing >> this. I agree that now that the conversion is done, it'd be better >> to remove that special case. Would you send a patch for

Re: [HACKERS] Runtime Partition Pruning

2018-04-09 Thread Alvaro Herrera
Robert Haas wrote: > On Mon, Apr 9, 2018 at 2:28 PM, Alvaro Herrera > wrote: > > Robert Haas wrote: > >> I don't get this. The executor surely had to (and did) open all of > >> the relations somewhere even before this patch. > > > > I was worried that this coding could

Re: [HACKERS] Runtime Partition Pruning

2018-04-09 Thread Robert Haas
On Mon, Apr 9, 2018 at 2:28 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> On Sat, Apr 7, 2018 at 5:13 PM, Alvaro Herrera >> wrote: >> > I had reservations about a relation_open() in the new executor code. It >> > seemed a bit odd; we don't

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Mark Dilger
> On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote: > We have plenty of YEARS of people not noticing this issue I disagree. I have noticed this problem, but blamed it on other things. For over five years now, I have had to tell customers not to use thin provisioning,

Re: [HACKERS] Runtime Partition Pruning

2018-04-09 Thread Alvaro Herrera
Robert Haas wrote: > On Sat, Apr 7, 2018 at 5:13 PM, Alvaro Herrera > wrote: > > I had reservations about a relation_open() in the new executor code. It > > seemed a bit odd; we don't have any other relation_open in the executor > > anywhere. However, setting up the

Re: Verbosity of genbki.pl

2018-04-09 Thread Andres Freund
Hi, On 2018-04-08 13:33:42 -0400, Tom Lane wrote: > Traditionally genbki.pl has printed "Writing foo" for every file > it writes out. > 2. Print just one message like "Generating postgres.bki and related > files", and I guess a second one for fmgroids.h and related files. +0.5. - Andres

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Heikki Linnakangas
On 09/04/18 18:21, Andrey Borodin wrote: 9 апр. 2018 г., в 19:50, Teodor Sigaev написал(а): 3. Why do we *not* lock the entry leaf page, if there is no match? We still need a lock to remember that we probed for that value and there was no match, so that we conflict with a

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Gasper Zejn
On 09. 04. 2018 15:42, Tomas Vondra wrote: > On 04/09/2018 12:29 AM, Bruce Momjian wrote: >> An crazy idea would be to have a daemon that checks the logs and >> stops Postgres when it seems something wrong. >> > That doesn't seem like a very practical way. It's better than nothing, > of course,

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Joshua D. Drake
On 04/09/2018 09:45 AM, Robert Haas wrote: On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote: In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Heikki Linnakangas
On 09/04/18 18:04, Alvaro Herrera wrote: Heikki Linnakangas wrote: Remember, the purpose of predicate locks is to lock key ranges, not physical pages or tuples in the index. We use leaf pages as handy shortcut for "any key value that would belong on this page", but it is just an implementation

Re: Online enabling of checksums

2018-04-09 Thread Magnus Hagander
On Sat, Apr 7, 2018 at 6:22 PM, Andres Freund wrote: > Hi, > > On 2018-04-07 08:57:03 +0200, Magnus Hagander wrote: > > Note however that I'm sans-laptop until Sunday, so I will revert it then > or > > possibly Monday. > > I'll deactive the isolationtester tests until then.

Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

2018-04-09 Thread Tom Lane
Michael Paquier writes: > On Sun, Apr 08, 2018 at 11:05:09PM -0400, Tom Lane wrote: >> Hm. I'd tested "make -j all", but not going directly to "install". >> Does it help if you add >> $(SUBDIRS:%=install-%-recurse): | submake-generated-headers >> to src/Makefile? > That

Re: Optimization of range queries

2018-04-09 Thread Teodor Sigaev
Hi! 12 years ago I proposed patch to which could "union" OR clauses into one range clause if it's possible. In that time pgsql could not use IS NULL as index clause, so patch doesn't support that https://www.postgresql.org/message-id/flat/45742C51.9020602%40sigaev.ru option number 4), all

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Robert Haas
On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote: > In the mean time, I propose that we fsync() on close() before we age FDs out > of the LRU on backends. Yes, that will hurt throughput and cause stalls, but > we don't seem to have many better options. At least it'll only

Optimization of range queries

2018-04-09 Thread Konstantin Knizhnik
Hi hackers, Postgres optimizer is not able to build efficient execution plan for the following query: explain select * from  people_raw where not ("ID"<2068113880 AND "INN" is not null) and "ID"<=2068629726 AND "INN" is not null; QUERY PLAN

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-04-09 Thread Andrey Borodin
Hi! The work on the patch goes on, where was some discussion of this patch off-list with author. Advise-request is still actual. I think that we should move this patch to next CF. So I'm marking patch as needs review. Best regards, Andrey Borodin.

Re: [HACKERS] Runtime Partition Pruning

2018-04-09 Thread Robert Haas
On Sat, Apr 7, 2018 at 5:13 PM, Alvaro Herrera wrote: > I had reservations about a relation_open() in the new executor code. It > seemed a bit odd; we don't have any other relation_open in the executor > anywhere. However, setting up the pruneinfo needs some stuff from

Re: [HACKERS] [PATCH] Incremental sort

2018-04-09 Thread Alexander Korotkov
On Sat, Apr 7, 2018 at 11:57 PM, Tomas Vondra wrote: > On 04/07/2018 06:23 PM, Tom Lane wrote: > > Teodor Sigaev writes: > >>> I dunno, how would you estimate whether this is actually a win or not? > >>> I don't think our model of sort costs is

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Greg Stark
On 9 April 2018 at 15:22, Anthony Iliopoulos wrote: > On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote: >> > Sure, there could be knobs for limiting how much memory such "zombie" > pages may occupy. Not sure how helpful it would be in the long run > since this

Re: [HACKERS] path toward faster partition pruning

2018-04-09 Thread Robert Haas
On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane wrote: > David Rowley writes: >> Sounds like you're saying that if we have too many alternative files >> then there's a chance that one could pass by luck. > > Yeah, exactly: it passed, but did it pass

Re: Transform for pl/perl

2018-04-09 Thread Tom Lane
ilm...@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes: > Tom Lane writes: >> I think you'd have to convert to text and back. That's kind of icky, >> but it beats failing. > I had a look, and that's what the PL/Python transform does. Attached is > a patch that

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Teodor Sigaev
Alvaro Herrera wrote: Heikki Linnakangas wrote: Remember, the purpose of predicate locks is to lock key ranges, not physical pages or tuples in the index. We use leaf pages as handy shortcut for "any key value that would belong on this page", but it is just an implementation detail. Hmm

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Andrey Borodin
> 9 апр. 2018 г., в 19:50, Teodor Sigaev написал(а): >> >> 3. Why do we *not* lock the entry leaf page, if there is no match? We still >> need a lock to remember that we probed for that value and there was no >> match, so that we conflict with a tuple that might be inserted

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Alvaro Herrera
Heikki Linnakangas wrote: > Remember, the purpose of predicate locks is to lock key ranges, not physical > pages or tuples in the index. We use leaf pages as handy shortcut for "any > key value that would belong on this page", but it is just an implementation > detail. Hmm ... so, thinking about

Re: Transform for pl/perl

2018-04-09 Thread Dagfinn Ilmari Mannsåker
Tom Lane writes: > ilm...@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes: >> I tried fixing this by adding an 'if (SvUV(in))' clause to >> SV_to_JsonbValue, but I couldn't find a function to create a numeric >> value from an uint64. If it's not possible,

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Alexander Korotkov
Hi! Thank you for taking a look at this patch. I really appreciate your attention over complex subjects like this. On Mon, Apr 9, 2018 at 1:33 PM, Heikki Linnakangas wrote: > On 28/03/18 19:53, Teodor Sigaev wrote: > >> As I understand, scan should lock any visited page, but

Re: lazy detoasting

2018-04-09 Thread Chapman Flack
On 04/08/2018 02:01 AM, Andrew Gierth wrote: > Chapman> (d) some other reason I haven't thought of ? > > It has to be pushed as the active snapshot so that it's the snapshot > that the executor uses to run the query to populate the tuplestore which > becomes the "held" portal content. That

Re: WIP: Covering + unique indexes.

2018-04-09 Thread Teodor Sigaev
Thanks to both of you, pushed Shinoda, Noriyoshi wrote: Hi! Thank you for your response. I think that it is good with your proposal. Regards, Noriyoshi Shinoda *From:*Alexander Korotkov [mailto:a.korot...@postgrespro.ru] *Sent:* Monday, April 9, 2018 11:22 PM *To:* Shinoda, Noriyoshi

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-04-09 Thread Teodor Sigaev
Hi! 1. Why do we lock all posting tree pages, even though they all represent the same value? Isn't it enough to lock the root of the posting tree? 2. Why do we lock any posting tree pages at all, if we lock the entry tree page anyway? Isn't the lock on the entry tree page sufficient to cover

RE: WIP: Covering + unique indexes.

2018-04-09 Thread Shinoda, Noriyoshi
Hi! Thank you for your response. I think that it is good with your proposal. Regards, Noriyoshi Shinoda From: Alexander Korotkov [mailto:a.korot...@postgrespro.ru] Sent: Monday, April 9, 2018 11:22 PM To: Shinoda, Noriyoshi Cc: PostgreSQL Hackers

Re: pruning disabled for array, enum, record, range type partition keys

2018-04-09 Thread Tom Lane
Amit Langote writes: > I noticed that the newly added pruning does not work if the partition key > is of one of the types that have a corresponding pseudo-type. While I don't recall the details due to acute caffeine shortage, there are specific coding patterns that

Re: WIP: Covering + unique indexes.

2018-04-09 Thread Alexander Korotkov
Hi! On Mon, Apr 9, 2018 at 5:07 PM, Shinoda, Noriyoshi < noriyoshi.shin...@hpe.com> wrote: > I tested this feature and found a document shortage in the columns added > to the pg_constraint catalog. > The attached patch will add the description of the 'conincluding' column > to the manual of the

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

2018-04-09 Thread Anthony Iliopoulos
On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote: > > We already have dirty_bytes and dirty_background_bytes, for example. I > don't see why there couldn't be another limit defining how much dirty > data to allow before blocking writes altogether. I'm sure it's not that > simple, but

  1   2   >