Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-26 Thread Tom Lane
Thomas Munro writes: > See attached, which also removes the ENOSYS stuff which I believe to > be now useless. Does this make sense? Survives make check-world and > my simple test procedure on a 3.10.0-327.36.1.el7.x86_64 system. Thanks. Works on my RHEL6 box

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 10:12 AM, Tom Lane wrote: > Thomas Munro writes: >> I think the problem here is that posix_fallocate() doesn't set errno. > > Huh. So the fact that it worked for me is likely because glibc's > emulation *does* allow

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Thomas Munro writes: > I think the problem here is that posix_fallocate() doesn't set errno. Huh. So the fact that it worked for me is likely because glibc's emulation *does* allow errno to get set. > Will write a patch. Thanks, I'm out of time for today.
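
Editorial note: posix_fallocate() reports failure through its return value (an error number) and is not specified to set errno, so a caller that prints errno after a failure can show a stale or misleading message. The sketch below shows one way to fold the return value back into errno. The helper name reserve_space is hypothetical and is not taken from the patch discussed in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, for illustration only: reserve `size` bytes of
 * backing store for `fd`.  Returns 0 on success, or -1 with errno set,
 * which is the convention used by ftruncate() and most system calls.
 */
static int
reserve_space(int fd, off_t size)
{
    int rc;

    assert(size > 0);           /* a non-positive length is a caller bug */

    /* The return value IS the error number; errno is not reliable here. */
    rc = posix_fallocate(fd, 0, size);
    if (rc != 0)
    {
        errno = rc;             /* so strerror(errno) names the real cause */
        return -1;
    }
    return 0;
}
```

With this wrapper, error-reporting code written for ftruncate()-style calls can stay unchanged when the allocation call is swapped.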

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 9:57 AM, Thomas Munro wrote: >> On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: >>> Pushed with that change; we'll soon see what the buildfarm thinks. > > Hmm. One failure in the test modules: > >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
> On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: >> Pushed with that change; we'll soon see what the buildfarm thinks. Hmm. One failure in the test modules: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2017-09-25%2020%3A45%3A02 2017-09-25

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: > I wrote: >> Rather than dig into the guts of glibc to find that out, though, I think >> we should just s/fallocate/posix_fallocate/g on this patch. The argument >> for using the former seemed pretty thin to begin with. > >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 7:56 AM, Tom Lane wrote: > I wrote: >> Hmm, so I tested this patch on my RHEL6 box (kernel 2.6.32) and it >> immediately fell over with >> 2017-09-25 14:23:48.410 EDT [325] FATAL: could not resize shared memory >> segment "/PostgreSQL.1682054886" to

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
I wrote: > Rather than dig into the guts of glibc to find that out, though, I think > we should just s/fallocate/posix_fallocate/g on this patch. The argument > for using the former seemed pretty thin to begin with. Pushed with that change; we'll soon see what the buildfarm thinks. I suspect

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
I wrote: > Hmm, so I tested this patch on my RHEL6 box (kernel 2.6.32) and it > immediately fell over with > 2017-09-25 14:23:48.410 EDT [325] FATAL: could not resize shared memory > segment "/PostgreSQL.1682054886" to 6928 bytes: Operation not supported > during startup. I wonder whether we

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Robert Haas writes: > On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: >> I think we don't really have a lot of choice. I propose applying this >> as far back as 9.6 --- anyone think differently? > +1. If applies to 9.5 and 9.4 without a lot of

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Robert Haas writes: > On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: >> Thomas Munro writes: >>> So, do we want this patch? >> I think we don't really have a lot of choice. I propose applying this >> as far back as

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Robert Haas
On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: > Thomas Munro writes: >> So, do we want this patch? > > I think we don't really have a lot of choice. I propose applying this > as far back as 9.6 --- anyone think differently? +1. If applies

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Thomas Munro writes: > So, do we want this patch? I think we don't really have a lot of choice. I propose applying this as far back as 9.6 --- anyone think differently? regards, tom lane

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-24 Thread Thomas Munro
On Thu, Aug 17, 2017 at 11:39 AM, Thomas Munro wrote: > On Thu, Jun 29, 2017 at 12:24 PM, Thomas Munro > wrote: >> fallocate-v5.patch > > Added to commitfest so we don't lose track of this. Rebased due to collision with recent

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-08-16 Thread Thomas Munro
On Thu, Jun 29, 2017 at 12:24 PM, Thomas Munro wrote: > fallocate-v5.patch Added to commitfest so we don't lose track of this. I'm mainly concerned about the fact that we have a way for PostgreSQL to die that looks exactly like a bug, when really it's masking an

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Thu, Jun 29, 2017 at 11:04 AM, Andres Freund wrote: >> diff --git a/configure.in b/configure.in >> index 11eb9c8acfc..47452bbac43 100644 >> --- a/configure.in >> +++ b/configure.in >> @@ -1429,7 +1429,7 @@ PGAC_FUNC_WCSTOMBS_L >> LIBS_including_readline="$LIBS" >>

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Andres Freund
On 2017-06-28 19:07:50 +1200, Thomas Munro wrote: > I think this line is saying that it won't restart automatically: > > https://github.com/torvalds/linux/blob/590dce2d4934fb909b112cd80c80486362337744/mm/shmem.c#L2884 Indeed. > So I think we either need to mask signals with or put in an
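
Editorial note: the preview above is cut off before it names the second option. One common way to cope with an allocation call that is not restarted automatically after a signal is to retry while it reports EINTR. The sketch below illustrates that general pattern only; the function name fallocate_retrying is hypothetical, and nothing here should be read as the code that was committed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, for illustration only: call posix_fallocate()
 * again whenever a signal interrupts it.  Returns 0 on success or an
 * error number (posix_fallocate() does not report through errno).
 */
static int
fallocate_retrying(int fd, off_t size)
{
    int rc;

    assert(size > 0);           /* a non-positive length is a caller bug */

    do
    {
        rc = posix_fallocate(fd, 0, size);
    } while (rc == EINTR);      /* interrupted by a signal: try again */

    return rc;
}
```

An unconditional loop like this can keep retrying for as long as signals keep arriving, so a server process would probably also want to break out when a cancel or shutdown request is pending. That check is left out here because it depends on the surrounding program.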

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Wed, Jun 28, 2017 at 5:19 PM, Thomas Munro wrote: > On Wed, Aug 24, 2016 at 2:58 AM, Robert Haas wrote: >> Now, for bigger segment sizes, I think there actually could be a >> little bit of a noticeable performance hit here, because it's

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-27 Thread Thomas Munro
On Wed, Aug 24, 2016 at 2:58 AM, Robert Haas wrote: > Now, for bigger segment sizes, I think there actually could be a > little bit of a noticeable performance hit here, because it's not just > about total elapsed time. Even if the code eventually touches all of > the

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-23 Thread Robert Haas
On Mon, Aug 22, 2016 at 8:18 PM, Thomas Munro wrote: > On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: >> We could test to see how much it slows things down. But it >> may be worth paying the cost even if it ends up being kinda

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Thomas Munro
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: > We could test to see how much it slows things down. But it > may be worth paying the cost even if it ends up being kinda expensive. Here are some numbers from a Xeon E7-8830 @ 2.13GHz running Linux 3.10 running the

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Thomas Munro
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: > On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro > wrote: >> I still think it's worth thinking about something along these lines on >> Linux only, where holey Swiss tmpfs files can bite you.

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Robert Haas
On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro wrote: > I still think it's worth thinking about something along these lines on > Linux only, where holey Swiss tmpfs files can bite you. Otherwise > disabling overcommit on your OS isn't enough to prevent something >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Thomas Munro
On Wed, Aug 17, 2016 at 4:50 AM, Robert Haas wrote: > On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro > wrote: >> On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro >> wrote: >>> On Sat, Aug 13, 2016 at 2:08 AM, Tom

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Robert Haas
On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro wrote: > On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro > wrote: >> On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: >>> amul sul writes:

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Thomas Munro
On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro wrote: > On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: >> amul sul writes: >>> When I am calling dsm_create on Linux using the POSIX DSM implementation >>> can succeed, but

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Thomas Munro
On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: > amul sul writes: >> When I am calling dsm_create on Linux using the POSIX DSM implementation can >> succeed, but result in SIGBUS when later try to access the memory. This >> happens because of my

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Claudio Freire
On Fri, Aug 12, 2016 at 1:55 PM, amul sul wrote: > No segfault during dsm_create, mmap returns the memory address which is > inaccessible. > > Let me see how can I disable kernel overcommit behaviour, but IMHO, we > should prevent ourselves from crashing, shouldn't we?

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread amul sul
No segfault during dsm_create, mmap returns the memory address which is inaccessible. Let me see how can I disable kernel overcommit behaviour, but IMHO, we should prevent ourselves from crashing, shouldn't we? Regards,

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Tom Lane
amul sul writes: > When I am calling dsm_create on Linux using the POSIX DSM implementation can > succeed, but result in SIGBUS when later try to access the memory. This > happens because of my system does not have enough shm space & current > allocation in
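
Editorial note: the failure described in this report can arise because sizing a tmpfs-backed file with ftruncate() creates a sparse file. No pages are reserved, mmap() succeeds, and the shortage only shows up as SIGBUS when a page is first touched. The sketch below reserves the space before mapping so that a shortage comes back as an ordinary error instead. It assumes the caller already has a descriptor (for example from shm_open()); the name map_reserved is hypothetical and this is not the PostgreSQL implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper, for illustration only: reserve `size` bytes for
 * `fd`, then map them shared.  Returns the mapping, or NULL with errno
 * set if either step fails.
 */
static void *
map_reserved(int fd, size_t size)
{
    int   rc;
    void *addr;

    assert(size > 0);           /* a zero-length segment is a caller bug */

    /* Reserve backing pages now, so ENOSPC is reported here, not later. */
    rc = posix_fallocate(fd, 0, (off_t) size);
    if (rc != 0)
    {
        errno = rc;             /* posix_fallocate() returns the error number */
        return NULL;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return (addr == MAP_FAILED) ? NULL : addr;
}
```

The trade-off debated later in the thread still applies: reserving every page up front costs time that grows with the segment size, even if the caller never touches most of it.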