Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-26 Thread Tom Lane
Thomas Munro writes: > See attached, which also removes the ENOSYS stuff which I believe to > be now useless. Does this make sense? Survives make check-world and > my simple test procedure on a 3.10.0-327.36.1.el7.x86_64 system. Thanks. Works on my RHEL6 box

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 10:12 AM, Tom Lane wrote: > Thomas Munro writes: >> I think the problem here is that posix_fallocate() doesn't set errno. > > Huh. So the fact that it worked for me is likely because glibc's > emulation *does* allow

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Thomas Munro writes: > I think the problem here is that posix_fallocate() doesn't set errno. Huh. So the fact that it worked for me is likely because glibc's emulation *does* allow errno to get set. > Will write a patch. Thanks, I'm out of time for today.
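
A minimal sketch of the errno point raised above. Unlike most system calls, posix_fallocate() returns the error number as its result and is not specified to set errno, so a caller that reports failures through errno has to copy the return value across itself. The helper name reserve_space is hypothetical and is not taken from the patch under discussion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): reserve 'size' bytes for 'fd'.
 * Returns 0 on success, or -1 with errno set on failure, which is the
 * convention most error-reporting code expects.
 */
int
reserve_space(int fd, off_t size)
{
    int rc;

    assert(size > 0);

    /* posix_fallocate() hands back the error number; it does not set errno. */
    rc = posix_fallocate(fd, 0, size);
    if (rc != 0)
    {
        errno = rc;     /* make the failure visible to errno-based reporting */
        return -1;
    }
    return 0;
}
```

With this wrapper, a message formatted from errno after a failed call describes the real cause rather than whatever value errno happened to hold beforehand.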

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 9:57 AM, Thomas Munro wrote: >> On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: >>> Pushed with that change; we'll soon see what the buildfarm thinks. > > Hmm. One failure in the test modules: > >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
> On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: >> Pushed with that change; we'll soon see what the buildfarm thinks. Hmm. One failure in the test modules: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2017-09-25%2020%3A45%3A02 2017-09-25

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 9:13 AM, Tom Lane wrote: > I wrote: >> Rather than dig into the guts of glibc to find that out, though, I think >> we should just s/fallocate/posix_fallocate/g on this patch. The argument >> for using the former seemed pretty thin to begin with. > >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Thomas Munro
On Tue, Sep 26, 2017 at 7:56 AM, Tom Lane wrote: > I wrote: >> Hmm, so I tested this patch on my RHEL6 box (kernel 2.6.32) and it >> immediately fell over with >> 2017-09-25 14:23:48.410 EDT [325] FATAL: could not resize shared memory >> segment "/PostgreSQL.1682054886" to

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
I wrote: > Rather than dig into the guts of glibc to find that out, though, I think > we should just s/fallocate/posix_fallocate/g on this patch. The argument > for using the former seemed pretty thin to begin with. Pushed with that change; we'll soon see what the buildfarm thinks. I suspect

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
I wrote: > Hmm, so I tested this patch on my RHEL6 box (kernel 2.6.32) and it > immediately fell over with > 2017-09-25 14:23:48.410 EDT [325] FATAL: could not resize shared memory > segment "/PostgreSQL.1682054886" to 6928 bytes: Operation not supported > during startup. I wonder whether we

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Robert Haas writes: > On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: >> I think we don't really have a lot of choice. I propose applying this >> as far back as 9.6 --- anyone think differently? > +1. If applies to 9.5 and 9.4 without a lot of

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Robert Haas writes: > On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: >> Thomas Munro writes: >>> So, do we want this patch? >> I think we don't really have a lot of choice. I propose applying this >> as far back as

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Robert Haas
On Mon, Sep 25, 2017 at 10:22 AM, Tom Lane wrote: > Thomas Munro writes: >> So, do we want this patch? > > I think we don't really have a lot of choice. I propose applying this > as far back as 9.6 --- anyone think differently? +1. If applies

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-25 Thread Tom Lane
Thomas Munro writes: > So, do we want this patch? I think we don't really have a lot of choice. I propose applying this as far back as 9.6 --- anyone think differently? regards, tom lane -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-09-24 Thread Thomas Munro
On Thu, Aug 17, 2017 at 11:39 AM, Thomas Munro wrote: > On Thu, Jun 29, 2017 at 12:24 PM, Thomas Munro > wrote: >> fallocate-v5.patch > > Added to commitfest so we don't lose track of this. Rebased due to collision with recent

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-08-16 Thread Thomas Munro
On Thu, Jun 29, 2017 at 12:24 PM, Thomas Munro wrote: > fallocate-v5.patch Added to commitfest so we don't lose track of this. I'm mainly concerned about the fact that we have a way for PostgreSQL to die that looks exactly like a bug, when really it's masking an

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Thu, Jun 29, 2017 at 11:04 AM, Andres Freund wrote: >> diff --git a/configure.in b/configure.in >> index 11eb9c8acfc..47452bbac43 100644 >> --- a/configure.in >> +++ b/configure.in >> @@ -1429,7 +1429,7 @@ PGAC_FUNC_WCSTOMBS_L >> LIBS_including_readline="$LIBS" >>

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Andres Freund
On 2017-06-28 19:07:50 +1200, Thomas Munro wrote: > I think this line is saying that it won't restart automatically: > > https://github.com/torvalds/linux/blob/590dce2d4934fb909b112cd80c80486362337744/mm/shmem.c#L2884 Indeed. > So I think we either need to mask signals with or put in an
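
One common way to cope with a call that the kernel does not restart after a signal is a retry loop, sketched below. This is an illustration under that assumption only; the message above is truncated, so it is not known from this listing which approach was finally chosen. The name fallocate_retry is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper: call posix_fallocate() again whenever it reports
 * that it was interrupted by a signal. Returns 0 on success or the final
 * error number (never EINTR).
 */
int
fallocate_retry(int fd, off_t size)
{
    int rc;

    assert(size > 0);

    do
    {
        /* The error number comes back as the return value, not via errno. */
        rc = posix_fallocate(fd, 0, size);
    } while (rc == EINTR);

    return rc;
}
```

A real server would usually also check for pending interrupts inside the loop so that a cancel request is not ignored while a large segment is being allocated.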

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-28 Thread Thomas Munro
On Wed, Jun 28, 2017 at 5:19 PM, Thomas Munro wrote: > On Wed, Aug 24, 2016 at 2:58 AM, Robert Haas wrote: >> Now, for bigger segment sizes, I think there actually could be a >> little bit of a noticeable performance hit here, because it's

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-06-27 Thread Thomas Munro
On Wed, Aug 24, 2016 at 2:58 AM, Robert Haas wrote: > Now, for bigger segment sizes, I think there actually could be a > little bit of a noticeable performance hit here, because it's not just > about total elapsed time. Even if the code eventually touches all of > the

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-23 Thread Robert Haas
On Mon, Aug 22, 2016 at 8:18 PM, Thomas Munro wrote: > On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: >> We could test to see how much it slows things down. But it >> may be worth paying the cost even if it ends up being kinda

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Thomas Munro
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: > We could test to see how much it slows things down. But it > may be worth paying the cost even if it ends up being kinda expensive. Here are some numbers from a Xeon E7-8830 @ 2.13GHz running Linux 3.10 running the

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Thomas Munro
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote: > On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro > wrote: >> I still think it's worth thinking about something along these lines on >> Linux only, where holey Swiss tmpfs files can bite you.

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-22 Thread Robert Haas
On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro wrote: > I still think it's worth thinking about something along these lines on > Linux only, where holey Swiss tmpfs files can bite you. Otherwise > disabling overcommit on your OS isn't enough to prevent something >

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Thomas Munro
On Wed, Aug 17, 2016 at 4:50 AM, Robert Haas wrote: > On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro > wrote: >> On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro >> wrote: >>> On Sat, Aug 13, 2016 at 2:08 AM, Tom

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Robert Haas
On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro wrote: > On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro > wrote: >> On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: >>> amul sul writes:

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Thomas Munro
On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro wrote: > On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: >> amul sul writes: >>> When I am calling dsm_create on Linux using the POSIX DSM implementation >>> can succeed, but

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Thomas Munro
On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane wrote: > amul sul writes: >> When I am calling dsm_create on Linux using the POSIX DSM implementation can >> succeed, but result in SIGBUS when later try to access the memory. This >> happens because of my

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Claudio Freire
On Fri, Aug 12, 2016 at 1:55 PM, amul sul wrote: > No segfault during dsm_create, mmap returns the memory address which is > inaccessible. > > Let me see how can I disable kernel overcommit behaviour, but IMHO, we > should prevent ourselves from crashing, shouldn't we?

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread amul sul
No segfault during dsm_create, mmap returns the memory address which is inaccessible. Let me see how can I disable kernel overcommit behaviour, but IMHO, we should prevent ourselves from crashing, shouldn't we? Regards,

Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread Tom Lane
amul sul writes: > When I am calling dsm_create on Linux using the POSIX DSM implementation can > succeed, but result in SIGBUS when later try to access the memory. This > happens because of my system does not have enough shm space & current > allocation in

[HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-12 Thread amul sul
Hi All, Calling dsm_create on Linux using the POSIX DSM implementation can succeed but then result in SIGBUS when the memory is later accessed. This happens because my system does not have enough shm space and the current allocation in dsm_impl_posix does not allocate disk blocks[1]. I
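
The failure mode described in this report can be sketched as follows. ftruncate() only sets the length of a shared memory file, and on Linux tmpfs it can leave the file sparse, so a shortage of shm space surfaces later as SIGBUS on first access. Reserving the space at creation time turns that into an ordinary error return. The helper name size_and_reserve is hypothetical, the descriptor is assumed to come from something like shm_open(), and this is not PostgreSQL's actual dsm_impl_posix code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: size a shared memory file descriptor and ask the
 * kernel to back it with real storage. Returns 0 on success, or an
 * errno-style code such as ENOSPC on failure.
 */
int
size_and_reserve(int fd, off_t size)
{
    assert(size > 0);

    /* Sets the length only; the file may still be sparse afterwards. */
    if (ftruncate(fd, size) != 0)
        return errno;

    /* Reserve the space now so a shortage is reported here, not as SIGBUS later. */
    return posix_fallocate(fd, 0, size);
}
```

A caller would map the descriptor with mmap() only after this returns 0, and report the returned code otherwise.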