Re: dsa_allocate() faliure

2019-02-18 Thread Jakub Glapa
Hi, I just checked dmesg. The segfault I wrote about is the only one I see, dated Nov 24 last year. Since then no other segfaults have happened, although dsa_allocate failures happen daily. I'll report if anything occurs. I have core dumping set up. -- regards, Jakub Glapa
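
To put a number on "failures happen daily", the server log can be tallied by day. The following is a minimal sketch, not something posted in the thread: it matches the message format quoted elsewhere in this thread ("dsa_allocate could not find %zu free pages") and assumes each log line starts with an ISO date, which depends on the server's log_line_prefix setting. The function name and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Message text as it appears in the thread; the page count varies (7, 13, ...).
DSA_RE = re.compile(r"dsa_allocate could not find (\d+) free pages")
# Assumption: log lines begin with a YYYY-MM-DD date (depends on log_line_prefix).
DATE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})")


def count_dsa_failures(lines):
    """Return a Counter mapping day -> number of dsa_allocate failures."""
    per_day = Counter()
    for line in lines:
        if DSA_RE.search(line) is None:
            continue  # not a dsa_allocate failure
        m = DATE_RE.match(line)
        day = m.group(1) if m else "unknown"
        per_day[day] += 1
    return per_day


# Hypothetical sample lines, made up for illustration only.
sample_log = [
    "2019-02-17 03:10:01 UTC ERROR:  dsa_allocate could not find 7 free pages",
    "2019-02-17 03:10:01 UTC CONTEXT:  parallel worker",
    "2019-02-18 03:12:44 UTC ERROR:  dsa_allocate could not find 13 free pages",
]

for day, n in sorted(count_dsa_failures(sample_log).items()):
    print(day, n)
```

On a real system the list would be replaced by an open log file; lines without a leading date are grouped under "unknown" rather than dropped.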

Re: dsa_allocate() faliure

2019-02-17 Thread Justin Pryzby
Hi, On Mon, Nov 26, 2018 at 09:52:07AM -0600, Justin Pryzby wrote: > Hi, thanks for following through. > > On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote: > > I had a look at dmesg and indeed I see something like: > > > > postgres[30667]: segfault at 0 ip 557834264b16 sp

Re: dsa_allocate() faliure

2019-02-04 Thread Justin Pryzby
On Mon, Feb 04, 2019 at 08:31:47PM +, Arne Roland wrote: > I could take a backup and restore the relevant tables on a throwaway system. > You are just suggesting to replace line 728 > elog(FATAL, > "dsa_allocate could not find %zu free > pages", npages); > by

RE: dsa_allocate() faliure

2019-02-04 Thread Arne Roland
It's definitely a relatively complex pattern. The query I sent you last time was minimal with respect to predicates (so removing any single one of the predicates converted it into a working query). > Huh. Ok well that's a lot more frequent than I thought. Is it always the > same

Re: dsa_allocate() faliure

2019-02-04 Thread Thomas Munro
On Mon, Feb 4, 2019 at 6:52 PM Jakub Glapa wrote: > I see the error showing up every night on 2 different servers. But it's a bit > of a heisenbug because if I go there now it won't be reproducible. Huh. Ok well that's a lot more frequent than I thought. Is it always the same query? Any

Re: dsa_allocate() faliure

2019-02-03 Thread Jakub Glapa
Hi Thomas, I was one of the reporters in early December last year. I somehow dropped the ball and forgot about the issue. Anyhow, I upgraded the clusters to pg11.1 and nothing changed. I also have a rule in place to dump core, but no segfault happens while this is occurring. I see the error showing up

Re: dsa_allocate() faliure

2019-01-30 Thread Fabio Isabettini
Hi Thomas, it is a production system and we don’t have permanent access to it. Keeping auto_explain always on is also not an option in production. I will ask the customer to notify us as soon as the error presents itself, so that we can connect immediately and try to get a query plan. Regards

Re: dsa_allocate() faliure

2019-01-29 Thread Thomas Munro
On Tue, Jan 29, 2019 at 10:32 PM Fabio Isabettini wrote: > we are facing a similar issue on a production system using PostgreSQL 10.6: > > org.postgresql.util.PSQLException: ERROR: EXCEPTION on getstatistics ; ID: > EXCEPTION on getstatistics_media ; ID: uidatareader. > run_query_media(2):

Re: dsa_allocate() faliure

2019-01-28 Thread Thomas Munro
On Tue, Jan 29, 2019 at 2:50 AM Arne Roland wrote: > does anybody have any idea what goes wrong here? Is there some additional > information that could be helpful? Hi Arne, This seems to be a bug; that error should not be reached. I wonder if it is a different manifestation of the bug

RE: dsa_allocate() faliure

2019-01-28 Thread Arne Roland
Hello, does anybody have any idea what goes wrong here? Is there some additional information that could be helpful? All the best Arne Roland

Re: dsa_allocate() faliure

2018-11-26 Thread Jakub Glapa
Justin, thanks for the information! I'm running Ubuntu 16.04. I'll try to prepare for the next crash. Couldn't find anything this time. -- regards, Jakub Glapa On Mon, Nov 26, 2018 at 4:52 PM Justin Pryzby wrote: > Hi, thanks for following through. > > On Mon, Nov 26, 2018 at 04:38:35PM

Re: dsa_allocate() faliure

2018-11-26 Thread Justin Pryzby
Hi, thanks for following through. On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote: > I had a look at dmesg and indeed I see something like: > > postgres[30667]: segfault at 0 ip 557834264b16 sp 7ffc2ce1e030 > error 4 in postgres[557833db7000+6d5000] That's useful, I think
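
The dmesg line quoted above carries enough information to locate the faulting instruction inside the binary. In the Linux kernel's segfault message, the bracketed pair after the object name is the mapping's start address and size, so the instruction pointer minus that start gives an offset into the mapped object. A minimal sketch of that arithmetic follows; the function name is hypothetical and this is not code posted in the thread.

```python
import re

# Shape of the kernel's user-space segfault message as quoted in this thread:
#   segfault at <addr> ip <ip> sp <sp> error <code> in <object>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, offset of the faulting ip within its mapping)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the reported mapping")
    return m["obj"], offset


line = ("postgres[30667]: segfault at 0 ip 557834264b16 sp 7ffc2ce1e030 "
        "error 4 in postgres[557833db7000+6d5000]")
obj, offset = fault_offset(line)
print(f"{obj}+{offset:#x}")  # prints "postgres+0x4adb16"
```

Whether that offset maps directly to an address that tools such as addr2line or gdb can resolve depends on how the binary was built and mapped (for example, position-independent executable or not) and on having matching debug symbols, so treat the result as a starting point rather than a definitive location.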

Re: dsa_allocate() faliure

2018-11-23 Thread Justin Pryzby
On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote: > Hi Justin, I've upgraded to 10.6 but the error still shows up: > > If I set it to max_parallel_workers=0 I also get and my connection is being > closed (but the server is alive): > > psql db@host as user => set max_parallel_workers=0;

Re: dsa_allocate() faliure

2018-11-23 Thread Jakub Glapa
Hi Justin, I've upgraded to 10.6 but the error still shows up: psql db@host as user => select version(); version

Re: dsa_allocate() faliure

2018-11-22 Thread Justin Pryzby
On Wed, Nov 21, 2018 at 03:26:42PM +0100, Jakub Glapa wrote: > Looks like my email didn't match the right thread: > https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com > Any chance to get some feedback on this? In the related

Re: dsa_allocate() faliure

2018-11-13 Thread Jakub Glapa
Hi, I'm also experiencing the problem: dsa_allocate could not find 7 free pages CONTEXT: parallel worker I'm running: PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit query plan: (select statement over

Re: dsa_allocate() faliure

2018-10-04 Thread Thomas Munro
On Wed, Aug 29, 2018 at 5:48 PM Sand Stone wrote: > I attached a query (and its query plan) that caused the crash: "dsa_allocate > could not find 13 free pages" on one of the worker nodes. I anonymised the > query text a bit. Interestingly, this time only one (same one) of the nodes > is

Re: dsa_allocate() faliure

2018-08-28 Thread Sand Stone
I attached a query (and its query plan) that caused the crash: "dsa_allocate could not find 13 free pages" on one of the worker nodes. I anonymised the query text a bit. Interestingly, this time only one (same one) of the nodes is crashing. Since this is a production environment, I cannot get the

Re: dsa_allocate() faliure

2018-08-15 Thread Sand Stone
Just a follow-up: I tried the parallel execution again (in a stress test environment). Now the crash seems to be gone. I will keep an eye on this for the next few weeks. My theory is that the Citus cluster created and shut down a lot of TCP connections between coordinator and workers. If running on

Re: dsa_allocate() faliure

2018-05-22 Thread Sand Stone
>>dsa_allocate could not find 7 free pages I just got this error message again on all of my worker nodes (I am using Citus 7.4 rel). The PG core is my own build of release_10_stable (10.4) out of GitHub on Ubuntu. What's the best way to debug this? I am running pre-production tests for the next few

Re: dsa_allocate() faliure

2018-01-29 Thread Thomas Munro
On Tue, Jan 30, 2018 at 5:37 AM, Tom Lane wrote: > Rick Otten writes: >> I'm wondering if there is anything I can tune in my PG 10.1 database to >> avoid these errors: > >> $ psql -f failing_query.sql >> psql:failing_query.sql:46: ERROR:

Re: dsa_allocate() faliure

2018-01-29 Thread Tom Lane
Rick Otten writes: > I'm wondering if there is anything I can tune in my PG 10.1 database to > avoid these errors: > $ psql -f failing_query.sql > psql:failing_query.sql:46: ERROR: dsa_allocate could not find 7 free pages > CONTEXT: parallel worker Hmm. There's