Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra
On 10/11/2017 02:26 AM, pinker wrote: > Tomas Vondra-4 wrote >> I'm probably a bit dumb (after all, it's 1AM over here), but can you >> explain the CPU chart? I'd understand percentages (say, 75% CPU used) >> but what do the seconds / fractions mean? E.g. when the system time >> reaches 5

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Justin Pryzby
On Tue, Oct 10, 2017 at 01:40:07PM -0700, pinker wrote: > Hi to all! > > We've got problem with a very serious repetitive incident on our core > system. Namely, cpu load spikes to 300-400 and the whole db becomes > unresponsive. From db point of view nothing special is happening, memory > looks

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Andres Freund wrote > Others mentioned already that that's worth improving. Yes, we are just setting up pgbouncer Andres Freund wrote > Some versions of this kernel have had serious problems with transparent > hugepages. I'd try turning that off. I think it defaults to off even in > that

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Tomas Vondra-4 wrote > I'm probably a bit dumb (after all, it's 1AM over here), but can you > explain the CPU chart? I'd understand percentages (say, 75% CPU used) > but what do the seconds / fractions mean? E.g. when the system time > reaches 5 seconds, what does that mean? hehe, no you've just

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Andres Freund
Hi, On 2017-10-10 13:40:07 -0700, pinker wrote: > and the total number of connections are increasing very fast (but I suppose > it's the symptom not the root cause of cpu load) and exceed max_connections > (1000). Others mentioned already that that's worth improving. > System: > * CentOS Linux

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra
On 10/11/2017 12:28 AM, pinker wrote: > Tomas Vondra-4 wrote >> What is "CPU load"? Perhaps you mean "load average"? > > Yes, I wasn't exact: I mean system cpu usage, it can be seen here - it's the > graph from yesterday's failure (after 6p.m.): >

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Victor Yegorov wrote > Looks like `sdg` and `sdm` are the ones used most. > Can you describe what's on those devices? Do you have WAL and DB sitting > together? > Where DB log files are stored? it's multipath with the same LUN for PGDATA and pg_log, but separate one for xlogs and archives.

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread John R Pierce
On 10/10/2017 3:28 PM, pinker wrote: It was exactly my first guess. work_mem is set to ~ 350MB and I see a lot of stored procedures with unnecessary WITH clauses (i.e. materialization) and right after it IN query with results of that (hash). 1000 connections all doing queries that need 1

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Tomas Vondra-4 wrote > What is "CPU load"? Perhaps you mean "load average"? Yes, I wasn't exact: I mean system cpu usage, it can be seen here - it's the graph from yesterday's failure (after 6p.m.): So as one can see connections spikes

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Victor Yegorov
2017-10-11 0:53 GMT+03:00 pinker : > > Can you provide output of `iostat -myx 10` at the “peak” moments, please? > > sure, please find it here: > https://pastebin.com/f2Pv6hDL Looks like `sdg` and `sdm` are the ones used most. Can you describe what's on those devices? Do you

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Scott Marlowe-2 wrote > Ouch, unless I'm reading that wrong, your IO subsystem seems to be REALLY > slow. it's a huge array where a lot is happening, for instance data snapshots :/ the lun on which is this db is dm-7. I'm a DBA with null knowledge about arrays so any advice will be much

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Scott Marlowe
On Tue, Oct 10, 2017 at 3:53 PM, pinker wrote: > Victor Yegorov wrote >> Can you provide output of `iostat -myx 10` at the “peak” moments, please? > > sure, please find it here: > https://pastebin.com/f2Pv6hDL Ouch, unless I'm reading that wrong, your IO subsystem seems to be

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Victor Yegorov wrote > Can you provide output of `iostat -myx 10` at the “peak” moments, please? sure, please find it here: https://pastebin.com/f2Pv6hDL Victor Yegorov wrote > Also, it'd be good to look in more detailed bgwriter/checkpointer stats. > You can find more details in this post:

Re: [GENERAL] Error: "cached plan must not change result type"

2017-10-10 Thread Dmitry Dolgov
>On 9 Oct 2017 13:13, "Durumdara" wrote: > > "cached plan must not change result type" As far as I remember, this kind of errors you can get from a prepared statement execution, when the result type of it was changed (as in your case by adding a column to a table that

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra
On 10/10/2017 10:40 PM, pinker wrote: > Hi to all! > > We've got problem with a very serious repetitive incident on our core > system. Namely, cpu load spikes to 300-400 and the whole db becomes What is "CPU load"? Perhaps you mean "load average"? Also, what are the basic system parameters

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Victor Yegorov
2017-10-10 23:40 GMT+03:00 pinker : > We've got problem with a very serious repetitive incident on our core > system. Namely, cpu load spikes to 300-400 and the whole db becomes > unresponsive. From db point of view nothing special is happening, memory > looks fine, disks io's are

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Thank you Scott, we are planning to do it today. But are you sure it will help in this case? -- Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:

Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Scott Marlowe
On Tue, Oct 10, 2017 at 2:40 PM, pinker wrote: > Hi to all! > > We've got problem with a very serious repetitive incident on our core > system. Namely, cpu load spikes to 300-400 and the whole db becomes > unresponsive. From db point of view nothing special is happening, memory >

[GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Hi to all! We've got problem with a very serious repetitive incident on our core system. Namely, cpu load spikes to 300-400 and the whole db becomes unresponsive. From db point of view nothing special is happening, memory looks fine, disks io's are ok and the only problem is huge cpu load. Kernel

Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Tom Lane
Christophe Pettus writes: > I was able to reproduce this on 9.5.9 with the following: Hmm ... so I still can't reproduce the specific symptoms Christophe reports. What I see is that, given this particular test case, the backend process on the master never holds more than a

Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Nico Williams
On Mon, Oct 09, 2017 at 07:44:50PM -0400, Tom Lane wrote: > David Rowley writes: > > If the only reason that is_simple_subquery() rejects subqueries with > > ORDER BY is due to wanting to keep the order by of a view, then > > couldn't we make is_simple_subquery() a

Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Christophe Pettus
> On Oct 10, 2017, at 08:05, Tom Lane wrote: > > You're right, I was testing on HEAD, so that patch might've obscured > the problem. But the code looks like it could still be O(N^2) in > some cases. Will look again later. I was able to reproduce this on 9.5.9 with the

Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Tom Lane
Alvaro Herrera writes: > Tom Lane wrote: >> Hmm, I tried to reproduce this and could not. I experimented with >> various permutations of this: > This problem is probably related to commit 9b013dc238c, which AFAICS is > only in pg10, not 9.5. You're right, I was testing

Re: [GENERAL] [asking about how to upgrade docker postgresql without losing the data]

2017-10-10 Thread David G. Johnston
On Tue, Oct 10, 2017 at 4:25 AM, Olivani Prisila wrote: > Hi, > > I am beginner both of docker and postgresql. > > How do i upgrade docker postgresql 9.5 into 9.6 without losing my > current database? > fyi: im using ubuntu verison 14 and docker 17.09 > ​More of a Docker

Re: [GENERAL] Strange checkpoint behavior - checkpoints take alongtime

2017-10-10 Thread Vladimir Nicolici
In fact it was a single delete statement. From: Vladimir Nicolici Sent: Tuesday, October 10, 2017 17:30 To: Achilleas Mantzios; pgsql-general@postgresql.org Subject: RE: [GENERAL] Strange checkpoint behavior - checkpoints take alongtime No, it didn’t. The delete was done in a single transaction.

Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Alvaro Herrera
Tom Lane wrote: > Christophe Pettus writes: > > The problem indeed appear to be a very large number of subtransactions, > > each one creating a temp table, inside a single transaction. It's made > > worse by one of those transactions finally getting replayed on the > >

Re: [GENERAL] Strange checkpoint behavior - checkpoints take a longtime

2017-10-10 Thread Vladimir Nicolici
No, it didn’t. The delete was done in a single transaction. From: Achilleas Mantzios Sent: Tuesday, October 10, 2017 17:18 To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Strange checkpoint behavior - checkpoints take a longtime Hello Vladimir, maybe your update triggered auto_vacuum on

Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-10 Thread Achilleas Mantzios
Hello Vladimir, maybe your update triggered auto_vacuum on those tables ? Default autovacuum_freeze_max_age is exactly set at 200,000,000 . Did you check your vacuum stats afterwards (pg_stat_*_tables) ? Can you show the code which performed the deletes? On 10/10/2017 16:56, Vladimir Nicolici

Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-10 Thread Vladimir Nicolici
I experimented some more with the settings this weekend, while doing some large write operations (deleting 200 million records from a table), and I realized that the database is capable of generating much more WAL than I estimated. And it seems that spikes in write activity, when longer than a

[GENERAL] How to sort in pgAdmin 4

2017-10-10 Thread Thomasine Dade (tdade)
I am using pgAdmin 4 v2 on Mac OS X 10.12.6. When I filter my data so that I can edit it, it sets a default sort. The text of the query is displayed but it’s grayed so I cannot edit the ORDER BY statement. How can I change the order by? In pgAdmin 3, I had a “Data Sorting” option but do not

[GENERAL] Trigger function problem

2017-10-10 Thread Liglio Cavalcante
Hi, I am trying to migrate data from a master table to her partitions. I am using an update trigger to delete and insert into the master table, and so an insert trigger on the master table redirects the inserted registers to the respective parrtition table. The problem is that the update trigger

Re: [GENERAL] Permissions for Web App

2017-10-10 Thread Stephen Frost
Greetings, * Igal @ Lucee.org (i...@lucee.org) wrote: > It worked, thanks! Be sure to check that you're really getting what you want here. > For future reference and for the benefit of others, the command that > I ran is: > >   ALTER DEFAULT PRIVILEGES IN SCHEMA public >     GRANT SELECT,

Re: [GENERAL] pg_start/stop_backup naming conventions

2017-10-10 Thread Stephen Frost
Greetings, * mj0nes (matthew.jo...@ramtech.co.uk) wrote: > I'm just starting out on a rolling backup strategy and the naming convention > has thrown me slightly for the WAL and "backup_label" files. > > What I want to do is pair up the backup label files with the associated tar > ball of the

Re: [GENERAL] Using cp to back up a database?

2017-10-10 Thread Stephen Frost
Ron, * Ron Johnson (ron.l.john...@cox.net) wrote: > Maybe my original question wasn't clear, so I'll try again: is it > safe to do a physical using cp (as opposed to rsync)? Frankly, I'd say no. There's nothing to guarantee that the backup is actually sync'd out to disk. Further, you're

[GENERAL] [asking about how to upgrade docker postgresql without losing the data]

2017-10-10 Thread Olivani Prisila
Hi, I am beginner both of docker and postgresql. How do i upgrade docker postgresql 9.5 into 9.6 without losing my current database? fyi: im using ubuntu verison 14 and docker 17.09 thanks. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your

Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Kim Rose Carlsen
> If the only reason that is_simple_subquery() rejects subqueries with > ORDER BY is due to wanting to keep the order by of a view, then > couldn't we make is_simple_subquery() a bit smarter and have it check > if the subquery is going to be joined to something else, which likely > would destroy

Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Kim Rose Carlsen
> You would benefit from adding the age column to view_customer, or at > least consider having some view which contains all the columns you'll > ever need from those tables and if you need special views with only a > subset of columns due to some software doing "select * from > viewname;", then