Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra


On 10/11/2017 02:26 AM, pinker wrote:
> Tomas Vondra-4 wrote
>> I'm probably a bit dumb (after all, it's 1AM over here), but can you
>> explain the CPU chart? I'd understand percentages (say, 75% CPU used)
>> but what do the seconds / fractions mean? E.g. when the system time
>> reaches 5 seconds, what does that mean?
> 
> hehe, no, you've just spotted a mistake, it's supposed to be 50 cores :)
> out of 80 in total
> 

Ah, so it should say '50 cores' instead of '5s'? Well, that's a busy
system, I guess.

> 
> Tomas Vondra-4 wrote
>> Have you tried profiling using perf? That usually identifies hot spots
>> pretty quickly - either in PostgreSQL code or in the kernel.
> 
> I was always afraid because of overhead, but maybe it's time to start ...
> 

I don't follow. If you're not in trouble, a little bit of additional
overhead is not an issue (but you generally don't need profiling at that
moment). If you're already in trouble, then spending a bit of CPU time
on a basic CPU profile is certainly worth it.

> 
> Tomas Vondra-4 wrote
>> What I meant is that if the system evicts this amount of buffers all the
>> time (i.e. there doesn't seem to be any sudden spike), then it's
>> unlikely to be the cause (or related to it).
> 
> I was actually thinking about a scenario where different sessions
> want to read from or write to many different relfilenodes at the same time,
> which could cause pages to be swapped between shared buffers and the OS cache?
Perhaps. If the sessions only do reads, that would not be visible in
buffers_backend, I believe (not sure ATM, would have to check the source).
But it'd be visible in buffers_alloc and certainly in blks_read.
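
A minimal sketch of how those counters could be sampled (assuming the standard
pg_stat_bgwriter and pg_stat_database views; 'mydb' is a placeholder database
name) - run it a couple of times and diff the numbers:

SELECT now() AS sample_time,
       bg.buffers_alloc,
       bg.buffers_backend,
       db.blks_read,
       db.blks_hit
  FROM pg_stat_bgwriter bg,
       pg_stat_database db
 WHERE db.datname = 'mydb';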

> We see that context switches on the CPUs are increasing as well. The kernel
> documentation says that walking the page tables instead of hitting the
> Translation Lookaside Buffer (TLB) is very costly, and on some blogs I have
> seen recommendations that using huge pages (so more addresses can fit in the
> TLB) will help here, but PostgreSQL, unlike Oracle, cannot use them for
> anything other than page buffering (so 16GB) ... so process memory
> still needs to use 4k pages.
>

The context switches are likely due to the large number of runnable
processes competing for the CPU.

Also, memory bandwidth is increasingly an issue on big boxes ...

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Justin Pryzby
On Tue, Oct 10, 2017 at 01:40:07PM -0700, pinker wrote:
> Hi to all!
> 
> We've got problem with a very serious repetitive incident on our core
> system. Namely, cpu load spikes to 300-400 and the whole db becomes
> unresponsive. From db point of view nothing special is happening, memory
> looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
> parameters that are increasing with load are always the same:

> * disabled transparent huge pages (they were set before unfortunately to
> 'always')

Did you also try disabling KSM ?
echo 2 |sudo tee /sys/kernel/mm/ksm/run

I believe for us that was affecting a postgres VM (QEMU/KVM) and maybe not
postgres itself.  Worth a try?

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Justin




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Andres Freund wrote
> Others mentioned already that that's worth improving.

Yes, we are just setting up pgbouncer


Andres Freund wrote
> Some versions of this kernel have had serious problems with transparent
> hugepages. I'd try turning that off. I think it defaults to off even in
> that version, but also make sure zone_reclaim_mode is disabled.

Yes, I'm aware of that, so I always set it to 'never'.
But thank you for the zone_reclaim_mode tip.



Andres Freund wrote
> 9.6 has quite some scalability improvements over 9.5. I don't know
> whether it's feasible for you to update, but if so, it's worth trying.
> 
> How about taking a perf profile to investigate?

Both are on my to do list :)



Andres Freund wrote
> I'd suggest monitoring /proc/meminfo for the amount of Dirty and
> Writeback memory, and see whether rapid changes therein coincide with
> periods of slowdown.

Yes, I was monitoring it the whole day and that's the reason why I
changed dirty_background_ratio, but both of them were flat - without any
bigger spikes.












Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Tomas Vondra-4 wrote
> I'm probably a bit dumb (after all, it's 1AM over here), but can you
> explain the CPU chart? I'd understand percentages (say, 75% CPU used)
> but what do the seconds / fractions mean? E.g. when the system time
> reaches 5 seconds, what does that mean?

hehe, no, you've just spotted a mistake, it's supposed to be 50 cores :)
out of 80 in total


Tomas Vondra-4 wrote
> Have you tried profiling using perf? That usually identifies hot spots
> pretty quickly - either in PostgreSQL code or in the kernel.

I was always afraid because of overhead, but maybe it's time to start ...


Tomas Vondra-4 wrote
> What I meant is that if the system evicts this amount of buffers all the
> time (i.e. there doesn't seem to be any sudden spike), then it's
> unlikely to be the cause (or related to it).

I was actually thinking about a scenario where different sessions want to
read from or write to many different relfilenodes at the same time, which could
cause pages to be swapped between shared buffers and the OS cache? We see that
context switches on the CPUs are increasing as well. The kernel documentation
says that walking the page tables instead of hitting the Translation Lookaside
Buffer (TLB) is very costly, and on some blogs I have seen recommendations that
using huge pages (so more addresses can fit in the TLB) will help here, but
PostgreSQL, unlike Oracle, cannot use them for anything other than page
buffering (so 16GB) ... so process memory still needs to use 4k pages.
Or memory fragmentation?








Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Andres Freund
Hi,

On 2017-10-10 13:40:07 -0700, pinker wrote:
> and the total number of connections is increasing very fast (but I suppose
> it's the symptom, not the root cause, of the CPU load) and exceeds
> max_connections (1000).

Others mentioned already that that's worth improving.

> System:
> * CentOS Linux release 7.2.1511 (Core) 
> * Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux

Some versions of this kernel have had serious problems with transparent
hugepages. I'd try turning that off. I think it defaults to off even in
that version, but also make sure zone_reclaim_mode is disabled.


> * postgresql95-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-docs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-server-9.5.5-1PGDG.rhel7.x86_64
> 
> * 4 sockets/80 cores

9.6 has quite some scalability improvements over 9.5. I don't know
whether it's feasible for you to update, but if so, it's worth trying.

How about taking a perf profile to investigate?


> * vm.dirty_background_bytes = 0
> * vm.dirty_background_ratio = 2
> * vm.dirty_bytes = 0
> * vm.dirty_expire_centisecs = 3000
> * vm.dirty_ratio = 20
> * vm.dirty_writeback_centisecs = 500

I'd suggest monitoring /proc/meminfo for the amount of Dirty and
Writeback memory, and see whether rapid changes therein coincide with
periods of slowdown.


Greetings,

Andres Freund




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra


On 10/11/2017 12:28 AM, pinker wrote:
> Tomas Vondra-4 wrote
>> What is "CPU load"? Perhaps you mean "load average"?
> 
> Yes, I wasn't exact: I mean system cpu usage, it can be seen here - it's the
> graph from yesterday's failure (after 6p.m.):
>  
> So as one can see connections spikes follow cpu spikes...
> 

I'm probably a bit dumb (after all, it's 1AM over here), but can you
explain the CPU chart? I'd understand percentages (say, 75% CPU used)
but what do the seconds / fractions mean? E.g. when the system time
reaches 5 seconds, what does that mean?

> 
> Tomas Vondra-4 wrote
>> Also, what are the basic system parameters (number of cores, RAM), it's
>> difficult to help without knowing that.
> 
> I have actually written everything in the first post:
> 80 CPU and 4 sockets
> over 500GB RAM
> 

Apologies, missed that bit.

> 
> Tomas Vondra-4 wrote
>> Well, 3M transactions over ~2h period is just ~450tps, so nothing
>> extreme. Not sure how large the transactions are, of course.
> 
> It's quite a lot going on. Most of them are complicated stored procedures.
> 

OK.

> 
> Tomas Vondra-4 wrote
>> Something gets executed on the database. We have no idea what it is, but
>> it should be in the system logs. And you should see the process in 'top'
>> with large amounts of virtual memory ...
> 
> Yes, it would be much easier if it were just a single query at the top,
> but most of the CPU is eaten by the system itself and I'm not sure why. I
> suppose the page table size and anon pages are NUMA related.
> 

Have you tried profiling using perf? That usually identifies hot spots
pretty quickly - either in PostgreSQL code or in the kernel.

> 
> Tomas Vondra-4 wrote
>> Another possibility is a run-away query that consumes a lot of work_mem.
> 
> It was exactly my first guess. work_mem is set to ~350MB and I see a lot of
> stored procedures with unnecessary WITH clauses (i.e. materialization) and,
> right after them, IN queries over those (hashed) results.
> 

Depends on how much data is in the CTEs. We don't really allocate all of
work_mem at once, but bit by bit.

> 
> Tomas Vondra-4 wrote
>> Measure cache hit ratio (see pg_stat_database.blks_hit and blks_read),
>> and then you can decide.
> 
> Thank you for the tip. I always do it but hadn't here; the result is
> 0.992969610990056 - so increasing it is rather pointless.
> 

Yeah.

> 
> Tomas Vondra-4 wrote
>> You may also make the bgwriter more aggressive - that won't really
>> improve the hit ratio, it will only make enough room for the backends.
> 
> Yes, I probably will.
> 

On the other hand, the numbers are rather low. I mean, the backends
seem to be evicting ~15k buffers over a 5-minute period, which is pretty
much nothing (~400kB/s). I wouldn't bother tuning this.

> 
> Tomas Vondra-4 wrote
>> But I don't quite see how this could cause the severe problems you have,
>> as I assume this is kinda regular behavior on that system. Hard to say
>> without more data.
> 
> I can provide you with any data you need :)
> 

What I meant is that if the system evicts this amount of buffers all the
time (i.e. there doesn't seem to be any sudden spike), then it's
unlikely to be the cause (or related to it).

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Victor Yegorov wrote
> Looks like `sdg` and `sdm` are the ones used most.
> Can you describe what's on those devices? Do you have WAL and DB sitting
> together?
> Where DB log files are stored?

It's multipath with the same LUN for PGDATA and pg_log, but a separate one for
xlogs and archives.

mpatha  dm-4 IBM ,2145
size=2.0T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 7:0:1:2 sdg 8:96  active undef running
| `- 8:0:1:2 sdm 8:192 active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 7:0:0:2 sdd 8:48  active undef running
  `- 8:0:0:2 sdj 8:144 active undef running



Victor Yegorov wrote
> Can you, please, provide the output of this query (linked from the article
> mentioned):
> https://gist.github.com/lesovsky/4587d70f169739c01d4525027c087d14

00:26:51.226024|120 days
03:05:37.987175|0.6|7.99|300.63|0.46|12673500.4|162.00|0.34|0.51|0.37|1.22|26.721|27.7|41.8|30.6|4.47|34.27|--|21532|124|6510377185|9920323|449049896|677360078|2321057|495798075|0


Victor Yegorov wrote
> And also this query:
> SELECT name,version,source FROM pg_settings WHERE source NOT IN
> ('default','override');

application_name             | client               | psql
archive_command              | configuration file   | 
archive_mode                 | configuration file   | on
autovacuum                   | configuration file   | on
autovacuum_max_workers       | configuration file   | 10
checkpoint_completion_target | configuration file   | 0.9
checkpoint_timeout           | configuration file   | 480
client_encoding              | client               | UTF8
DateStyle                    | configuration file   | ISO, MDY
default_statistics_target    | configuration file   | 350
default_text_search_config   | configuration file   | pg_catalog.english
effective_cache_size         | configuration file   | 52428800
enable_indexscan             | configuration file   | on
huge_pages                   | configuration file   | on
lc_messages                  | configuration file   | en_US.UTF-8
lc_monetary                  | configuration file   | en_US.UTF-8
lc_numeric                   | configuration file   | en_US.UTF-8
lc_time                      | configuration file   | en_US.UTF-8
listen_addresses             | configuration file   | *
log_autovacuum_min_duration  | configuration file   | 0
log_checkpoints              | configuration file   | on
log_connections              | configuration file   | on
log_destination              | configuration file   | stderr
log_directory                | configuration file   | pg_log
log_disconnections           | configuration file   | on
log_duration                 | configuration file   | off
log_filename                 | configuration file   | postgresql-%a.log
log_line_prefix              | configuration file   | %t [%p]: [%l-1] user=%u,db=%d
log_lock_waits               | configuration file   | on
log_min_duration_statement   | configuration file   | 0
log_rotation_age             | configuration file   | 1440
log_rotation_size            | configuration file   | 0
log_temp_files               | configuration file   | 0
log_timezone                 | configuration file   | Poland
log_truncate_on_rotation     | configuration file   | on
logging_collector            | configuration file   | on
maintenance_work_mem         | configuration file   | 2097152
max_connections              | configuration file   | 1000
max_stack_depth              | environment variable | 2048
max_wal_senders              | configuration file   | 10
max_wal_size                 | configuration file   | 640
random_page_cost             | configuration file   | 1
shared_buffers               | configuration file   | 2097152
temp_buffers                 | configuration file   | 16384
TimeZone                     | configuration file   | Poland
track_functions              | configuration file   | all
track_io_timing              | configuration file   | off
wal_buffers                  | configuration file   | 2048
wal_keep_segments            | configuration file   | 150
wal_level                    | configuration file   | hot_standby
work_mem                     | configuration file   | 393216+










Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread John R Pierce

On 10/10/2017 3:28 PM, pinker wrote:

It was exactly my first guess. work_mem is set to ~350MB and I see a lot of
stored procedures with unnecessary WITH clauses (i.e. materialization) and,
right after them, IN queries over those (hashed) results.


1000 connections all doing queries that need 1 work_mem each will
consume 1000*350MB == 350GB of your RAM. Many queries use several
work_mem allocations.


If the vast majority of your operations are OLTP and only access a few
rows, then a large work_mem is NOT a good idea. If you're doing large
aggregate operations like OLAP for reporting or whatever, then that's
another story, but generally doing that sort of thing does NOT use 1000
connections.



--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Tomas Vondra-4 wrote
> What is "CPU load"? Perhaps you mean "load average"?

Yes, I wasn't exact: I mean system cpu usage, it can be seen here - it's the
graph from yesterday's failure (after 6p.m.):
 
So as one can see connections spikes follow cpu spikes...


Tomas Vondra-4 wrote
> Also, what are the basic system parameters (number of cores, RAM), it's
> difficult to help without knowing that.

I have actually written everything in the first post:
80 CPU and 4 sockets
over 500GB RAM


Tomas Vondra-4 wrote
> Well, 3M transactions over ~2h period is just ~450tps, so nothing
> extreme. Not sure how large the transactions are, of course.

It's quite a lot going on. Most of them are complicated stored procedures.


Tomas Vondra-4 wrote
> Something gets executed on the database. We have no idea what it is, but
> it should be in the system logs. And you should see the process in 'top'
> with large amounts of virtual memory ...

Yes, it would be much easier if it were just a single query at the top,
but most of the CPU is eaten by the system itself and I'm not sure why. I
suppose the page table size and anon pages are NUMA related.



Tomas Vondra-4 wrote
> Another possibility is a run-away query that consumes a lot of work_mem.

It was exactly my first guess. work_mem is set to ~350MB and I see a lot of
stored procedures with unnecessary WITH clauses (i.e. materialization) and,
right after them, IN queries over those (hashed) results.



Tomas Vondra-4 wrote
> Measure cache hit ratio (see pg_stat_database.blks_hit and blks_read),
> and then you can decide.

Thank you for the tip. I always do it but hadn't here; the result is
0.992969610990056 - so increasing it is rather pointless.


Tomas Vondra-4 wrote
> You may also make the bgwriter more aggressive - that won't really
> improve the hit ratio, it will only make enough room for the backends.

Yes, I probably will.


Tomas Vondra-4 wrote
> But I don't quite see how this could cause the severe problems you have,
> as I assume this is kinda regular behavior on that system. Hard to say
> without more data.

I can provide you with any data you need :)











Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Victor Yegorov
2017-10-11 0:53 GMT+03:00 pinker :

> > Can you provide output of `iostat -myx 10` at the “peak” moments, please?
>
> sure, please find it here:
> https://pastebin.com/f2Pv6hDL


Looks like `sdg` and `sdm` are the ones used most.
Can you describe what's on those devices? Do you have WAL and DB sitting
together?
Where DB log files are stored?


> Here you'll find the
> full bgwriter stats: https://pastebin.com/VA8pyfXj


Can you, please, provide the output of this query (linked from the article
mentioned):
https://gist.github.com/lesovsky/4587d70f169739c01d4525027c087d14

And also this query:
SELECT name,version,source FROM pg_settings WHERE source NOT IN
('default','override');


-- 
Victor Yegorov


Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Scott Marlowe-2 wrote
> Ouch, unless I'm reading that wrong, your IO subsystem seems to be REALLY
> slow.

It's a huge array where a lot is happening, for instance data snapshots :/
The LUN on which this db sits is dm-7.
I'm a DBA with no knowledge about storage arrays, so any advice will be much
appreciated :)








Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Scott Marlowe
On Tue, Oct 10, 2017 at 3:53 PM, pinker  wrote:
> Victor Yegorov wrote
>> Can you provide output of `iostat -myx 10` at the “peak” moments, please?
>
> sure, please find it here:
> https://pastebin.com/f2Pv6hDL

Ouch, unless I'm reading that wrong, your IO subsystem seems to be REALLY slow.




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Victor Yegorov wrote
> Can you provide output of `iostat -myx 10` at the “peak” moments, please?

sure, please find it here:
https://pastebin.com/f2Pv6hDL


Victor Yegorov wrote
> Also, it'd be good to look in more detailed bgwriter/checkpointer stats.
> You can find more details in this post: http://blog.postgresql-
> consulting.com/2017/03/deep-dive-into-postgres-stats_27.html
> (You might want to reset 'shared' stats here.)

Thank you for the link, it's a really nice explanation. Here you'll find the
full bgwriter stats: https://pastebin.com/VA8pyfXj











Re: [GENERAL] Error: "cached plan must not change result type"

2017-10-10 Thread Dmitry Dolgov
>On 9 Oct 2017 13:13, "Durumdara"  wrote:
>
> "cached plan must not change result type"

As far as I remember, you can get this kind of error from executing a
prepared statement whose result type has changed (as in your case, by
adding a column to the table that provides the data for the prepared
statement).
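
For illustration, a minimal way to reproduce that error with an explicit
prepared statement could look like this (table and statement names are just
placeholders):

CREATE TABLE t (a int);
PREPARE q AS SELECT * FROM t;
EXECUTE q;                       -- works, the result type is (a int)
ALTER TABLE t ADD COLUMN b int;  -- changes the result type of SELECT *
EXECUTE q;                       -- ERROR: cached plan must not change result type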


Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Tomas Vondra


On 10/10/2017 10:40 PM, pinker wrote:
> Hi to all!
> 
> We've got problem with a very serious repetitive incident on our core
> system. Namely, cpu load spikes to 300-400 and the whole db becomes

What is "CPU load"? Perhaps you mean "load average"?

Also, what are the basic system parameters (number of cores, RAM), it's
difficult to help without knowing that.

> unresponsive. From db point of view nothing special is happening, memory
> looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
> parameters that are increasing with load are always the same:
> 
> * page tables size
> * Committed_AS
> * Active anon
> 
>  
> 
> and the total number of connections is increasing very fast (but I suppose
> it's the symptom, not the root cause, of the CPU load) and exceeds
> max_connections (1000).
> 

I doubt you have 1000 cores in your system, so 1000 connections active
at the same time is guaranteed to cause issues. What we see quite often
is a minor hiccup (occasional slow query) snowballing into much more
serious trouble exactly because of this.

Queries get slower for some reason, application starts opening more
connections (through a built-in connection pool) to run more queries,
that further increases pressure, slows the queries even more, ...

As Scott suggested, you should consider using a connection pool.

> System:
> * CentOS Linux release 7.2.1511 (Core) 
> * Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
> * postgresql95-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-docs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-server-9.5.5-1PGDG.rhel7.x86_64
> 
> * 4 sockets/80 cores
> * vm.dirty_background_bytes = 0
> * vm.dirty_background_ratio = 2
> * vm.dirty_bytes = 0
> * vm.dirty_expire_centisecs = 3000
> * vm.dirty_ratio = 20
> * vm.dirty_writeback_centisecs = 500
> 
> after the first incident we have changed:
> * increased shared_buffers to 16GB (completely on huge pages. previously
> 2GB)
> * adjusted vm.nr_hugepages to 8000 (we've got 2mb pages)
> * changed vm.overcommit_memory = 2 and vm.overcommit_ratio = 99
> * disabled transparent huge pages (they were set before unfortunately to
> 'always')
> 
> 
> It's a highly transactional db. Today I've run:
> select now(), txid_current();
> and the results:
> 3 339 351 transactions between 2017-10-10 14:42 and 2017-10-10 16:24
> 

Well, 3M transactions over ~2h period is just ~450tps, so nothing
extreme. Not sure how large the transactions are, of course.

> ... snip ...
> 
> So, from the kernel stats we know that the failure happens when the db is
> trying to allocate some huge amount of pages (page tables size, anons,
> Committed_AS). But what is triggering this situation?
>

Something gets executed on the database. We have no idea what it is, but
it should be in the system logs. And you should see the process in 'top'
with large amounts of virtual memory ...

> I suppose it could be lazy autovacuum (just standard settings). So
> autovacuum had to read the whole 369GB yesterday to clean xids; today it did
> the same on some other tables.

Possible, but it shouldn't allocate more than maintenance_work_mem. So I
don't see why it would allocate so much virtual memory.

Another possibility is a run-away query that consumes a lot of work_mem.
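
For completeness, one way to look for such a runaway query is a quick glance at
pg_stat_activity, something along these lines (column names as of 9.5):

SELECT pid, usename, state,
       now() - query_start AS runtime,
       left(query, 80) AS query
  FROM pg_stat_activity
 WHERE state <> 'idle'
 ORDER BY runtime DESC
 LIMIT 20;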

> Another idea is too small shared buffers setting. 

... snip ...

> bgwriter stats:
>  
> 

Yes, this suggests you probably have shared_buffers set too low, but
it's impossible to say if increasing the size will help - perhaps your
active set (part of DB you regularly access) is way too big.

Measure cache hit ratio (see pg_stat_database.blks_hit and blks_read),
and then you can decide.
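
A rough sketch of that calculation (per database, using pg_stat_database;
'mydb' is a placeholder):

SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS cache_hit_ratio
  FROM pg_stat_database
 WHERE datname = 'mydb';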

You may also make the bgwriter more aggressive - that won't really
improve the hit ratio, it will only make enough room for the backends.

But I don't quite see how this could cause the severe problems you have,
as I assume this is kinda regular behavior on that system. Hard to say
without more data.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Victor Yegorov
2017-10-10 23:40 GMT+03:00 pinker :

> We've got problem with a very serious repetitive incident on our core
> system. Namely, cpu load spikes to 300-400 and the whole db becomes
> unresponsive. From db point of view nothing special is happening, memory
> looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
> parameters that are increasing with load are always the same:
>

Can you provide output of `iostat -myx 10` at the “peak” moments, please?

Also, it'd be good to look in more detailed bgwriter/checkpointer stats.
You can find more details in this post: http://blog.postgresql-
consulting.com/2017/03/deep-dive-into-postgres-stats_27.html
(You might want to reset 'shared' stats here.)


-- 
Victor Yegorov


Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Thank you Scott,
we are planning to do it today. But are you sure it will help in this case?







Re: [GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread Scott Marlowe
On Tue, Oct 10, 2017 at 2:40 PM, pinker  wrote:
> Hi to all!
>
> We've got problem with a very serious repetitive incident on our core
> system. Namely, cpu load spikes to 300-400 and the whole db becomes
> unresponsive. From db point of view nothing special is happening, memory
> looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
> parameters that are increasing with load are always the same:

The solution here is to reduce the number of connections usually via
some kind of connection pooling. Any db server will have a max
throughput at around the number of cpu cores == connections (give or
take a factor of 2). Outside that performance falls off, and has a
very sharp knee on the other side as the # of conns goes up.

Reduce connections and the db runs faster. Increase them and it slows down
until it eventually falls over.

pgbouncer and pgpool II are useful on the db end, look at pooling
options on the app side as well.




[GENERAL] core system is getting unresponsive because over 300 cpu load

2017-10-10 Thread pinker
Hi to all!

We've got problem with a very serious repetitive incident on our core
system. Namely, cpu load spikes to 300-400 and the whole db becomes
unresponsive. From db point of view nothing special is happening, memory
looks fine, disks io's are ok and the only problem is huge cpu load. Kernel
parameters that are increasing with load are always the same:

* page tables size
* Committed_AS
* Active anon

 

and the total number of connections is increasing very fast (but I suppose
it's the symptom, not the root cause, of the CPU load) and exceeds
max_connections (1000).

System:
* CentOS Linux release 7.2.1511 (Core) 
* Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux
* postgresql95-9.5.5-1PGDG.rhel7.x86_64
* postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
* postgresql95-docs-9.5.5-1PGDG.rhel7.x86_64
* postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
* postgresql95-server-9.5.5-1PGDG.rhel7.x86_64

* 4 sockets/80 cores
* vm.dirty_background_bytes = 0
* vm.dirty_background_ratio = 2
* vm.dirty_bytes = 0
* vm.dirty_expire_centisecs = 3000
* vm.dirty_ratio = 20
* vm.dirty_writeback_centisecs = 500

after the first incident we have changed:
* increased shared_buffers to 16GB (completely on huge pages. previously
2GB)
* adjusted vm.nr_hugepages to 8000 (we've got 2mb pages)
* changed vm.overcommit_memory = 2 and vm.overcommit_ratio = 99
* disabled transparent huge pages (they were set before unfortunately to
'always')


It's a highly transactional db. Today I've run:
select now(), txid_current();
and the results:
3 339 351 transactions between 2017-10-10 14:42 and 2017-10-10 16:24

* db size 1,1TB
* RAM over 500GB
* biggest tables (the rest isn't big):
369 GB
48 GB
48 GB
34 GB
23 GB
19 GB
19 GB
17 GB
16 GB
12 GB
9910 MB

We have captured some of db statistics, for instance bgwriter and
buffercache.
Today the load spides happened at:
1). 10:44
2). 11:04
(and then several times during a day)
The premiere was yesterday about 6PM.

What we observed back then was, for instance, an autovacuum process running to
prevent wraparound on the biggest table (369GB). We did a manual VACUUM FREEZE
after this happened, but before that we gathered statistics with this query:
SELECT
    oid::regclass::text AS table,
    age(relfrozenxid) AS xid_age,
    mxid_age(relminmxid) AS mxid_age,
    least(
        (SELECT setting::int
         FROM   pg_settings
         WHERE  name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
        (SELECT setting::int
         FROM   pg_settings
         WHERE  name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
    ) AS tx_before_wraparound_vacuum,
    pg_size_pretty(pg_total_relation_size(oid)) AS size,
    pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
FROM pg_class
ORDER BY tx_before_wraparound_vacuum;

and the biggest table which was vacuumed looked like
(xid_age | mxid_age | tx_before_wraparound_vacuum | size | last_autovacuum):
217310511 | 8156548 | -17310511 | 369 GB | 2017-09-30 01:57:33.972068+02


So, from the kernel stats we know that the failure happens when the db is
trying to allocate some huge amount of pages (page tables size, anons,
Committed_AS). But what is triggering this situation?
I suppose it could be lazy autovacuum (just standard settings). So
autovacuum had to read the whole 369GB yesterday to clean xids; today it did
the same on some other tables.
Another idea is too small shared buffers setting. 
Today it looked like:
 

c means count; the number after c is the usage count, so c5dirty here means
the count of dirty pages with usagecount=5.
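
(For reference, counts like these can be produced with something along these
lines, assuming the pg_buffercache extension is installed:

SELECT usagecount, isdirty, count(*)
  FROM pg_buffercache
 GROUP BY usagecount, isdirty
 ORDER BY usagecount, isdirty;
)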

that is the snapshot before and after the failure at 10:44

before and after the spike at 11:04:
 


My interpretation of it is the following:
the count of clean buffers with a high usagecount is decreasing, and the count
of buffers with a usagecount of 0 and 1 is very unstable -> so the buffers have
no time to age in shared buffers before they are thrown out?

bgwriter stats:
 

the biggest number of buffers is cleaned by the backends - so there are no
free buffers with usagecount 0 and LWLock contention happens?

So would increasing shared_buffers be a solution?
Please help, it's happening quite often and I'm not sure which way is the
right one...












Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Tom Lane
Christophe Pettus  writes:
> I was able to reproduce this on 9.5.9 with the following:

Hmm ... so I still can't reproduce the specific symptoms Christophe
reports.

What I see is that, given this particular test case, the backend
process on the master never holds more than a few locks at a time.
Each time we abort a subtransaction, the AE lock it was holding
on the temp table it created gets dropped.  However ... on the
standby server, pre v10, the replay process attempts to take all
12000 of those AE locks at once.  This is not a great plan.
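
(As an aside, a hedged way to watch this on a standby, while the lock table
still has room, is simply counting the AccessExclusiveLock entries:

SELECT count(*) AS ae_locks
  FROM pg_locks
 WHERE mode = 'AccessExclusiveLock';

though once the lock table is exhausted, querying pg_locks itself can fail with
the same out-of-shared-memory error, as reported elsewhere in this thread.)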

On 9.5, for me, as soon as we're out of shared memory
ResolveRecoveryConflictWithLock will go into an infinite loop.
And AFAICS it *is* infinite; it doesn't look to me like it's
making any progress.  This is pretty easy to diagnose though
because it spews "out of shared memory" WARNING messages to the
postmaster log at an astonishing rate.

9.6 hits the OOM condition as well, but manages to get out of it
somehow.  I'm not very clear how, and the log trace doesn't look
like it's real clean: after a bunch of these

WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C1098AC0 for Standby/LOCK: xid 134024 db 423347 rel 
524106 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10A97E0 for Standby/LOCK: xid 134024 db 423347 rel 
524151 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10B36B0 for Standby/LOCK: xid 134024 db 423347 rel 
524181 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10BD780 for Standby/LOCK: xid 134024 db 423347 rel 
524211 

you get a bunch of these

WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock 
manager: xid 134024 database 423347 relation 526185
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04
WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock 
manager: xid 134024 database 423347 relation 526188
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04
WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock 
manager: xid 134024 database 423347 relation 526191
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 
15:05:56.615721-04

The important point though is that "a bunch" is a finite number,
whereas 9.5 seems to be just stuck.  I'm not sure how Christophe's
server managed to continue to make progress.

It looks like the 9.6-era patch 37c54863c must have been responsible
for that behavioral change.  There's no indication in the commit message
or the comments that anyone had specifically considered the OOM
scenario, so I think it's just accidental that it's better.

v10 and HEAD avoid the problem because the standby server doesn't
take locks (any at all, AFAICS).  I suppose this must be a
consequence of commit 9b013dc238c, though I'm not sure exactly how.

Anyway, it's pretty scary that it's so easy to run the replay process
out of shared memory pre-v10.  I wonder if we should consider
backpatching that fix.  Any situation where the replay process takes
more locks concurrently than were ever held on the master is surely
very bad news.

A marginally lesser concern is that the replay process does need to have
robust behavior in the face of locktable OOM.  AFAICS whatever it is doing
now is just accidental, and I'm not sure it's correct.  "Doesn't get into
an infinite loop" is not a sufficiently high bar.

And I'm still wondering exactly what Christophe actually saw ...

regards, tom lane




Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Nico Williams
On Mon, Oct 09, 2017 at 07:44:50PM -0400, Tom Lane wrote:
> David Rowley  writes:
> > If the only reason that is_simple_subquery() rejects subqueries with
> > ORDER BY is due to wanting to keep the order by of a view, then
> > couldn't we make is_simple_subquery() a bit smarter and have it check
> > if the subquery is going to be joined to something else, which likely
> > would destroy the order, or at least it would remove any guarantees of
> > it.
> 
> I'm not on board with this.  The assumption is that if the user put an
> ORDER BY there, that means they want that subquery to be computed in that
> order.  It's not for us to decide they didn't mean what they said.
> 
> Moreover, there are cases where the ORDER BY would be semantically
> significant, eg if there's a LIMIT or volatile functions or tSRFs
> involved.

Or where the order is meaningful to an aggregate function applied to
columns of a view result set.  I'm not sure what the full set of cases
where the ORDER BY on the inner query is meaningful looks like, but I'm sure
there are cases where it is not.

If there are no such constraints on dropping the ORDER BY, then it
could be dropped, making the view query simpler.

Nico
-- 




Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Christophe Pettus

> On Oct 10, 2017, at 08:05, Tom Lane  wrote:
> 
> You're right, I was testing on HEAD, so that patch might've obscured
> the problem.  But the code looks like it could still be O(N^2) in
> some cases.  Will look again later.

I was able to reproduce this on 9.5.9 with the following:

DO $$
DECLARE
   i int := 1;
BEGIN
   FOR i IN 1..12000 LOOP
  BEGIN
  PERFORM f();
  i := i / 0;
  EXCEPTION
 WHEN division_by_zero THEN
  END;
   END LOOP;
END;
$$ language plpgsql;

where f() is:

CREATE OR REPLACE FUNCTION f() RETURNS VOID AS $$
BEGIN
  CREATE TEMPORARY TABLE test_table ON COMMIT DROP AS SELECT i FROM 
generate_series(1, 100) i;
END;
$$ language plpgsql;

A couple of observations:

-- In this version, I couldn't do a select * from pg_locks() on the secondary 
without getting an out-of-shared-memory error.
-- If I increased max_locks_per_transaction to 15000, the problem didn't occur, 
even if I bumped up the number of iterations in the first to 2.

--
-- Christophe Pettus
   x...@thebuild.com





Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Hmm, I tried to reproduce this and could not.  I experimented with
>> various permutations of this:

> This problem is probably related to commit 9b013dc238c, which AFAICS is
> only in pg10, not 9.5.

You're right, I was testing on HEAD, so that patch might've obscured
the problem.  But the code looks like it could still be O(N^2) in
some cases.  Will look again later.

regards, tom lane




Re: [GENERAL] [asking about how to upgrade docker postgresql without losing the data]

2017-10-10 Thread David G. Johnston
On Tue, Oct 10, 2017 at 4:25 AM, Olivani Prisila 
wrote:

> Hi,
>
> I am beginner both of docker and postgresql.
>
> How do i upgrade docker postgresql 9.5 into 9.6 without losing my
> current database?
> fyi: im using ubuntu verison 14 and docker 17.09
>

More of a Docker forum question than PostgreSQL.  It depends on whether
you set up a distinct data container and/or mounted a host location into the
container to store the data.  In that case you should be able to link that
external dependency into the new runtime container in the same manner as it
is linked into the existing one.

If the data is directly within the runtime container it will be
considerably more difficult - though pg_dump/pg_restore might prove easiest
to accomplish.

David J.


Re: [GENERAL] Strange checkpoint behavior - checkpoints take alongtime

2017-10-10 Thread Vladimir Nicolici
In fact it was a single delete statement.

From: Vladimir Nicolici
Sent: Tuesday, October 10, 2017 17:30
To: Achilleas Mantzios; pgsql-general@postgresql.org
Subject: RE: [GENERAL] Strange checkpoint behavior - checkpoints take alongtime

No, it didn’t. The delete was done in a single transaction.


Re: [GENERAL] startup process stuck in recovery

2017-10-10 Thread Alvaro Herrera
Tom Lane wrote:
> Christophe Pettus  writes:
> > The problem indeed appear to be a very large number of subtransactions, 
> > each one creating a temp table, inside a single transaction.  It's made 
> > worse by one of those transactions finally getting replayed on the 
> > secondary, only to have another one come in right behind it...
> 
> Hmm, I tried to reproduce this and could not.  I experimented with
> various permutations of this:

This problem is probably related to commit 9b013dc238c, which AFAICS is
only in pg10, not 9.5.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] Strange checkpoint behavior - checkpoints take a longtime

2017-10-10 Thread Vladimir Nicolici
No, it didn’t. The delete was done in a single transaction.

From: Achilleas Mantzios
Sent: Tuesday, October 10, 2017 17:18
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Strange checkpoint behavior - checkpoints take a longtime

Hello Vladimir,

maybe your update triggered auto_vacuum on those tables ? Default 
autovacuum_freeze_max_age is exactly set at 200,000,000 . Did you check your 
vacuum stats afterwards (pg_stat_*_tables) ?
Can you show the code which performed the deletes?

On 10/10/2017 16:56, Vladimir Nicolici wrote:
I experimented some more with the settings this weekend, while doing some large 
write operations (deleting 200 million records from a table), and I realized 
that the database is capable of generating much more WAL than I estimated.
 
And it seems that spikes in write activity, when longer than a few minutes, can 
cause the checkpoint process to “panic” and start a checkpoint earlier, and 
trying to complete it as soon as possible, estimating, correctly, that if that 
level of activity continues it will hit the max_wal_size limit.
 
Based on that, I reduced the checkpoint_timeout from 30 minutes to 20 minutes, 
while keeping max_wal_size at 144GB . Alternatively I could have increased the 
maximum WAL size more, but I’m not sure it’s a good idea to set it higher than 
the shared buffers, which are also set at 144GB. After this change, on Monday 
all checkpoints were triggered by “time”, I didn’t have any more checkpoints 
triggered by “xlog”.
 
I also set checkpoint_completion_target to 0.5 to see if our hardware can 
handle concentrating the write activity for 20 minutes in just 10 minutes, and 
that worked very well too, checkpoints finished on time. The %util (busy%) for 
the SSDs as reported by sar was around 20% when not doing a checkpoint, and 60% 
during the checkpoint, so it seems the hardware will be able to handle future 
increases in activity just fine.
 
The lesson I learned here is that max_wal_size needs to be configured based on 
the *maximum* volume of wal the database can generate in the checkpoint_timeout 
interval. Initially I had it set based on the *average* volume of wal generated 
in that interval, setting it to 3 times that average, but that was not enough, 
triggering the unexpected behavior.
 
Thanks,
Vlad

-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-10 Thread Achilleas Mantzios

Hello Vladimir,

maybe your update triggered auto_vacuum on those tables ? Default 
autovacuum_freeze_max_age is exactly set at 200,000,000 . Did you check your 
vacuum stats afterwards (pg_stat_*_tables) ?
Can you show the code which performed the deletes?

On 10/10/2017 16:56, Vladimir Nicolici wrote:


I experimented some more with the settings this weekend, while doing some large write operations (deleting 200 million records from a table), and I realized that the database is capable of 
generating much more WAL than I estimated.


And it seems that spikes in write activity, when longer than a few minutes, can cause the checkpoint process to “panic” and start a checkpoint earlier, and trying to complete it as soon as possible, 
estimating, correctly, that if that level of activity continues it will hit the max_wal_size limit.


Based on that, I reduced the checkpoint_timeout from 30 minutes to 20 minutes, while keeping max_wal_size at 144GB . Alternatively I could have increased the maximum WAL size more, but I’m not sure 
it’s a good idea to set it higher than the shared buffers, which are also set at 144GB. After this change, on Monday all checkpoints were triggered by “time”, I didn’t have any more checkpoints 
triggered by “xlog”.


I also set checkpoint_completion_target to 0.5 to see if our hardware can handle concentrating the write activity for 20 minutes in just 10 minutes, and that worked very well too, checkpoints 
finished on time. The %util (busy%) for the SSDs as reported by sar was around 20% when not doing a checkpoint, and 60% during the checkpoint, so it seems the hardware will be able to handle future 
increases in activity just fine.


The lesson I learned here is that max_wal_size needs to be configured based on the **maximum** volume of wal the database can generate in the checkpoint_timeout interval. Initially I had it set 
based on the **average** volume of wal generated in that interval, setting it to 3 times that average, but that was not enough, triggering the unexpected behavior.


Thanks,

Vlad



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-10 Thread Vladimir Nicolici
I experimented some more with the settings this weekend, while doing some large 
write operations (deleting 200 million records from a table), and I realized 
that the database is capable of generating much more WAL than I estimated.

And it seems that spikes in write activity, when longer than a few minutes, can 
cause the checkpoint process to “panic” and start a checkpoint earlier, and 
trying to complete it as soon as possible, estimating, correctly, that if that 
level of activity continues it will hit the max_wal_size limit.

Based on that, I reduced the checkpoint_timeout from 30 minutes to 20 minutes, 
while keeping max_wal_size at 144GB . Alternatively I could have increased the 
maximum WAL size more, but I’m not sure it’s a good idea to set it higher than 
the shared buffers, which are also set at 144GB. After this change, on Monday 
all checkpoints were triggered by “time”, I didn’t have any more checkpoints 
triggered by “xlog”.

I also set checkpoint_completion_target to 0.5 to see if our hardware can 
handle concentrating the write activity for 20 minutes in just 10 minutes, and 
that worked very well too, checkpoints finished on time. The %util (busy%) for 
the SSDs as reported by sar was around 20% when not doing a checkpoint, and 60% 
during the checkpoint, so it seems the hardware will be able to handle future 
increases in activity just fine.

The lesson I learned here is that max_wal_size needs to be configured based on 
the *maximum* volume of wal the database can generate in the checkpoint_timeout 
interval. Initially I had it set based on the *average* volume of wal generated 
in that interval, setting it to 3 times that average, but that was not enough, 
triggering the unexpected behavior.
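
(As a rough sketch of how that maximum can be measured on a pre-10 server,
where the xlog-named functions still apply: sample the current WAL position
twice, checkpoint_timeout apart, and diff the two values, e.g.

SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS wal_bytes_so_far;

the difference between two samples is the WAL volume generated in that
interval.)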

Thanks,
Vlad


[GENERAL] How to sort in pgAdmin 4

2017-10-10 Thread Thomasine Dade (tdade)
I am using pgAdmin 4 v2 on Mac OS X 10.12.6.

When I filter my data so that I can edit it, it sets a default sort. The text
of the query is displayed but it's grayed out, so I cannot edit the ORDER BY
clause. How can I change the ORDER BY?

In pgAdmin 3, I had a “Data Sorting” option but do not see that option in 
pgAdmin 4.

These are the steps I followed in pgAdmin 4:


  *   Right-click on table name
  *   Choose View/Edit Data > Filtered Rows
  *   Enter the filter value that will be used to create the WHERE clause
  *   Click OK
  *   The query displays at the top and it’s grayed out and it includes an 
ORDER BY statement

Thanks.


[GENERAL] Trigger function problem

2017-10-10 Thread Liglio Cavalcante
Hi,

I am trying to migrate data from a master table to its partitions. I am
using an update trigger to delete from and insert into the master table, so that
an insert trigger on the master table redirects the inserted rows to
the respective partition table. The problem is that the update trigger is
not deleting the rows from the master table. I am using PostgreSQL 9.5.

CREATE OR REPLACE FUNCTION tb_master_update_trigger()
RETURNS TRIGGER AS $$
BEGIN
DELETE FROM tb_master WHERE OLD.id_master = id_master ;
INSERT INTO tb_master VALUES (NEW.*); 
RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE TRIGGER update_tb_master_trigger 
BEFORE UPDATE ON tb_master
FOR EACH ROW EXECUTE PROCEDURE tb_master_update_trigger();

UPDATE tb_master SET id_master = id_master ;







Re: [GENERAL] Permissions for Web App

2017-10-10 Thread Stephen Frost
Greetings,

* Igal @ Lucee.org (i...@lucee.org) wrote:
> It worked, thanks!

Be sure to check that you're really getting what you want here.

> For future reference and for the benefit of others, the command that
> I ran is:
> 
>   ALTER DEFAULT PRIVILEGES IN SCHEMA public
>     GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO webapp;

Note that DEFAULT PRIVILEGES apply to a specific role (in the above
case, whatever CURRENT_USER is) and therefore will only be helpful if
you have just one user creating tables.
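
If tables are created by some other role, a hedged variant of the same command
(the owner role name here is hypothetical) would be:

  ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO webapp;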

There's been a lot of questions and discussions about having an actual
'read only' role that can be granted out.  Now that we've actually got
default roles in PG, this is something that becomes at least somewhat
more practical and might be able to happen for PG11 if there's interest
and effort put into it.

Thanks!

Stephen




Re: [GENERAL] pg_start/stop_backup naming conventions

2017-10-10 Thread Stephen Frost
Greetings,

* mj0nes (matthew.jo...@ramtech.co.uk) wrote:
> I'm just starting out on a rolling backup strategy and the naming convention
> has thrown me slightly for the WAL and "backup_label" files.
> 
> What I want to do is pair up the backup label files with the associated tar
> ball of the data directory. When I first ran pg_start_backup('label'), I
> naively thought a file would be created named label. Maybe not that naive
> considering the documentation:
> 
> /pg_start_backup creates a backup label file, *called backup_label*, in the
> cluster directory with information about your backup, including the start
> time and label string./

That's an interesting point, but not particularly relevant any more as
we've deprecated that approach to doing backups now because of issues
with the backup_label file and what happens if the system is restarted
during a backup.

> Thanks for any pointers.

Frankly, developing your own backup software for PG isn't trivial and
isn't something I'd recommend.  There's a number of good tools out there
which have been developed over years to work well to specifically
address PG backups and to do things correctly (such as making sure WAL
archiving is working and that WAL files are sync'd to disk before
telling PG that it's been copied).

Thanks!

Stephen




Re: [GENERAL] Using cp to back up a database?

2017-10-10 Thread Stephen Frost
Ron,

* Ron Johnson (ron.l.john...@cox.net) wrote:
> Maybe my original question wasn't clear, so I'll try again: is it
> safe to do a physical using cp (as opposed to rsync)?

Frankly, I'd say no.  There's nothing to guarantee that the backup is
actually sync'd out to disk.  Further, you're probably in the same boat
when it comes to WAL archiving (just using 'cp' there isn't good for the
same reason), though that's arguably worse because once you return true
from archive_command, that WAL file will be gone and you could end up
with a hole in your WAL stream.

Next, this backup method has been deprecated because of issues with the
backup_label file and what happens when the system crashes during a
backup.

So, no, you shouldn't be just using 'cp', or 'rsync', or any of those
similar, simple, tools for doing a proper PG backup.  Use a tool which
has been developed specifically for PG such as pgBackRest, barman,
WAL-E, WAL-G, etc.

Thanks!

Stephen




[GENERAL] [asking about how to upgrade docker postgresql without losing the data]

2017-10-10 Thread Olivani Prisila
Hi,

I am beginner both of docker and postgresql.

How do I upgrade Docker PostgreSQL 9.5 to 9.6 without losing my
current database?
FYI: I'm using Ubuntu version 14 and Docker 17.09

thanks.




Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Kim Rose Carlsen
> If the only reason that is_simple_subquery() rejects subqueries with
> ORDER BY is due to wanting to keep the order by of a view, then
> couldn't we make is_simple_subquery() a bit smarter and have it check
> if the subquery is going to be joined to something else, which likely
> would destroy the order, or at least it would remove any guarantees of
> it.
>
> Something like the attached?

I don't know if it makes any difference that the ORDER BY is used in a
DISTINCT ON clause. In this case the ORDER BY is important.


- Kim


Re: [GENERAL] Equivalence Classes when using IN

2017-10-10 Thread Kim Rose Carlsen
> You would benefit from adding the age column to view_customer, or at
> least consider having some view which contains all the columns you'll
> ever need from those tables and if you need special views with only a
> subset of columns due to some software doing "select * from
> viewname;", then you could just create some. Joining to the same table
> again seems like a bit of a waste of effort for the planner and
> executor.

I would argue that the anti-pattern would be the software that
insists on using "select * from viewname;" from a view that has
calculated columns that you do not care for. I recommend
introducing both lightweight views and heavyweight views, so you
can join up properly for what you need.

My example is fabricated trying to simplify things, but I seem to
create more confusion than clarity in my example. My point was
only to see if anything could be added to the fabricated
execution path. I agree that the listed example does not make
sense. So I will try and give some more context to real use
cases.

Imagine an invoice entity where you have one relation for invoice
base data and a relation for invoice_line. The invoice has some
invoice_id, customer_id, due_date, paid_date and invoice_line
contains each line with an invoice_id, display_name and amount. A
view (view_invoice_with_amount) calculates the total.
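
For context, a hypothetical sketch of that view could be:

CREATE VIEW view_invoice_with_amount AS
SELECT i.invoice_id,
       i.customer_id,
       i.due_date,
       i.paid_date,
       sum(l.amount) AS invoice_amount_total
  FROM invoice i
  JOIN invoice_line l ON l.invoice_id = i.invoice_id
 GROUP BY i.invoice_id, i.customer_id, i.due_date, i.paid_date;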

so a query could be
SELECT c.customer_id,
   i.invoice_amount_total
  FROM view_customer c
  JOIN view_invoice_with_amount i
ON c.customer_id = i.customer_id
 WHERE c.first_name = 'John';

If you ever need to filter by invoice_amount_total, it might be
necessary to denormalize the relations and cache the amount in the
invoice table.

> I'd assume customer_id is the PRIMARY KEY of customer and
> is unique.

This is a continuation of the previous example, maybe I should have
included it all to make it more clear. But customer_id is declared
as a primary key.

> It's not all that clear what your view is doing here. Confusingly
> there's a Sort in the plan, yet nothing in the query asked for that,
> so I guess that the view must have an ORDER BY. If you get rid of that
> the planner would likely use an index on product (customer_id) to
> parameterise the nested loop, at least, it likely would, if you have
> one.

The view is defined in the original post. What I was trying to illustrate
was a DISTINCT ON clause to prioritize multiple products per customer
into a somewhat "main" product for the customer. The ORDER BY on product_id
would in this case then map the first product a customer gets to its
"main" product. It could also be the most valuable product or the newest
ordered active product, etc. It is just some way of mapping a one-to-many
relation to a one-to-one. Again, the example is simplified and fabricated and
maybe loses its power to explain its intent.
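
A hypothetical sketch of such a view, just to make the DISTINCT ON usage
concrete:

CREATE VIEW view_customer_main_product AS
SELECT DISTINCT ON (p.customer_id)
       p.customer_id,
       p.product_id AS main_product_id
  FROM product p
 ORDER BY p.customer_id, p.product_id;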

> It's pretty bad practice to have ORDER BY in views. I kinda wish we
> didn't even allow it, but that ship sailed many years ago...

It is required by DISTINCT ON, and as soon as you go into
reporting and data warehousing it gets difficult to avoid these,
along with GROUP BY. Instead of writing each query from the
ground up, you get a huge benefit from factoring each query into
meaningful entities that can stand alone and make sense by
themselves, and from these build up the query to answer your
questions. That way you gain lots of code re-use and the
definitions don't change between queries. The downside is that it
leaves a lot of work to the planner. It's a trade-off between
optimization, readability and simplicity.
I hope I make more sense now.

- Kim