Re: Deadlock with 2 processes waiting on transactionids and locking unrelated tables

2018-02-18 Thread Rene Romero Benavides
Hi. Do either of the two tables have triggers? What's the database /
transaction isolation level? Do the updates run alongside other read /
write operations within the same transaction?
Regards.
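
For reference, both can be checked with queries along these lines (a sketch;
the table names are taken from the report below):

  -- isolation level for the current session / the server default
  SHOW transaction_isolation;
  SHOW default_transaction_isolation;

  -- user-level (non-internal) triggers on the two tables
  SELECT tgrelid::regclass AS table_name, tgname
  FROM pg_trigger
  WHERE tgrelid IN ('ticket'::regclass, 'planscheduleitem'::regclass)
    AND NOT tgisinternal;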

2018-02-18 23:28 GMT-06:00 David Wheeler :

> Hi,
>
> We’re seeing deadlocks semi-regularly (0-2 per day) that I’m really having
> trouble getting to the bottom of.
>
> Process 7172 waits for ShareLock on transaction 4078724272; blocked by process 7186.
> Process 7186 waits for ShareLock on transaction 4078724210; blocked by process 7172.
>
> The two queries in question are updates on unrelated tables. Running the
> queries on their own shows no overlapping entries in pg_locks.
>
> Process 7172: update ticket set unread = true where ticketid = $1
> Process 7186: update "planscheduleitem" set "planschedule"=$1 where
> "psi"=$2
>
> How can I work out why Postgres has decided that the two processes are in
> deadlock? Is there an explainer somewhere on transaction level locks? I
> can’t see anything in the docs besides that they exist.
>
> [...]

Migrate2Postgres - A new tool for migration from other DBMSs

2018-02-18 Thread Igal Sapir
Hi everybody,

I published a tool that makes it easy to migrate a database from other
DBMSs to Postgres:
https://github.com/isapir/Migrate2Postgres

Currently it supports migrations from MS SQL Server, but it is written in a
way that will make it easy to migrate from other DBMSs as well.

I also published a video (didn't spend too much time editing it, sorry)
with an overview and a demo of migrating the AdventureWorks database, which
is one of the MS SQL Server sample databases.  The migration itself takes
about a minute.  The overview of the tool and the explanations take longer:
https://youtu.be/5eF9_UB73TI

For the impatient, this is the point in the video where I start using the
tool (so you can watch that first and then later watch the overview and the
configuration options from the beginning):
https://youtu.be/5eF9_UB73TI?t=9m51s

Feedback welcome!  Thank you,


Igal


Deadlock with 2 processes waiting on transactionids and locking unrelated tables

2018-02-18 Thread David Wheeler
Hi,

We’re seeing deadlocks semi-regularly (0-2 per day) that I’m really having 
trouble getting to the bottom of. 

Process 7172 waits for ShareLock on transaction 4078724272; blocked by process 7186.
Process 7186 waits for ShareLock on transaction 4078724210; blocked by process 7172.

The two queries in question are updates on unrelated tables. Running the 
queries on their own shows no overlapping entries in pg_locks. 

Process 7172: update ticket set unread = true where ticketid = $1
Process 7186: update "planscheduleitem" set "planschedule"=$1 where "psi"=$2

How can I work out why Postgres has decided that the two processes are in 
deadlock? Is there an explainer somewhere on transaction level locks? I can’t 
see anything in the docs besides that they exist. 
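
(Not from the original thread, but as a sketch of one way to dig further:
pg_blocking_pids() shows, while the statements are stuck waiting, exactly
which PIDs hold the locks, and log_lock_waits makes the server log the
conflicting statements whenever a wait exceeds deadlock_timeout.)

  -- sessions that are blocked, and the PIDs blocking them (9.6+)
  SELECT pid, pg_blocking_pids(pid) AS blocked_by,
         wait_event_type, state, query
  FROM pg_stat_activity
  WHERE cardinality(pg_blocking_pids(pid)) > 0;

  -- in postgresql.conf, to make the server report lock waits itself:
  --   log_lock_waits = on

A ShareLock wait on a transactionid usually means the backend tried to
update or lock a row that the other, still-open transaction has already
modified (foreign-key checks can take such row locks too), so the overlap
may be at the row level even though the two UPDATEs name unrelated tables.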



Details below

select version();
                                                              version
------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.2 (Ubuntu 10.2-1.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit
(1 row)


after running update "planscheduleitem" set "planschedule"=$1 where "psi"=$2

SELECT locktype, virtualxid, transactionid, virtualtransaction, pid, mode, 
relname, page, tuple
FROM pg_locks LEFT JOIN pg_class ON (relation=oid) where pid =41715;
    locktype   | virtualxid | transactionid | virtualtransaction |  pid  |       mode       |         relname         | page | tuple
---------------+------------+---------------+--------------------+-------+------------------+-------------------------+------+-------
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_parentticketid   |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_originalticketid |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_tickettypeid_idx |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_subject_idx      |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_closedtime_idx   |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_assignedto_idx   |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_serviceuid_idx   |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_parentuid_idx    |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_createdtime_idx  |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_txid             |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_tickettype       |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_ticketpriority   |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_idx_0            |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_pkey             |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | number_constraint       |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket                  |      |
 virtualxid    | 56/2306863 |               | 56/2306863         | 41715 | ExclusiveLock    |                         |      |
 transactionid |            |    4089785154 | 56/2306863         | 41715 | ExclusiveLock    |                         |      |
 relation      |            |               | 56/2306863         | 41715 | RowExclusiveLock | ticket_fromuid_idx      |      |
(19 rows)



after running update ticket set unread = true where ticketid = $1

SELECT locktype, virtualxid, transactionid, virtualtransaction, pid, mode, 
relname, page, tuple
FROM pg_locks LEFT JOIN pg_class ON (relation=oid) where pid =41715;
    locktype   | virtualxid | transactionid | virtualtransaction |  pid  |       mode       |   relname    | page | tuple
---------------+------------+---------------+--------------------+-------+------------------+--------------+------+-------
 relation      |            |               |

Re: pgBackRest backup from standby

2018-02-18 Thread Michael Paquier
On Sun, Feb 18, 2018 at 06:34:46PM -0600, Don Seiler wrote:
> Looking to use pgBackRest to take a backup from a hot standby. I'm reading
> that pgBackRest still needs to connect to the primary and copy some files.
> My questions are:
> 
> 
>1. What files does it need to copy? Config files? WAL files?
>2. How does it connect? SSH?
>3. Does pgBackRest need to be installed and configured on the primary as
>well?

I am adding in CC: Stephen Frost and David Steele, who work on the tool.
You may want to contact the maintainers directly through github where
the project is maintained:
https://github.com/pgbackrest/pgbackrest
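
As far as I understand the tool (a sketch, not an authoritative answer):
with backup-standby enabled, pgBackRest copies the bulk of the files from
the standby but still connects to the primary for the files that are not
replicated, it talks to remote hosts over SSH by default, and it needs to
be installed on both hosts. A hypothetical minimal configuration, with
made-up hostnames and paths and current (2.x) option names:

  # /etc/pgbackrest.conf on the repository host
  [global]
  repo1-path=/var/lib/pgbackrest
  backup-standby=y

  [demo]
  pg1-host=primary.example.com
  pg1-path=/var/lib/postgresql/10/main
  pg2-host=standby.example.com
  pg2-path=/var/lib/postgresql/10/main

  # then:  pgbackrest --stanza=demo backup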
--
Michael




pgBackRest backup from standby

2018-02-18 Thread Don Seiler
Evening all.

Looking to use pgBackRest to take a backup from a hot standby. I'm reading
that pgBackRest still needs to connect to the primary and copy some files.
My questions are:


   1. What files does it need to copy? Config files? WAL files?
   2. How does it connect? SSH?
   3. Does pgBackRest need to be installed and configured on the primary as
   well?

Thanks,
Don.

-- 
Don Seiler
www.seiler.us


Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Tim Cross

Rich Shepard  writes:

> On Mon, 19 Feb 2018, Tim Cross wrote:
>
>> It is possible for the target of a symbolic link to be changed, deleted
>> etc (a dangling sym link).
>
> Tim,
>
>Broken symlinks display in a different color, black on a red background if
> I remember correctly, rather than the light cyan of a working symlink. I've
> seen enough of the former to recognize the difference. :-)
>
> Thanks,
>
> Rich

Hi Rich,

yes, for dangling links, if your ls is configured to show colours and
your shell honours those colours, dangling links will show up in a
different colour. However, this will not tell you if, for example, the
mode of the file being pointed to does not have the executable bit set
for the user/group running the symlink. If the file does not have the
right executable bits set, you would get the error message about not
being able to find the executable file.

From memory, your cut and paste looked like it had a '*' at the end,
which is what ls will show (assuming the alias you have adds that
option), so the target probably has at least the owner exec bit set. But
when tracking down problems like this, I have learnt it is often best to
just verify that the modes and owner/group memberships of the target are
correct.
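
A sketch of commands that inspect the target rather than the link itself:

  # -L dereferences the symlink and reports the target's mode/owner
  ls -lL /usr/bin/postgres
  # resolve the full chain of links to the final target
  readlink -f /usr/bin/postgres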

regards,

Tim

--
Tim Cross



Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Rich Shepard

On Mon, 19 Feb 2018, Tim Cross wrote:


It is possible for the target of a symbolic link to be changed, deleted
etc (a dangling sym link).


Tim,

  Broken symlinks display in a different color, black on a red background if
I remember correctly, rather than the light cyan of a working symlink. I've
seen enough of the former to recognize the difference. :-)

Thanks,

Rich



Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Tim Cross

Rich Shepard  writes:

> On Sun, 18 Feb 2018, Tim Cross wrote:
>
>>> # ll /usr/bin/postgres
>>> lrwxrwxrwx 1 root root 35 Feb 17 09:30 /usr/bin/postgres -> 
>>> ../lib/postgresql/10.2/bin/postgres*
>
>> Try doing an 'll' on the second part of that output i.e.
>> ll /usr/lib/postgresql/10.2/bin/postgres*
>
>See my message, repeated above.
>
> Regards,
>
> Rich

Your ll command is only showing what the symbolic link is pointing to,
not the status of the thing it points to. It is possible for the target
of a symbolic link to be changed, deleted etc (a dangling sym link). My
point was to verify the target and its permissions, not just the sym
link itself.

Tim


-- 
Tim Cross



Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Tim Cross

Rich Shepard  writes:

> On Sun, 18 Feb 2018, Tim Cross wrote:
>
>> This may not be relevant,
>
> Tim,
>
>Nope. Pat goes for stability, not cutting edge. No systemd in the
> forthcoming 15.0, either.
>
> Thanks,
>
> Rich

No worries, though I'm not sure you can call systemd 'cutting edge'
anymore. Even many of the distros which argued against it have switched
(e.g. Debian, Ubuntu).

Tim


-- 
Tim Cross



Re: READ COMMITTED vs. index-only scans

2018-02-18 Thread Jacek Kołodziej
On Wed, Jan 17, 2018 at 9:34 PM, Jacek Kołodziej 
wrote:

> Hi Tom,
>
> On Wed, Jan 17, 2018 at 7:56 PM, Tom Lane  wrote:
>
>> Jacek Kołodziej writes:
>> > Here's what's happening to me: the "A" query occasionally (in my case: on
>> > the order of tens per day) returns an ID _higher_ than any ID present in
>> > the second query's result (other conditions I haven't specified do _not_
>> > filter any more rows than "id <= max ID") - as if some entries were
>> > visible for the first query, but not for the second one. This is an
>> > inconsistency that is very problematic for me.
>>
>> That sounds problematic to me too, but how certain are you that the "other
>> conditions you haven't specified" aren't suppressing the last row?  That'd
>> certainly be the least surprising explanation.  If it isn't that, though,
>> this surely seems like a bug.
>>
> Yes, I'm fairly sure of that. When I execute that same "B" query again
> some time afterwards, it returns all expected rows - I mean, also these
> that were "included" in original "A" query and that were "missing" in "B"
> one first time around.
>
>
>> Can you determine whether the row(s) missing in the second query are
>> freshly committed?  Or have they been there awhile?
>
> Depends on what would be considered "fresh"; usually it's on the order of
> milliseconds or seconds.
>
>
>> > Where am I wrong? What am I missing? What information may I provide to
>> help
>> > with investigating this?
>>
>> Probably the best thing to spend time on would be to try to extract a
>> publishable test case.  It would be really hard to get to the bottom
>> of an issue like this without having a reproducer.  It's okay if it
>> takes awhile to reproduce the fault ...
>>
> I'd certainly love to have a working repro. I won't be able to do it for
> the next few days but I'll work on this right after the weekend.
>
>
>> Also, before spending a whole lot of time on this: are you on 9.6.6?
>> If not, update, just in case this is an already-fixed issue.  The
>> symptoms don't sound familiar, but I don't want to waste a lot of
>> time only to find out it's some manifestation of a known bug.
>>
>> regards, tom lane
>>
>
> I'm using 9.6.5; I'm not administrating it so it might take some time
> before updating but once it's done, I'll get back with whether that fixed
> the situation. In the meantime, when trying to reproduce it locally, I'll
> use both 9.6.5 and 9.6.6 to see whether it makes any difference.
>
> Thank you very much for the suggestions.
>
>
> --
> Kind regards,
> Jacek Kołodziej
> http://kolodziejj.info
>


Hello again,

unsurprisingly, the fault was on my side - it was discovered by my
colleague. The source of the problem was a misunderstanding of how
sequences work - i.e., they hand out monotonically increasing numbers at
query time (in this case: IDs for the primary key), but the inserting
transactions (and, in turn, the numbers drawn from the sequence) may
commit in a different order (T1, T2, T3 - transactions; T1 and T2 append
to the events table, T3 reads from it with the "A" and "B" queries):
- T1 was inserting an event (it got ID 6) - it did not commit yet!
- T2 was inserting an event (it got ID 7) and it committed
- T3 ran query "A" (which got "max event ID" equal to 7 at that time) and
selected events with query "B" (which got the event with ID 7, but not
the one with ID 6 - because it had not been committed yet)
- T1 committed, but that event (with ID 6) had already been skipped and
will never be picked up again

REPEATABLE READ could not help us in such a situation.

We ended up - at least for now - forcing serialization of inserting the
events (so that their IDs always match the order in which they are
inserted _and committed_ into the database); while it may be suboptimal,
it seems like a good-enough solution for our use case for the time being,
and we'll be working on a more scalable solution in the future.
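
(The post doesn't show the mechanism used, but one common sketch of such
forced serialization is a transaction-scoped advisory lock around the
insert, so that IDs are assigned and committed strictly in order; the lock
key 42 is an arbitrary, hypothetical application choice.)

  BEGIN;
  SELECT pg_advisory_xact_lock(42);  -- held until COMMIT/ROLLBACK
  INSERT INTO events (payload) VALUES ('...');
  COMMIT;  -- only here does the next inserter get the lock

The obvious cost is that event inserts no longer run concurrently, which
matches the scalability concern above.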

Thank you for looking into this and inspiration for further investigation.

-- 
Kind regards,
Jacek Kołodziej
http://kolodziejj.info


Re: Need to fix one more glitch in upgrade to -10.2 [FIXED]

2018-02-18 Thread Rich Shepard

On Sun, 18 Feb 2018, Rich Shepard wrote:


 Thanks for confirming


  Removed all files in the data/ directory, re-initialized the cluster, and
restored the dumped .sql file (minus three databases and their roles, which
I deleted manually). All works well now.

  Thanks, Adrian!

Best regards,

Rich





Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Rich Shepard

On Sun, 18 Feb 2018, Adrian Klaver wrote:


Is this appropriate?



Yes.


Adrian,

  Thanks for confirming


They could not have been removed as they are in the file. I am guessing
you are saying they are not in use as far as you know. Just a warning (from
experience): memory is a tricky thing, and removing what are thought to be
inactive roles is a quick way to find out they are not.


  Well, one set of roles is related to my former bookkeeping system and that
database had not been removed. Another set of roles was related to a
replacement bookkeeping system I didn't use and I don't recall seeing that
database when I last ran 'psql -l'.

So can anyone who knows that the postgres role is generally always there. If
you want to do this, at least restrict the user field.


  As I'm the only one here, unless someone is sitting here and logging in
under my username they won't see a thing. And no one's going to sit here and
log in as me other than me. :-) One advantage of working from home.

Regards,

Rich




Re: query performance

2018-02-18 Thread Tomas Vondra

On 02/18/2018 06:37 AM, David Rowley wrote:
> On 18 February 2018 at 12:35, hmidi slim  wrote:
>> Is there an other optimized solution to make a query such this:
>> select * from (
>> select e.name, e1.name, e.id
>> from establishment as e, establishment as e1
>> where e.id <> e1.id
>> and e1.id = 1
>> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography,
>> 1000)) as tmp inner join product as p on p.establishment_id = tmp.id
> 
> You really should state what is wrong with the performance of the
> above version and also include the EXPLAIN  (ANALYZE, BUFFERS) of that
> query.
> 
> The version of PostgreSQL that you're running it on is also a good
> thing to share.
> 
> Details of the indexes which you've defined on the tables are also
> useful information. It would especially be good to know if you've
> added an index on product (establishment_id), for example.
> 

There's actually a wiki page with overview of what should be included in
"slow query" questions:

https://wiki.postgresql.org/wiki/Slow_Query_Questions
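
As a sketch of the two most useful pieces of information requested above:
the plan with runtime and buffer statistics comes from

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT ... ;  -- the full query from the original post

and, if it is missing, the index David suggested would be along the lines of

  CREATE INDEX ON product (establishment_id);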

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: shared_buffers 8GB maximum

2018-02-18 Thread Tomas Vondra

On 02/18/2018 02:41 PM, Vitaliy Garnashevich wrote:
> 
>> I certainly wouldn't recommend using 1/2 of RAM right away. There's
>> a good chance it would be a waste of memory - for example due to
>> double buffering, which effectively reduces "total" cache hit
>> ratio.
> 
> Double buffering is often mentioned in context of tuning shared
> buffers. Is there a tool to actually measure the amount of double
> buffering happening in the system?
> 

I'm not aware of such tool. But I suppose it could be done by
integrating information from pg_buffercache and pgfincore [1].

[1] https://github.com/klando/pgfincore
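
For the PostgreSQL side, pg_buffercache can at least show what currently
sits in shared_buffers; comparing that against what pgfincore reports for
the OS page cache would approximate the double-buffered overlap. A sketch,
assuming the default 8kB block size:

  CREATE EXTENSION IF NOT EXISTS pg_buffercache;

  -- top relations by space held in shared_buffers, current database
  SELECT c.relname,
         count(*) AS buffers,
         pg_size_pretty(count(*) * 8192) AS buffered
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
  WHERE b.reldatabase = (SELECT oid FROM pg_database
                         WHERE datname = current_database())
  GROUP BY c.relname
  ORDER BY buffers DESC
  LIMIT 10;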

>> Those evictions are performed either by backends or bgwriter, both
>> of which are less efficient than checkpointer. Not only can
>> checkpointer perform various optimizations (e.g. sorting buffers to
>> make the writes more sequential), but it also writes each dirty
>> buffer just once. With smaller shared_buffers the page may have been
>> written multiple times.
> 
> In the case when shared_buffers cover most of RAM, most of writes
> should happen by checkpointer, and cache hit ratio should be high. So
> a hypothetical question: Could shared_buffers=200GB on a 250 GB RAM
> server ever be a reasonable setting? (assuming there are no other
> applications running except postgres, and 50GB is enough for
> allocating work_mem/maintenance_work_mem and for serving queries)
> 

It depends on how large the active part of the data set is. If it
fits into 200GB but not into smaller shared buffers (say, 100GB), then
using 200GB may be a win.

If the active set is much larger than RAM, smaller shared_buffers values
work better in my experience.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: shared_buffers 8GB maximum

2018-02-18 Thread Pavel Stehule
2018-02-18 14:41 GMT+01:00 Vitaliy Garnashevich :

>
> I certainly wouldn't recommend using 1/2 of RAM right away. There's a
>> good chance it would be a waste of memory - for example due to double
>> buffering, which effectively reduces "total" cache hit ratio.
>>
>
> Double buffering is often mentioned in context of tuning shared buffers.
> Is there a tool to actually measure the amount of double buffering
> happening in the system?
>
> Those evictions are performed either by backends or bgwriter, both of
>> which are less efficient than checkpointer. Not only can checkpointer
>> perform various optimizations (e.g. sorting buffers to make the writes
>> more sequential), but it also writes each dirty buffer just once. With
>> smaller shared_buffers the page may have been written multiple times.
>>
>
> In the case when shared_buffers cover most of RAM, most of writes should
> happen by checkpointer, and cache hit ratio should be high. So a
> hypothetical question: Could shared_buffers=200GB on a 250 GB RAM server
> ever be a reasonable setting? (assuming there are no other applications
> running except postgres, and 50GB is enough for allocating
> work_mem/maintenance_work_mem and for serving queries)
>
> The best thing you can do is set shared buffers to some conservative
>> value (say, 4-8GB), let the system run for a day or two, compute the
>> cache hit ratio using metrics in pg_stat_database, and then decide if
>> you need to resize shared buffers.
>>
>> Gradual increases are a good approach in general. And yes, having
>>
>>  buffers_checkpoint > buffers_clean > buffers_backend
>>
>> is a good idea too. Together with the cache hit ratio it's probably a
>> more sensible metric than looking at usagecount directly.
>>
>
> Thanks! While increasing shared_buffers we'll be looking at changes in
> cache hit ratio too.
>

When we ran some analytic tasks, increasing shared_buffers had a negative
impact on speed. Probably the hit ratio was too low after the change, and
the maintenance of shared buffers (searching for free blocks) was slower.

So the optimal size of shared_buffers depends very much on the use case.
Note also that too-large shared_buffers leaves less room for work_mem,
which can be worse: work_mem must be multiplied by max_connections and by
some constant (2 or 3).

Regards

Pavel




>
> Regards,
> Vitaliy
>
>
>


Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Rich Shepard

On Sun, 18 Feb 2018, Tim Cross wrote:


# ll /usr/bin/postgres
lrwxrwxrwx 1 root root 35 Feb 17 09:30 /usr/bin/postgres -> 
../lib/postgresql/10.2/bin/postgres*



Try doing an 'll' on the second part of that output i.e.
ll /usr/lib/postgresql/10.2/bin/postgres*


  See my message, repeated above.

Regards,

Rich



Re: shared_buffers 8GB maximum

2018-02-18 Thread Vitaliy Garnashevich



I certainly wouldn't recommend using 1/2 of RAM right away. There's a
good chance it would be a waste of memory - for example due to double
buffering, which effectively reduces "total" cache hit ratio.


Double buffering is often mentioned in context of tuning shared buffers. 
Is there a tool to actually measure the amount of double buffering 
happening in the system?



Those evictions are performed either by backends or bgwriter, both of
which are less efficient than checkpointer. Not only can checkpointer
perform various optimizations (e.g. sorting buffers to make the writes
more sequential), but it also writes each dirty buffer just once. With
smaller shared_buffers the page may have been written multiple times.


In the case when shared_buffers cover most of RAM, most of writes should 
happen by checkpointer, and cache hit ratio should be high. So a 
hypothetical question: Could shared_buffers=200GB on a 250 GB RAM server 
ever be a reasonable setting? (assuming there are no other applications 
running except postgres, and 50GB is enough for allocating 
work_mem/maintenance_work_mem and for serving queries)



The best thing you can do is set shared buffers to some conservative
value (say, 4-8GB), let the system run for a day or two, compute the
cache hit ratio using metrics in pg_stat_database, and then decide if
you need to resize shared buffers.

Gradual increases are a good approach in general. And yes, having

 buffers_checkpoint > buffers_clean > buffers_backend

is a good idea too. Together with the cache hit ratio it's probably a
more sensible metric than looking at usagecount directly.


Thanks! While increasing shared_buffers we'll be looking at changes in 
cache hit ratio too.
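
For reference, a sketch of the two checks mentioned above, using the
standard statistics views:

  -- cache hit ratio for the current database
  SELECT datname,
         round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2)
           AS cache_hit_pct
  FROM pg_stat_database
  WHERE datname = current_database();

  -- who writes dirty buffers; ideally
  -- buffers_checkpoint > buffers_clean > buffers_backend
  SELECT buffers_checkpoint, buffers_clean, buffers_backend
  FROM pg_stat_bgwriter;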


Regards,
Vitaliy




Re: Need to fix one more glitch in upgrade to -10.2

2018-02-18 Thread Rich Shepard

On Sun, 18 Feb 2018, Tim Cross wrote:


This may not be relevant,


Tim,

  Nope. Pat goes for stability, not cutting edge. No systemd in the
forthcoming 15.0, either.

Thanks,

Rich