Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Mon, Jul 24, 2017 at 10:50 AM, Joshua D. Drake wrote:
> Does this suggest that we don't have a cleanup problem but a
> fragmentation problem (or both, at least for the index)? Having an
> index that is almost twice its cleaned-up size isn't that uncommon.

As Tom pointed out up-thread, it's important to distinguish between
inherent overhead and overhead due to garbage that needs to be cleaned
up by vacuum. It's really hard to delineate which is which here, and
I'm not going to try to put a number on it.

What I will point out is that you can see quite a significant
difference in the space utilization of a B-Tree without any dead
tuples, just from the order in which tuples are initially inserted.
You can lose about 1/3 of the space by inserting randomly rather than
in sorted order, which is more or less what REINDEX does for you.
That's because random workloads get almost entirely 50:50 page splits,
whereas sorted input always splits the rightmost page, and so always
gets 90:10 splits.

The space in the random case isn't exactly wasted; it's there for the
taking, for key values that happen to fit on the page. You effectively
require a larger average reserve of free space on pages with the
random workload, because the implementation does not and cannot reason
that it would be best to concentrate free space in the parts of the
keyspace where it is most needed.

That having been said, I do think that this workload suffers from
index bloat in a way that isn't so easily explained. It does seem to
be an issue with VACUUM controlling bloat in the index in particular.
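To see the insertion-order effect in isolation, a sketch along these
lines should reproduce it (table names here are illustrative, and the
exact sizes will vary):

  create table split_random (id int4 not null);
  create table split_sorted (id int4 not null);
  create index on split_random (id);
  create index on split_sorted (id);

  -- Random order: almost entirely 50:50 page splits.
  insert into split_random
    select (random() * 100000000)::int4 from generate_series(1, 1000000);

  -- Sorted order: always splits the rightmost page, so 90:10 splits.
  insert into split_sorted
    select g from generate_series(1, 1000000) g;

  select pg_size_pretty(pg_relation_size('split_random_id_idx')) as random_idx,
         pg_size_pretty(pg_relation_size('split_sorted_id_idx')) as sorted_idx;

-- 
Peter Geoghegan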
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/23/2017 12:03 PM, Joshua D. Drake wrote:
> As you can see, even with aggressive vacuuming, over a period of 36
> hours life gets increasingly miserable. The largest table is:
>
> postgres=# select pg_size_pretty(pg_total_relation_size('bmsql_order_line'));
>  pg_size_pretty
> ----------------
>  148 GB
> (1 row)
>
> [snip]
>
> With the PK being:
>
> postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey'));
>  pg_size_pretty
> ----------------
>  48 GB
> (1 row)
>
> I tried to see how much data we are dealing with here:

-hackers,

I cleaned up the table with VACUUM FULL and ended up with the
following:

postgres=# select pg_size_pretty(pg_total_relation_size('bmsql_order_line'));
 pg_size_pretty
----------------
 118 GB
(1 row)

postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey'));
 pg_size_pretty
----------------
 27 GB
(1 row)

Does this suggest that we don't have a cleanup problem but a
fragmentation problem (or both, at least for the index)? Having an
index that is almost twice its cleaned-up size isn't that uncommon.
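One way to separate the two effects might be to look at dead-tuple
space and free space directly (a sketch, assuming the pgstattuple
extension is available; note that these functions scan the whole
relation):

  create extension if not exists pgstattuple;

  -- Heap: dead_tuple_percent is cleanup work still owed to vacuum;
  -- free_percent is space already reclaimed and awaiting reuse.
  select dead_tuple_percent, free_percent
  from pgstattuple('bmsql_order_line');

  -- Index: a freshly built btree has avg_leaf_density around 90 (the
  -- default fillfactor); much lower suggests split-driven
  -- fragmentation rather than dead tuples.
  select avg_leaf_density, leaf_fragmentation
  from pgstatindex('bmsql_order_line_pkey');

Thanks in advance,

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *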
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Hello,

I changed the test to run for 6 hours at a time, regardless of the
number of transactions. I also changed the du command to only look at
the database (previously WAL logs were included). This is the clearest
indication of the problem I have been able to produce. Again, this is
with 128 clients and 500 warehouses.

The first test is a clean test: everything dropped, vacuumed, etc.
Each subsequent test just starts the test again, to have breakpoints.

                name                 | setting
-------------------------------------+---------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 2
 autovacuum_max_workers              | 12
 autovacuum_multixact_freeze_max_age | 4
 autovacuum_naptime                  | 10
 autovacuum_vacuum_cost_delay        | 0
 autovacuum_vacuum_cost_limit        | 5000
 autovacuum_vacuum_scale_factor      | 0.1
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | -1
 max_wal_size                        | 640
 checkpoint_timeout                  | 86400
 checkpoint_completion_target        | 0.5

Starting base metric:

 50G  /srv/main/base

Test 1:  90G /srv/main/base  TPS: 838
Test 2: 121G /srv/main/base  TPS: 725
Test 3: 146G /srv/main/base  TPS: 642
Test 4: 171G /srv/main/base  TPS: 549
Test 5: 189G /srv/main/base  TPS: 489
Test 6: 208G /srv/main/base  TPS: 454

As you can see, even with aggressive vacuuming, over a period of 36
hours life gets increasingly miserable. The largest table is:

postgres=# select pg_size_pretty(pg_total_relation_size('bmsql_order_line'));
 pg_size_pretty
----------------
 148 GB
(1 row)

postgres=# \d bmsql_order_line
            Table "public.bmsql_order_line"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 ol_w_id        | integer                     | not null
 ol_d_id        | integer                     | not null
 ol_o_id        | integer                     | not null
 ol_number      | integer                     | not null
 ol_i_id        | integer                     | not null
 ol_delivery_d  | timestamp without time zone |
 ol_amount      | numeric(6,2)                |
 ol_supply_w_id | integer                     |
 ol_quantity    | integer                     |
 ol_dist_info   | character(24)               |
Indexes:
    "bmsql_order_line_pkey" PRIMARY KEY, btree (ol_w_id, ol_d_id, ol_o_id, ol_number)
Foreign-key constraints:
    "ol_order_fkey" FOREIGN KEY (ol_w_id, ol_d_id, ol_o_id) REFERENCES bmsql_oorder(o_w_id, o_d_id, o_id)
    "ol_stock_fkey" FOREIGN KEY (ol_supply_w_id, ol_i_id) REFERENCES bmsql_stock(s_w_id, s_i_id)

With the PK being:

postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey'));
 pg_size_pretty
----------------
 48 GB
(1 row)

I tried to see how much data we are dealing with here:

postgres=# select count(*) from bmsql_order_line;
   count
-----------
 910324839
(1 row)

Time: 503965.767 ms

And just to show that we were pushing to get these numbers:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.38    0.00    2.20    1.98    0.00   93.44

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb            2027.40       239.99         0.05       1199          0
sda               0.80         0.00         0.01          0          0

So we have 910M rows, and it took about 8.4 minutes to count them at
240MB/s.

I know this is a lot of data and, as I said previously, I am happy to
let anyone look at it. However, we clearly have something deeper to
look into.
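For anyone reproducing this: the same growth numbers can be pulled
from inside the database instead of with du. A sketch:

  -- Size of the current database, roughly what du reports for base/<oid>:
  select pg_size_pretty(pg_database_size(current_database()));

  -- The relations the growth is landing in:
  select relname, pg_size_pretty(pg_total_relation_size(oid)) as total
  from pg_class
  where relkind in ('r', 'i')
  order by pg_total_relation_size(oid) desc
  limit 10;

Thanks in advance,

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *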
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/20/2017 08:58 PM, Joshua D. Drake wrote:
> On 07/19/2017 07:57 PM, Tom Lane wrote:
>> Peter Geoghegan writes:
>
> Test 1: 55G  /srv/main  TPS: 955
> Test 2: 112G /srv/main  TPS: 531 (Not sure what happened here, long
> checkpoint?)
> Test 3: 109G /srv/main  TPS: 868
> Test 4: 143G            TPS: 840
> Test 5: 154G            TPS: 722
>
> I am running the query here:
> https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index

This query didn't finish after 12 hours. Here is the new set:

                name                 | setting
-------------------------------------+---------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 2
 autovacuum_max_workers              | 12
 autovacuum_multixact_freeze_max_age | 4
 autovacuum_naptime                  | 10
 autovacuum_vacuum_cost_delay        | 0
 autovacuum_vacuum_cost_limit        | 5000
 autovacuum_vacuum_scale_factor      | 0.1
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | -1

I have only run one test so far, but it is pretty telling:

Test 1: 60G /srv/main  TPS: 914
Test 2: 92G /srv/main  TPS: still running

I will post an update after the third or fourth test, depending on the
numbers. I created this instance exactly for these tests, so if
someone wants to poke around I can give access.

Thanks,

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/20/2017 11:54 PM, Sokolov Yura wrote:
> On 2017-07-21 06:58, Joshua D. Drake wrote:
>> On 07/19/2017 07:57 PM, Tom Lane wrote:
>>> Peter Geoghegan writes:
>
> Have you measured with an increased vacuum ring buffer?

No, not yet. I think we are still in the proving-the-problem stage.

> This will require recompilation, though.

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 2017-07-21 06:58, Joshua D. Drake wrote:
> On 07/19/2017 07:57 PM, Tom Lane wrote:
>> Peter Geoghegan writes:
>>> My argument for the importance of index bloat to the more general
>>> bloat problem is simple: any bloat that accumulates, that cannot be
>>> cleaned up, will probably accumulate until it impacts performance
>>> quite noticeably.
>>
>> But that just begs the question: *does* it accumulate indefinitely,
>> or does it eventually reach a more-or-less steady state? The
>> traditional wisdom about btrees, for instance, is that no matter how
>> full you pack them to start with, the steady state is going to
>> involve something like 1/3rd free space. You can call that bloat if
>> you want, but it's not likely that you'll be able to reduce the
>> number significantly without paying exorbitant costs. I'm not
>> claiming that we don't have any problems, but I do think it's
>> important to draw a distinction between bloat and normal operating
>> overhead.
>
> Agreed, but we aren't talking about 30% I don't think. Here is where
> I am at. It took until 30 minutes ago for the tests to finish:
>
>                 name                 | setting
> -------------------------------------+---------
>  autovacuum                          | on
>  autovacuum_analyze_scale_factor     | 0.1
>  autovacuum_analyze_threshold        | 50
>  autovacuum_freeze_max_age           | 2
>  autovacuum_max_workers              | 3
>  autovacuum_multixact_freeze_max_age | 4
>  autovacuum_naptime                  | 60
>  autovacuum_vacuum_cost_delay        | 20
>  autovacuum_vacuum_cost_limit        | -1
>  autovacuum_vacuum_scale_factor      | 0.2
>  autovacuum_vacuum_threshold         | 50
>  autovacuum_work_mem                 | -1
>  log_autovacuum_min_duration         | -1
>
> Test 1: 55G  /srv/main  TPS: 955
> Test 2: 112G /srv/main  TPS: 531 (Not sure what happened here, long
> checkpoint?)
> Test 3: 109G /srv/main  TPS: 868
> Test 4: 143G            TPS: 840
> Test 5: 154G            TPS: 722
>
> I am running the query here:
> https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index
>
> and will post a followup. Once the query finishes I am going to
> launch the tests with autovacuum_vacuum_cost_limit of 5000. Is there
> anything else you folks would like me to change?

Have you measured with an increased vacuum ring buffer? This will
require recompilation, though.

With regards,
-- 
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 21/07/17 15:58, Joshua D. Drake wrote:
> On 07/19/2017 07:57 PM, Tom Lane wrote:
>> Peter Geoghegan writes:
>>> My argument for the importance of index bloat to the more general
>>> bloat problem is simple: any bloat that accumulates, that cannot be
>>> cleaned up, will probably accumulate until it impacts performance
>>> quite noticeably.
>>
>> But that just begs the question: *does* it accumulate indefinitely,
>> or does it eventually reach a more-or-less steady state? The
>> traditional wisdom about btrees, for instance, is that no matter how
>> full you pack them to start with, the steady state is going to
>> involve something like 1/3rd free space. You can call that bloat if
>> you want, but it's not likely that you'll be able to reduce the
>> number significantly without paying exorbitant costs. I'm not
>> claiming that we don't have any problems, but I do think it's
>> important to draw a distinction between bloat and normal operating
>> overhead.
>
> Agreed, but we aren't talking about 30% I don't think. Here is where
> I am at. It took until 30 minutes ago for the tests to finish:
>
>                 name                 | setting
> -------------------------------------+---------
>  autovacuum                          | on
>  autovacuum_analyze_scale_factor     | 0.1
>  autovacuum_analyze_threshold        | 50
>  autovacuum_freeze_max_age           | 2
>  autovacuum_max_workers              | 3
>  autovacuum_multixact_freeze_max_age | 4
>  autovacuum_naptime                  | 60
>  autovacuum_vacuum_cost_delay        | 20
>  autovacuum_vacuum_cost_limit        | -1
>  autovacuum_vacuum_scale_factor      | 0.2
>  autovacuum_vacuum_threshold         | 50
>  autovacuum_work_mem                 | -1
>  log_autovacuum_min_duration         | -1
>
> Test 1: 55G  /srv/main  TPS: 955
> Test 2: 112G /srv/main  TPS: 531 (Not sure what happened here, long
> checkpoint?)
> Test 3: 109G /srv/main  TPS: 868
> Test 4: 143G            TPS: 840
> Test 5: 154G            TPS: 722
>
> I am running the query here:
> https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index
>
> and will post a followup. Once the query finishes I am going to
> launch the tests with autovacuum_vacuum_cost_limit of 5000. Is there
> anything else you folks would like me to change?

I usually advise setting autovacuum_naptime = 10s (or even 5s) for
workloads that do a lot of updates (or inserts + deletes) - as on
modern HW a lot of churn can happen in 1 minute, and that just makes
vacuum's job harder.
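One way to apply that (a sketch; autovacuum_naptime only needs a
reload, not a restart):

  alter system set autovacuum_naptime = '10s';
  select pg_reload_conf();

regards

Mark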
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/19/2017 07:57 PM, Tom Lane wrote:
> Peter Geoghegan writes:
>> My argument for the importance of index bloat to the more general
>> bloat problem is simple: any bloat that accumulates, that cannot be
>> cleaned up, will probably accumulate until it impacts performance
>> quite noticeably.
>
> But that just begs the question: *does* it accumulate indefinitely,
> or does it eventually reach a more-or-less steady state? The
> traditional wisdom about btrees, for instance, is that no matter how
> full you pack them to start with, the steady state is going to
> involve something like 1/3rd free space. You can call that bloat if
> you want, but it's not likely that you'll be able to reduce the
> number significantly without paying exorbitant costs. I'm not
> claiming that we don't have any problems, but I do think it's
> important to draw a distinction between bloat and normal operating
> overhead.

Agreed, but we aren't talking about 30% I don't think.

Here is where I am at. It took until 30 minutes ago for the tests to
finish:

                name                 | setting
-------------------------------------+---------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 2
 autovacuum_max_workers              | 3
 autovacuum_multixact_freeze_max_age | 4
 autovacuum_naptime                  | 60
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.2
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | -1

Test 1: 55G  /srv/main  TPS: 955
Test 2: 112G /srv/main  TPS: 531 (Not sure what happened here, long
checkpoint?)
Test 3: 109G /srv/main  TPS: 868
Test 4: 143G            TPS: 840
Test 5: 154G            TPS: 722

I am running the query here:
https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index

and will post a followup. Once the query finishes I am going to launch
the tests with autovacuum_vacuum_cost_limit of 5000. Is there anything
else you folks would like me to change?
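(For reference, the settings table above can be reproduced with a
query along these lines:

  select name, setting
  from pg_settings
  where name like '%autovacuum%'
  order by name;
)

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *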
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Thu, Jul 20, 2017 at 7:45 AM, Claudio Freire wrote:
>> For the purposes of this discussion, I'm mostly talking about
>> duplicates within a page on a unique index. If the keyspace owned by
>> an int4 unique index page only covers 20 distinct values, it will
>> only ever cover 20 distinct values, now and forever, despite the
>> fact that there is room for about 400 (a 90/10 split leaves you with
>> 366 items + 1 high key).
>
> Microvacuum could also help.
>
> If during a scan you find pointers that point to dead (in vacuum
> terms) tuples, the pointers in the index could be deleted. That could
> be done during insert into unique indexes before a split, to avoid
> the split.
>
> Chances are, if there are duplicates, at least a few of them will be
> dead.

My whole point is that that could easily fail to happen early enough
to prevent a page split that is only needed because of a short-term
surge in the number of duplicate versions that need to remain
available to one old snapshot. A page split can be a permanent
solution to a temporary problem. Page deletion can only occur under
tight conditions that are unlikely to *ever* be met in many cases.

Imagine if it were impossible to insert physical duplicates into
unique indexes. In that world, you'd end up bloating some overflow
data structure in UPDATE-heavy cases (where HOT doesn't work out). The
bloat wouldn't go on leaf pages, so you wouldn't get page splits, and
so you wouldn't end up with leaf pages that can only store 20 distinct
values now and forever, because that's the range of values represented
by the downlinks and the leaf's high key. That's a situation we
actually saw for the leftmost leaf page in Alik's Zipfian distribution
test.

The way that the keyspace is broken up is supposed to be balanced, and
to have long-term utility. Working against that to absorb a short-term
bloat problem is penny wise, pound foolish.

-- 
Peter Geoghegan
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Thu, Jul 20, 2017 at 6:28 AM, Stephen Frost wrote:
> Greetings,
>
> * Sokolov Yura (y.soko...@postgrespro.ru) wrote:
>> I wrote two days ago about the vacuum ring buffer:
>> https://www.postgresql.org/message-id/8737e9bddb82501da1134f021bf4929a%40postgrespro.ru
>>
>> Increasing vacuum's ring buffer to the size of the bulk writer's one
>> reduces autovacuum time by 3-10 times. (For both the patched and
>> unpatched versions I used a single non-default setting,
>> 'autovacuum_cost_delay=2ms'.)
>>
>> This is a single-line change, and it improves things a lot.
>
> Right- when the database fits in the OS cache but not in
> shared_buffers.

On a system with a slow fsync, increasing the ring buffer helps a lot
even if the database doesn't fit in the OS cache. When the next buffer
allocation runs into a dirtied buffer in the ring, it needs to sync
the WAL up through that buffer's LSN before it can write it out and
reuse it. With a small ring, this means a lot of WAL flushing needs to
be done.

> I do agree that's a useful improvement to make based on your testing.
>
> It's not clear off-hand how much that would improve this case, as the
> database size appears to pretty quickly get beyond the OS memory size
> (and only in the first test is the DB starting size less than system
> memory to begin with).

Also, this system probably has a pretty fast fdatasync, considering it
is SSD.

Cheers,

Jeff
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Thu, Jul 20, 2017 at 12:08 AM, Peter Geoghegan wrote:
>> The traditional wisdom about btrees, for instance, is that no matter
>> how full you pack them to start with, the steady state is going to
>> involve something like 1/3rd free space. You can call that bloat if
>> you want, but it's not likely that you'll be able to reduce the
>> number significantly without paying exorbitant costs.
>
> For the purposes of this discussion, I'm mostly talking about
> duplicates within a page on a unique index. If the keyspace owned by
> an int4 unique index page only covers 20 distinct values, it will
> only ever cover 20 distinct values, now and forever, despite the fact
> that there is room for about 400 (a 90/10 split leaves you with 366
> items + 1 high key).

Microvacuum could also help.

If during a scan you find pointers that point to dead (in vacuum
terms) tuples, the pointers in the index could be deleted. That could
be done during insert into unique indexes before a split, to avoid the
split.

Chances are, if there are duplicates, at least a few of them will be
dead.
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/20/2017 06:28 AM, Stephen Frost wrote:
> It's not clear off-hand how much that would improve this case, as the
> database size appears to pretty quickly get beyond the OS memory size
> (and only in the first test is the DB starting size less than system
> memory to begin with).

FYI, I will be posting new numbers in a few hours. I had been planning
on posting this last night but... KDE.

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Greetings,

* Sokolov Yura (y.soko...@postgrespro.ru) wrote:
> I wrote two days ago about the vacuum ring buffer:
> https://www.postgresql.org/message-id/8737e9bddb82501da1134f021bf4929a%40postgrespro.ru
>
> Increasing vacuum's ring buffer to the size of the bulk writer's one
> reduces autovacuum time by 3-10 times. (For both the patched and
> unpatched versions I used a single non-default setting,
> 'autovacuum_cost_delay=2ms'.)
>
> This is a single-line change, and it improves things a lot.

Right- when the database fits in the OS cache but not in
shared_buffers. I do agree that's a useful improvement to make based
on your testing.

It's not clear off-hand how much that would improve this case, as the
database size appears to pretty quickly get beyond the OS memory size
(and only in the first test is the DB starting size less than system
memory to begin with).

Thanks!

Stephen
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 2017-07-20 05:52, Masahiko Sawada wrote:
> On Thu, Jul 20, 2017 at 8:24 AM, Stephen Frost wrote:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> "Joshua D. Drake" writes:
>>>> At PGConf US Philly last week I was talking with Jim and Jan about
>>>> performance. One of the items that came up is that PostgreSQL
>>>> can't run full throttle for long periods of time. The long and
>>>> short is that no matter what, autovacuum can't keep up. This is
>>>> what I have done:
>>>
>>> Try reducing autovacuum_vacuum_cost_delay more, and/or increasing
>>> autovacuum_vacuum_cost_limit.
>>
>> Or get rid of the cost delay entirely and let autovacuum actually go
>> as fast as it can when it's run. The assertion that it can't keep up
>> is still plausible, but configuring autovacuum to sleep regularly
>> and then complaining that it's not able to keep up doesn't make
>> sense.
>>
>> Reducing the nap time might also be helpful if autovacuum is going
>> as fast as it can and it's able to clear a table in less than a
>> minute.
>>
>> There have been discussions on this list about parallel vacuum of a
>> particular table as well; to address this issue I'd encourage
>> reviewing those discussions and looking at writing a patch to
>> implement that feature, as that would address the case where the
>> table is large enough that autovacuum simply can't get through all
>> of it before the other backends have used all space available and
>> then substantially increased the size of the relation (leading to
>> vacuum on the table running for longer).
>
> Yeah, the parallel vacuum of a particular table might help this issue
> unless disk I/O is the bottleneck. I'm planning to work on that.

I wrote two days ago about the vacuum ring buffer:
https://www.postgresql.org/message-id/8737e9bddb82501da1134f021bf4929a%40postgrespro.ru

Increasing vacuum's ring buffer to the size of the bulk writer's one
reduces autovacuum time by 3-10 times. (For both the patched and
unpatched versions I used a single non-default setting,
'autovacuum_cost_delay=2ms'.)

This is a single-line change, and it improves things a lot.
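To compare autovacuum timings before and after a change like this, one
option (a sketch) is to log every autovacuum run:

  alter system set log_autovacuum_min_duration = 0;
  select pg_reload_conf();
  -- Each autovacuum run is then logged with pages and tuples removed
  -- plus buffer usage, which makes before/after comparisons easy.

(The test configs shown in this thread have it disabled, at -1.)

With regards,
-- 
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company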
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Wed, Jul 19, 2017 at 7:57 PM, Tom Lane wrote:
> Peter Geoghegan writes:
>> My argument for the importance of index bloat to the more general
>> bloat problem is simple: any bloat that accumulates, that cannot be
>> cleaned up, will probably accumulate until it impacts performance
>> quite noticeably.
>
> But that just begs the question: *does* it accumulate indefinitely,
> or does it eventually reach a more-or-less steady state?

Yes, I believe it does reach a more-or-less steady state. It saturates
when there is a lot of contention, because then you actually can reuse
the bloat. If it didn't saturate, and instead became arbitrarily bad,
then we'd surely have heard about that before now. The bloat is not
entirely wasted, because it actually prevents you from getting even
more bloat in that part of the keyspace.

> The traditional wisdom about btrees, for instance, is that no matter
> how full you pack them to start with, the steady state is going to
> involve something like 1/3rd free space. You can call that bloat if
> you want, but it's not likely that you'll be able to reduce the
> number significantly without paying exorbitant costs.

For the purposes of this discussion, I'm mostly talking about
duplicates within a page on a unique index. If the keyspace owned by
an int4 unique index page only covers 20 distinct values, it will only
ever cover 20 distinct values, now and forever, despite the fact that
there is room for about 400 (a 90/10 split leaves you with 366 items +
1 high key).

I don't know if I should really even call this bloat, since the term
is so overloaded, although this is what other database systems call
index bloat. I like to think of it as "damage to the keyspace",
although that terminology seems unlikely to catch on.
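The per-page picture is visible with the pageinspect extension; a
sketch (this assumes block 1 happens to be a leaf page, which is
typical but not guaranteed):

  create extension if not exists pageinspect;

  select live_items, dead_items, avg_item_size, free_size
  from bt_page_stats('bmsql_order_line_pkey', 1);

-- 
Peter Geoghegan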
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Peter Geoghegan writes:
> My argument for the importance of index bloat to the more general
> bloat problem is simple: any bloat that accumulates, that cannot be
> cleaned up, will probably accumulate until it impacts performance
> quite noticeably.

But that just begs the question: *does* it accumulate indefinitely, or
does it eventually reach a more-or-less steady state? The traditional
wisdom about btrees, for instance, is that no matter how full you pack
them to start with, the steady state is going to involve something
like 1/3rd free space. You can call that bloat if you want, but it's
not likely that you'll be able to reduce the number significantly
without paying exorbitant costs.

I'm not claiming that we don't have any problems, but I do think it's
important to draw a distinction between bloat and normal operating
overhead.

			regards, tom lane
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Thu, Jul 20, 2017 at 8:24 AM, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> "Joshua D. Drake" writes:
>>> At PGConf US Philly last week I was talking with Jim and Jan about
>>> performance. One of the items that came up is that PostgreSQL can't
>>> run full throttle for long periods of time. The long and short is
>>> that no matter what, autovacuum can't keep up. This is what I have
>>> done:
>>
>> Try reducing autovacuum_vacuum_cost_delay more, and/or increasing
>> autovacuum_vacuum_cost_limit.
>
> Or get rid of the cost delay entirely and let autovacuum actually go
> as fast as it can when it's run. The assertion that it can't keep up
> is still plausible, but configuring autovacuum to sleep regularly and
> then complaining that it's not able to keep up doesn't make sense.
>
> Reducing the nap time might also be helpful if autovacuum is going as
> fast as it can and it's able to clear a table in less than a minute.
>
> There have been discussions on this list about parallel vacuum of a
> particular table as well; to address this issue I'd encourage
> reviewing those discussions and looking at writing a patch to
> implement that feature, as that would address the case where the
> table is large enough that autovacuum simply can't get through all of
> it before the other backends have used all space available and then
> substantially increased the size of the relation (leading to vacuum
> on the table running for longer).

Yeah, the parallel vacuum of a particular table might help this issue
unless disk I/O is the bottleneck. I'm planning to work on that.

Regards,

-- 
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Wed, Jul 19, 2017 at 3:54 PM, Alvaro Herrera wrote:
>> Index bloat is a general problem that B-Trees have in all other
>> major systems, but I think that PostgreSQL has a tendency to allow
>> indexes to become progressively more bloated over time, in a way
>> that it often can never recover from [1].
>
> Interesting assertion.

I don't pretend to understand the complicated feedback loops that may
exist for workloads that are very reliant on pruning,
kill_prior_tuples cleanup, very aggressive vacuuming, etc. I'll just
say that it seems very reasonable to suppose that they exist.

My argument for the importance of index bloat to the more general
bloat problem is simple: any bloat that accumulates, that cannot be
cleaned up, will probably accumulate until it impacts performance
quite noticeably.

Index bloat may not seem as important as I suggest to some. The ideas
I'm talking about were received somewhat skeptically at pgCon, when I
brought them up informally. Maybe this is a more cogent explanation:
if your indexes were somehow magically never bloated, but the heap
could become just as bloated, then the heap bloat might not matter
that much, simply because the bloated heap pages wouldn't be accessed
by index scans. Heap page accesses may demonstrably be the bottleneck
today, without that magic in place, but perhaps only because indexes
point to the bloat in the first place. It could be a bit like the
situation with bloating UNDO segments in Oracle; those are not the
worst thing to have to bloat. And, the kill_prior_tuples stuff is only
as strong as the weakest link (oldest snapshot).

> Many years ago I set out to implement btree page merging[1] from
> ideas in a 1996 paper[2], though that work never saw the light of
> day. Maybe it can be valuable now.

Lots of other database systems don't implement B-Tree page merging,
because it's hard to make work with techniques like Lehman & Yao
B-Link trees, and the average-case space utilization still ends up
being good enough. It may be more worthwhile for us, though.
Apparently Kevin has some ideas here.

If I'm right about this index bloat issue, then I'd sooner tackle the
problem by preventing bloat in unique indexes in the first place, by
using some additional indirection, a topic that I've said plenty about
recently. I think that you can sometimes, though fairly rarely, see
cases that get *really* out of hand, with things like in-database
queues that have unusually aggressive update patterns where a great
many duplicates are generated in indexes [1]. Our handling of
duplicates in unique indexes [2] is surely a problem there.

> Another thing to consider is indirect indexes, wherein you get less
> heap bloat because more updates can become HOT.

I think that the stuff I'm talking about, having indirection for the
primary key and making sure unique indexes actually disallow even
physical duplicates, actually enables indirect indexes. Remember how I
pointed out issues with unique indexes and VACUUM when you presented
on it at the pgCon unconference? I think that those problems may be
solvable by centralizing everything: making duplicates within leaf
pages physically impossible for unique indexes (not just logically
impossible). Having one chokepoint for dealing with bloat cleanup for
every index is the entire point of your indirect index design, but
that requires that you actually have no ambiguity about what every
logical pointer (in every secondary index) points to.

> If we also allow heap to be pruned from line pointers by walking
> indexes to remove specific pointers, instead of requiring indexes to
> be scanned whole for the removal, as proposed by Andres, we could get
> further along.

Yeah. I talked to Robert about doing that just today. That seems very
worthwhile, no matter how it ends up being used (whether it's by
VACUUM, something closer to synchronous deletion, or whatever).

[1] https://brandur.org/postgres-queues
[2] https://wiki.postgresql.org/wiki/Key_normalization#Avoiding_unnecessary_unique_index_enforcement

-- 
Peter Geoghegan
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> "Joshua D. Drake" writes:
>> At PGConf US Philly last week I was talking with Jim and Jan about
>> performance. One of the items that came up is that PostgreSQL can't
>> run full throttle for long periods of time. The long and short is
>> that no matter what, autovacuum can't keep up. This is what I have
>> done:
>
> Try reducing autovacuum_vacuum_cost_delay more, and/or increasing
> autovacuum_vacuum_cost_limit.

Or get rid of the cost delay entirely and let autovacuum actually go
as fast as it can when it's run. The assertion that it can't keep up
is still plausible, but configuring autovacuum to sleep regularly and
then complaining that it's not able to keep up doesn't make sense.

Reducing the nap time might also be helpful if autovacuum is going as
fast as it can and it's able to clear a table in less than a minute.

There have been discussions on this list about parallel vacuum of a
particular table as well; to address this issue I'd encourage
reviewing those discussions and looking at writing a patch to
implement that feature, as that would address the case where the table
is large enough that autovacuum simply can't get through all of it
before the other backends have used all space available and then
substantially increased the size of the relation (leading to vacuum on
the table running for longer).

Thanks!

Stephen
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/19/2017 03:29 PM, Tom Lane wrote:
> "Joshua D. Drake" writes:
>> At PGConf US Philly last week I was talking with Jim and Jan about
>> performance. One of the items that came up is that PostgreSQL can't
>> run full throttle for long periods of time. The long and short is
>> that no matter what, autovacuum can't keep up. This is what I have
>> done:
>
> Try reducing autovacuum_vacuum_cost_delay more, and/or increasing
> autovacuum_vacuum_cost_limit.

I would try autovacuum_vacuum_cost_delay = 0, and for any tables with
more than 1 million rows:

  autovacuum_vacuum_scale_factor: 0
  autovacuum_vacuum_threshold: 10 (perhaps even smaller)
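Applied to the test's big table, that might look like this (a sketch;
the values mirror the suggestion above):

  alter system set autovacuum_vacuum_cost_delay = 0;
  select pg_reload_conf();

  -- Per-table override for large, heavily updated tables:
  alter table bmsql_order_line set (
      autovacuum_vacuum_scale_factor = 0,
      autovacuum_vacuum_threshold = 10
  );

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development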
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Peter Geoghegan wrote:
> Index bloat is a general problem that B-Trees have in all other major
> systems, but I think that PostgreSQL has a tendency to allow indexes
> to become progressively more bloated over time, in a way that it
> often can never recover from [1].

Interesting assertion. Many years ago I set out to implement btree
page merging[1] from ideas in a 1996 paper[2], though that work never
saw the light of day. Maybe it can be valuable now.

Another thing to consider is indirect indexes, wherein you get less
heap bloat because more updates can become HOT. If we also allow the
heap to be pruned from line pointers by walking indexes to remove
specific pointers, instead of requiring indexes to be scanned whole
for the removal, as proposed by Andres, we could get further along.

[1] https://www.postgresql.org/message-id/20020912235429.4714071a.alvhe...@atentus.com
[2] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
"Joshua D. Drake" writes: > At PGConf US Philly last week I was talking with Jim and Jan about > performance. One of the items that came up is that PostgreSQL can't run > full throttle for long periods of time. The long and short is that no > matter what, autovacuum can't keep up. This is what I have done: Try reducing autovacuum_vacuum_cost_delay more, and/or increasing autovacuum_vacuum_cost_limit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Wed, Jul 19, 2017 at 3:11 PM, Joshua D. Drake wrote:
> The good news is, PostgreSQL is not doing half bad against 128
> connections with only 16 vCPU. The bad news is we more than doubled
> our disk size without getting reuse or bloat under control. The
> concern here is that under heavy write loads that are persistent, we
> will eventually bloat out and have to VACUUM FULL, no matter what. I
> know that Jan has done some testing, and the best he could get is
> something like 8 days before PostgreSQL became unusable (but don't
> quote me on that).
>
> I am open to comments, suggestions, running multiple tests with
> different parameters, or just leaving this in the archive for people
> to reference.

Did you see my blog post on Planet PostgreSQL from last night?

https://pgeoghegan.blogspot.com/2017/07/postgresql-index-bloat-microscope.html

Perhaps you could use my query to instrument an interesting index, to
see what that turns up. I would really like to get a better sense of
how often, and to what extent, index bloat is a problem that VACUUM is
just inherently incapable of keeping under control over time. The
timeline for performance to degrade with real production workloads is
very interesting to me. It's really hard to simulate certain types of
problems that you will see frequently in production.

Index bloat is a general problem that B-Trees have in all other major
systems, but I think that PostgreSQL has a tendency to allow indexes
to become progressively more bloated over time, in a way that it often
can never recover from [1]. This may be a particular problem with
unique indexes, where many physical duplicates accumulate in pages.
These duplicates are theoretically reclaimable, but due to how the
keyspace is split up, they will never actually be reclaimed [2].

[1] postgr.es/m/CAH2-Wzmf6intNY1ggiNzOziiO5Eq=DsXfeptODGxO=2j-i1...@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Key_normalization#VACUUM_and_nbtree_page_deletion

-- 
Peter Geoghegan