Re: [GENERAL] Strange case of database bloat

2017-07-26 Thread Peter Geoghegan
On Tue, Jul 4, 2017 at 10:18 PM, Chris Travers  wrote:
> First, I haven't seen major problems of database bloat in a long time which
> is why I find this case strange.  I wanted to ask here what may be causing
> it.
>
> Problem:
> ==
> Database is in the 100GB to 200GB size range, running on btrfs (not my
> choice) with nodatacow enabled (which I set up to fix a performance issue).
> The workload is a very heavy batch-update workload.
>
> The database bloats linearly.  I have measured this on one table (of 149M
> rows).
>
> After vacuum full this table is (including indexes): 17GB
> Every 24 hrs, seems to add its original space in size to the file system
> +/-.
>
> Bloat seems to be affecting both indexes and underlying tables.
>
> Vacuum verbose does not indicate a disproportionate number of rows being
> unremovable.  So autovacuum is keeping up without too much difficulty.

That is odd. I find the linear increase in bloat alarming.

I found this thread in the context of investigating a possible
performance regression in 9.5 [1]. It's far from clear that that's
actually what's up here, but it's worth considering as a possibility.
Some questions about that:

Can you show the schema of at least one of the tables that are most
affected, its indexes, etc?

Are all indexes bloated equally?

Do you notice that unique indexes are not as badly affected as other
indexes? How does it break down, in terms of how much each individual
index grows in size?

You say that the problem is with both indexes and tables. How much of
this is table bloat, and how much is index bloat?
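
For example, a per-index size snapshot taken now and again after 24 hours
would show that breakdown (a sketch; 'your_table' stands in for one of the
affected tables):

```sql
-- Snapshot the size of every index on the table; run it again after
-- 24 hours and diff the results to see which indexes grow fastest.
SELECT indexrelid::regclass AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_index
WHERE indrelid = 'your_table'::regclass
ORDER BY pg_relation_size(indexrelid) DESC;
```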

Thanks

[1] 
https://postgr.es/m/CAH2-Wz=sfakvmv1x9jh19ej8am8tzn9f-yecips9hrrrqss...@mail.gmail.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Jeff Janes
On Tue, Jul 4, 2017 at 10:18 PM, Chris Travers 
wrote:

>
> Questions
> ===
> I assume that it is the fact that rows update frequently which is the
> problem here? But why doesn't Postgres re-use any of the empty disk pages?
>

Can you install the contrib extension pg_freespacemap and use "select *
from pg_freespace('table_name')" to see if PostgreSQL agrees that the space
is re-usable?
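
For instance (a sketch; 'table_name' stands in for one of the bloating
tables):

```sql
-- pg_freespace() reports, per page, how much free space (in bytes) the
-- free space map believes is available for reuse. Large 'avail' values
-- on early pages that are never refilled would point away from vacuum.
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

SELECT blkno, avail
FROM pg_freespace('table_name')
ORDER BY blkno
LIMIT 50;
```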

Cheers,

Jeff


Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Chris Travers
On Wed, Jul 5, 2017 at 3:51 PM, Bill Moran  wrote:

> On Wed, 5 Jul 2017 13:28:29 +0200
> Chris Travers  wrote:
>
> > On Wed, Jul 5, 2017 at 1:00 PM, PT  wrote:
> >
> > > 2x the working size for a frequently updated table isn't terrible bloat.
> > > Or are you saying it grows 2x every 24 hours and keeps growing? The real
> > > question is how often the table is being vacuumed. How long have you let
> > > the experiment run for? Does the table find an equilibrium size where it
> > > stops growing? Have you turned on logging for autovacuum to see how often
> > > it actually runs on this table?
> >
> > If it were only twice it would not bother me.  The fact that it is twice
> > after 24 hrs, 3x after 48 hrs and 4x after 72 hrs is alarming.
>
> Ok, yup, that seems like an issue.
>
> > > No unremovable rows does not indicate that autovacuum is keeping up. It
> > > just
> > > indicates that you don't have a problem with uncommitted transactions
> > > holding
> > > rows for long periods of time.
> >
> > Right.  I should have specified that I also have not seen auto vacuum in
> > pg_stat_activity with an unusual duration.
>
> How long does it take when you run it manually? My experience is that
> autovac
> can take orders of magnitude longer with the default cost delays, but just
> because you don't see it, doesn't mean it's not happening. Turn on autovac
> logging and check the logs after a few days.
>

A few minutes for a normal vacuum, maybe 20-30 minutes for a vacuum full (on
one of the large tables).

>
> > > Have you looked at tuning the autovacuum parameters for this table? More
> > > frequent vacuums should keep things more under control. However, if the
> > > write load is heavy, you'll probably want to lower
> > > autovacuum_vacuum_cost_delay. Personally, I feel like the default value
> > > for this should be 0, but there are likely those that would debate that.
> > > In any event, if that setting is too high it can cause autovacuum to take
> > > so long that it can't keep up. In theory, setting it too low can cause
> > > autovacuum to have a negative performance impact, but I've never seen
> > > that happen on modern hardware.
> >
> > Most of the writes are periodic (hourly?) batch updates which are fairly
> > big.
>
> Not sure how that statement is related to the comments I made preceding it.
>

Not using cost-based vacuum afaik but will check that.  It is a good point.

>
> > > But that's all speculation until you know how frequently autovacuum runs
> > > on that table and how long it takes to do its work.
> >
> > Given the other time I have seen similar behaviour, the question in my
> > mind is why free pages near the beginning of the table don't seem to be
> > re-used.
>
> It's possible that the early pages don't have enough usable space for the
> updated rows. Depending on your update patterns, you may end up with bloat
> scattered across many pages, with no individual page having enough space to
> be reused. That seems unlikely as the bloat becomes many times the used
> space, though.
>

The first 35 pages are completely empty.  As I say, I have seen this sort of
thing before (and I wonder if empty pages early in a table are somehow
biased against in terms of writes).

>
> The pg_freespacemap extension should be useful in determining if that's
> what's happening. Combine that with turning on logging to ensure that
> autovacuum is actually operating effectively.
>

I am not convinced it is a vacuum problem.  Would it be likely that large
batch updates would linearly continue to write pages forward as a
performance optimisation?


>
> --
> Bill Moran 
>



-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Bill Moran
On Wed, 5 Jul 2017 13:28:29 +0200
Chris Travers  wrote:

> On Wed, Jul 5, 2017 at 1:00 PM, PT  wrote:
> 
> > 2x the working size for a frequently updated table isn't terrible bloat.
> > Or are
> > you saying it grows 2x every 24 hours and keeps growing? The real question
> > is
> > how often the table is being vacuumed. How long have you let the
> > experiment run
> > for? Does the table find an equilibrium size where it stops growing? Have
> > you
> > turned on logging for autovacuum to see how often it actually runs on this
> > table?
> 
> If it were only twice it would not bother me.  The fact that it is twice
> after 24 hrs, 3x after 48 hrs and 4x after 72 hrs is alarming.

Ok, yup, that seems like an issue.

> > No unremovable rows does not indicate that autovacuum is keeping up. It
> > just
> > indicates that you don't have a problem with uncommitted transactions
> > holding
> > rows for long periods of time.
> 
> Right.  I should have specified that I also have not seen auto vacuum in
> pg_stat_activity with an unusual duration.

How long does it take when you run it manually? My experience is that autovac
can take orders of magnitude longer with the default cost delays, but just
because you don't see it, doesn't mean it's not happening. Turn on autovac
logging and check the logs after a few days.

> > Have you looked at tuning the autovacuum parameters for this table? More
> > frequent
> > vacuums should keep things more under control. However, if the write load
> > is
> > heavy, you'll probably want to lower autovacuum_vacuum_cost_delay.
> > Personally,
> > I feel like the default value for this should be 0, but there are likely
> > those
> > that would debate that. In any event, if that setting is too high it can
> > cause
> > autovacuum to take so long that it can't keep up. In theory, setting it
> > too low
> > can cause autovacuum to have a negative performance impact, but I've never
> > seen
> > that happen on modern hardware.
> 
> Most of the writes are periodic (hourly?) batch updates which are fairly
> big.

Not sure how that statement is related to the comments I made preceding it.

> > But that's all speculation until you know how frequently autovacuum runs on
> > that table and how long it takes to do its work.
> 
> Given the other time I have seen similar behaviour, the question in my mind
> is why free pages near the beginning of the table don't seem to be re-used.

It's possible that the early pages don't have enough usable space for the 
updated
rows. Depending on your update patterns, you may end up with bloat scattered 
across
many pages, with no individual page having enough space to be reused. That seems
unlikely as the bloat becomes many times the used space, though.

The pg_freespacemap extension should be useful in determining if that's what's
happening. Combine that with turning on logging to ensure that autovacuum is
actually operating effectively. 

-- 
Bill Moran 




Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Scott Mead
On Wed, Jul 5, 2017 at 7:28 AM, Chris Travers 
wrote:

>
>
> On Wed, Jul 5, 2017 at 1:00 PM, PT  wrote:
>
>>
>> 2x the working size for a frequently updated table isn't terrible bloat.
>> Or are
>> you saying it grows 2x every 24 hours and keeps growing? The real
>> question is
>> how often the table is being vacuumed. How long have you let the
>> experiment run
>> for? Does the table find an equilibrium size where it stops growing? Have
>> you
>> turned on logging for autovacuum to see how often it actually runs on this
>> table?
>>
>
> If it were only twice it would not bother me.  The fact that it is twice
> after 24 hrs, 3x after 48 hrs and 4x after 72 hrs is alarming.
>
>>
>> No unremovable rows does not indicate that autovacuum is keeping up. It
>> just
>> indicates that you don't have a problem with uncommitted transactions
>> holding
>> rows for long periods of time.
>>
>
> Right.  I should have specified that I also have not seen auto vacuum in
> pg_stat_activity with an unusual duration.
>

What about anything WHERE state = 'idle in transaction'?
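
Something like this would show them (a sketch against 9.5's
pg_stat_activity):

```sql
-- Sessions idle inside an open transaction hold back the xmin horizon,
-- so vacuum cannot reclaim rows they might still need to see.
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
```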



>
>> Have you looked at tuning the autovacuum parameters for this table? More
>> frequent
>> vacuums should keep things more under control. However, if the write load
>> is
>> heavy, you'll probably want to lower autovacuum_vacuum_cost_delay.
>> Personally,
>> I feel like the default value for this should be 0, but there are likely
>> those
>> that would debate that. In any event, if that setting is too high it can
>> cause
>> autovacuum to take so long that it can't keep up. In theory, setting it
>> too low
>> can cause autovacuum to have a negative performance impact, but I've
>> never seen
>> that happen on modern hardware.
>>
>
> Most of the writes are periodic (hourly?) batch updates which are fairly
> big.
>

I've had similar issues when each update makes a row larger than any of the
available slots.  I had a workload (admittedly on an older version of
Postgres) where we were updating every row a few times a day.  Each time,
the row (a bytea field) would grow about 0.5 - 5.0%.  This would prevent us
from using freespace (it was all too small).  The only ways around this were:

1. Run manual table rebuilds (this was before pg_repack / reorg).  Use
pg_repack now
2. Fix the app

  Essentially, I would do targeted, aggressive vacuuming and then, once a
month (or once I hit a bloat threshold), do a repack (again, it was my
custom process back then).  This was the bandage until I could get the app
fixed to stop churning so badly.
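
If it helps, the contrib extension pgstattuple is one way to quantify the
bloat before deciding to repack (a sketch; 'your_table' is a placeholder):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- pgstattuple() scans the whole table, so run it off-peak on big tables.
SELECT table_len,
       tuple_percent,       -- live data as a share of the file
       dead_tuple_percent,  -- dead rows vacuum has not yet removed
       free_percent         -- reusable space already on the free space map
FROM pgstattuple('your_table');
```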


>
>> But that's all speculation until you know how frequently autovacuum runs
>> on
>> that table and how long it takes to do its work.
>>
>
> Given the other time I have seen similar behaviour, the question in my
> mind is why free pages near the beginning of the table don't seem to be
> re-used.
>
> I would like to try to verify that however, if you have any ideas.
>
>>
>> --
>> PT 
>>
>
>
>
> --
> Best Wishes,
> Chris Travers
>
> Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
> lock-in.
> http://www.efficito.com/learn_more
>



-- 
Scott Mead
Sr. Architect
*OpenSCG *
http://openscg.com


Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Chris Travers
On Wed, Jul 5, 2017 at 1:00 PM, PT  wrote:

>
> 2x the working size for a frequently updated table isn't terrible bloat.
> Or are
> you saying it grows 2x every 24 hours and keeps growing? The real question
> is
> how often the table is being vacuumed. How long have you let the
> experiment run
> for? Does the table find an equilibrium size where it stops growing? Have
> you
> turned on logging for autovacuum to see how often it actually runs on this
> table?
>

If it were only twice it would not bother me.  The fact that it is twice
after 24 hrs, 3x after 48 hrs and 4x after 72 hrs is alarming.

>
> No unremovable rows does not indicate that autovacuum is keeping up. It
> just
> indicates that you don't have a problem with uncommitted transactions
> holding
> rows for long periods of time.
>

Right.  I should have specified that I also have not seen auto vacuum in
pg_stat_activity with an unusual duration.

>
> Have you looked at tuning the autovacuum parameters for this table? More
> frequent
> vacuums should keep things more under control. However, if the write load
> is
> heavy, you'll probably want to lower autovacuum_vacuum_cost_delay.
> Personally,
> I feel like the default value for this should be 0, but there are likely
> those
> that would debate that. In any event, if that setting is too high it can
> cause
> autovacuum to take so long that it can't keep up. In theory, setting it
> too low
> can cause autovacuum to have a negative performance impact, but I've never
> seen
> that happen on modern hardware.
>

Most of the writes are periodic (hourly?) batch updates which are fairly
big.

>
> But that's all speculation until you know how frequently autovacuum runs on
> that table and how long it takes to do its work.
>

Given the other time I have seen similar behaviour, the question in my mind
is why free pages near the beginning of the table don't seem to be re-used.

I would like to try to verify that however, if you have any ideas.

>
> --
> PT 
>



-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread PT
On Wed, 5 Jul 2017 07:18:03 +0200
Chris Travers  wrote:

> Hi;
> 
> First, I haven't seen major problems of database bloat in a long time which
> is why I find this case strange.  I wanted to ask here what may be causing
> it.
> 
> Problem:
> ==
> Database is in the 100GB to 200GB size range, running on btrfs (not my
> choice) with nodatacow enabled (which I set up to fix a performance
> issue).  The workload is a very heavy batch-update workload.
> 
> The database bloats linearly.  I have measured this on one table (of 149M
> rows).
> 
> After vacuum full this table is (including indexes): 17GB
> Every 24 hrs, seems to add its original space in size to the file system
> +/-.
> 
> Bloat seems to be affecting both indexes and underlying tables.
> 
> Vacuum verbose does not indicate a disproportionate number of rows being
> unremovable.  So autovacuum is keeping up without too much difficulty.
> 
> 
> Troubleshooting so far
> ===
> 
>  filefrag finds a single extent on each file, so copy-on-write is not the
> culprit
> 
> Selecting the smallest 10 values of ctid from one of the bloating tables
> shows the first page used is around page 35 with one row per used page (and
> large gaps in between).
> 
> Questions
> ===
> I assume that it is the fact that rows update frequently which is the
> problem here? But why doesn't Postgres re-use any of the empty disk pages?
> 
> More importantly, is there anything that can be done to mitigate this issue
> other than a frequent vacuum full?

2x the working size for a frequently updated table isn't terrible bloat. Or are
you saying it grows 2x every 24 hours and keeps growing? The real question is
how often the table is being vacuumed. How long have you let the experiment run
for? Does the table find an equilibrium size where it stops growing? Have you
turned on logging for autovacuum to see how often it actually runs on this
table?

No unremovable rows does not indicate that autovacuum is keeping up. It just
indicates that you don't have a problem with uncommitted transactions holding
rows for long periods of time.

Have you looked at tuning the autovacuum parameters for this table? More 
frequent
vacuums should keep things more under control. However, if the write load is
heavy, you'll probably want to lower autovacuum_vacuum_cost_delay. Personally,
I feel like the default value for this should be 0, but there are likely those
that would debate that. In any event, if that setting is too high it can cause
autovacuum to take so long that it can't keep up. In theory, setting it too low
can cause autovacuum to have a negative performance impact, but I've never seen
that happen on modern hardware.
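
For what it's worth, these can be overridden per table, so the rest of the
cluster keeps the defaults (illustrative values, not a recommendation;
'your_table' is a placeholder):

```sql
-- Make autovacuum both more eager and faster on just this table.
ALTER TABLE your_table SET (
    autovacuum_vacuum_cost_delay = 0,      -- vacuum at full speed
    autovacuum_vacuum_scale_factor = 0.01  -- trigger at ~1% dead rows
);
```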

But that's all speculation until you know how frequently autovacuum runs on
that table and how long it takes to do its work.

-- 
PT 




Re: [GENERAL] Strange case of database bloat

2017-07-05 Thread Chris Travers
On Wed, Jul 5, 2017 at 7:18 AM, Chris Travers 
wrote:

> Hi;
>
> First, I haven't seen major problems of database bloat in a long time
> which is why I find this case strange.  I wanted to ask here what may be
> causing it.
>
> Problem:
> ==
> Database is in the 100GB to 200GB size range, running on btrfs (not my
> choice) with nodatacow enabled (which I set up to fix a performance
> issue).  The workload is a very heavy batch-update workload.
>
> The database bloats linearly.  I have measured this on one table (of 149M
> rows).
>
> After vacuum full this table is (including indexes): 17GB
> Every 24 hrs, seems to add its original space in size to the file system
> +/-.
>
> Bloat seems to be affecting both indexes and underlying tables.
>
> Vacuum verbose does not indicate a disproportionate number of rows being
> unremovable.  So autovacuum is keeping up without too much difficulty.
>
>
> Troubleshooting so far
> ===
>
>  filefrag finds a single extent on each file, so copy-on-write is not the
> culprit
>
> Selecting the smallest 10 values of ctid from one of the bloating tables
> shows the first page used is around page 35 with one row per used page (and
> large gaps in between).
>
> Questions
> ===
> I assume that it is the fact that rows update frequently which is the
> problem here? But why doesn't Postgres re-use any of the empty disk pages?
>
> More importantly, is there anything that can be done to mitigate this
> issue other than a frequent vacuum full?
>

Two points I think I forgot to mention:

This is PostgreSQL 9.5.1

Last I saw something similar was a more "minor" case on a larger db, on
PostgreSQL 9.3.x

The more minor case was a small table (maybe 20k rows) which had bloated to
1GB in size due to this same sort of problem, but we ignored it because the
table was cached all the time and, at the RAM we were using, it wasn't a
significant drain on performance.  However, here it is.

First 20 CTIDs from one table:

(35,25)
(48,15)
(76,20)
(77,20)
(83,20)
(96,19)
(100,19)
(103,13)
(111,9)
(115,12)
(124,11)
(120,12)
(131,12)
(137,12)
(150,14)
(152,12)
(157,20)
(162,14)


> --
> Best Wishes,
> Chris Travers
>
> Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
> lock-in.
> http://www.efficito.com/learn_more
>



-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


[GENERAL] Strange case of database bloat

2017-07-04 Thread Chris Travers
Hi;

First, I haven't seen major problems of database bloat in a long time which
is why I find this case strange.  I wanted to ask here what may be causing
it.

Problem:
==
Database is in the 100GB to 200GB size range, running on btrfs (not my
choice) with nodatacow enabled (which I set up to fix a performance
issue).  The workload is a very heavy batch-update workload.

The database bloats linearly.  I have measured this on one table (of 149M
rows).

After vacuum full this table is (including indexes): 17GB
Every 24 hrs, seems to add its original space in size to the file system
+/-.

Bloat seems to be affecting both indexes and underlying tables.

Vacuum verbose does not indicate a disproportionate number of rows being
unremovable.  So autovacuum is keeping up without too much difficulty.


Troubleshooting so far
===

filefrag finds a single extent on each file, so copy-on-write is not the
culprit.

Selecting the smallest 10 values of ctid from one of the bloating tables
shows the first page used is around page 35 with one row per used page (and
large gaps in between).
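
The check was along these lines (a sketch; 'bloated_table' is a
placeholder):

```sql
-- ctid is (page, tuple). If the lowest page number among live rows is
-- well above 0, the earliest pages are empty but not being refilled.
SELECT ctid
FROM bloated_table
ORDER BY ctid
LIMIT 10;
```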

Questions
===
I assume that it is the fact that rows update frequently which is the
problem here? But why doesn't Postgres re-use any of the empty disk pages?

More importantly, is there anything that can be done to mitigate this issue
other than a frequent vacuum full?

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more