Re: [HACKERS] Optimization for lazy_scan_heap

2016-10-17 Thread Masahiko Sawada
On Mon, Oct 3, 2016 at 10:59 AM, Michael Paquier
 wrote:
> On Mon, Sep 26, 2016 at 5:26 AM, Rahila Syed  wrote:
>> Some initial comments on optimize_lazy_scan_heap_v2.patch.
>
> Seems worth pursuing. Marking as returned with feedback because of
> lack of activity and some basic reviews sent.

Thank you for reviewing this patch, and sorry for my late reply.

I've measured the performance improvement of this patch, but the
result shows that it improves vacuum freeze performance by only about
30 seconds on a 32TB table. I don't think this patch is worth pursuing
any further, so I'd like to withdraw it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for lazy_scan_heap

2016-10-02 Thread Michael Paquier
On Mon, Sep 26, 2016 at 5:26 AM, Rahila Syed  wrote:
> Some initial comments on optimize_lazy_scan_heap_v2.patch.

Seems worth pursuing. Marking as returned with feedback because of
lack of activity and some basic reviews sent.
-- 
Michael




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-25 Thread Rahila Syed
Hello,

Some initial comments on optimize_lazy_scan_heap_v2.patch.

>-   while (next_unskippable_block < nblocks)
>+   while (next_unskippable_block < nblocks &&
>+  !FORCE_CHECK_PAGE(next_unskippable_block))

Don't we need a similar !FORCE_CHECK_PAGE(next_unskippable_block) check
in the code below, which appears around line no 556 of vacuumlazy.c?

next_unskippable_block = 0;
if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
{
while (next_unskippable_block < nblocks)


>   {
>if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
>break;
>+   n_all_frozen++;
>}
>else
>{
>if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
>break;
>+
>+   /* Count the number of all-frozen pages */
>+   if (vmstatus & VISIBILITYMAP_ALL_FROZEN)
>+   n_all_frozen++;
>}

I think it's better to increment n_all_frozen in just one place, before
the if-else block.

>}
>vacuum_delay_point();
>next_unskippable_block++;
>+   n_skipped++;
>}
>}

Also, instead of incrementing n_skipped every time, it can be set to
'next_unskippable_block' or 'next_unskippable_block - blkno'
once at the end of the loop.

Thank you,
Rahila Syed



On Thu, Aug 25, 2016 at 11:52 AM, Masahiko Sawada 
wrote:

> On Thu, Aug 25, 2016 at 1:41 AM, Anastasia Lubennikova
>  wrote:
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  tested, passed
> > Implements feature:   not tested
> > Spec compliant:   not tested
> > Documentation:            not tested
> >
> > Hi,
> > I haven't tested the performance yet, but the patch itself looks
> > pretty good and is a reasonable improvement.
> > I have a question about removing the comment. It seems to be a really
> > tricky moment. How do we know that an all-frozen block hasn't changed
> > since the moment we checked it?
>
> I think we don't need to check whether an all-frozen block has changed
> since we checked it.
> Even if the all-frozen block has changed after we saw it as an
> all-frozen page, we can still update relfrozenxid, because any new
> XIDs added to that page are newer than the GlobalXmin we computed.
>
> Am I missing something?
>
> In this patch, since we count the number of all-frozen pages even
> during a non-aggressive scan, I thought that we don't need to check
> whether these blocks were all-frozen pages.
>
> > I'm going to test the performance this week.
> > I wonder if you could send a test script or describe the steps to
> > test it?
>
> This optimization reduces the number of visibility map scans when the
> table is almost all-visible or all-frozen.
> So it would be effective for a very large table (more than several
> hundred GB) that is almost entirely all-visible or all-frozen.
>
> For example,
> 1. Create a very large table.
> 2. Execute VACUUM FREEZE to freeze all pages of the table.
> 3. Measure the execution time of VACUUM FREEZE.
> I hope that the second VACUUM FREEZE becomes a little faster.
>
> I modified the comment, and attached v2 patch.
>
> Regards,
>
> --
> Masahiko Sawada
>
>


Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-08 Thread Masahiko Sawada
On Thu, Sep 8, 2016 at 11:27 PM, Jim Nasby  wrote:
> On 9/8/16 3:03 AM, Masahiko Sawada wrote:
>>
>> Current manual vacuum doesn't output how many all-frozen pages we
>> skipped according to the visibility map.
>> That's why I attached the 0001 patch, which makes manual vacuum emit
>> such information.
>
>
> I think we should add that, and info about all-frozen skips, regardless of
> the rest of these changes. Can you submit that separately to the commitfest?
> (Or I can if you want.)

Yeah, autovacuum emits such information but manual vacuum doesn't.
I submitted that patch before[1], but it was not committed because the
verbose output is already complicated.
But I will submit it again and start a discussion about that.

[1] 
https://www.postgresql.org/message-id/CAD21AoCQGTKDxQ6YfrJ0%2B%3Dev6A3i3pt2ULdWxCtwPLKR2E77jg%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-08 Thread Robert Haas
On Thu, Sep 8, 2016 at 4:03 AM, Masahiko Sawada  wrote:
> On Wed, Sep 7, 2016 at 4:11 PM, Simon Riggs  wrote:
>> On 7 September 2016 at 04:13, Masahiko Sawada  wrote:
>>
>>> Since current HEAD can scan the visibility map twice, the execution
>>> time with the patch is approximately half that of HEAD.
>>
>> Sounds good.
>>
>> To ensure we are doing exactly the same amount of work as before, did
>> you see the output of VACUUM VERBOSE?
>
> Sorry, the previous test result I posted was wrong.
> I reran the performance test and the results are:
>
> * 1TB Table(visibility map size is 32MB)
> HEAD : 4853.250 ms (00:04.853)
> Patched : 3805.789 ms (00:03.806)
>
> * 8TB Table(visibility map size is 257MB)
> HEAD : 37853.891 ms (00:37.854)
> Patched : 30153.943 ms (00:30.154)
>
> * 32TB Table(visibility map size is 1GB)
> HEAD: 151908.344 ms (02:31.908)
> Patched: 120560.037 ms (02:00.560)
>
> Since visibility map pages can be cached in shared buffers or the OS
> cache by the first scan, the benefit of this patch seems not to be large.

Yeah, that's not nearly as good as the first set of results.  Maybe
it's still worth doing, but if you need to have a 32TB table to save
as much as 30 seconds, we don't have much of a problem here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-08 Thread Jim Nasby

On 9/8/16 3:03 AM, Masahiko Sawada wrote:

Current manual vacuum doesn't output how many all-frozen pages we
skipped according to the visibility map.
That's why I attached the 0001 patch, which makes manual vacuum emit
such information.


I think we should add that, and info about all-frozen skips, regardless 
of the rest of these changes. Can you submit that separately to the 
commitfest? (Or I can if you want.)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-08 Thread Masahiko Sawada
On Wed, Sep 7, 2016 at 4:11 PM, Simon Riggs  wrote:
> On 7 September 2016 at 04:13, Masahiko Sawada  wrote:
>
>> Since current HEAD can scan the visibility map twice, the execution
>> time with the patch is approximately half that of HEAD.
>
> Sounds good.
>
> To ensure we are doing exactly the same amount of work as before, did
> you see the output of VACUUM VERBOSE?

Sorry, the previous test result I posted was wrong.
I reran the performance test and the results are:

* 1TB Table(visibility map size is 32MB)
HEAD : 4853.250 ms (00:04.853)
Patched : 3805.789 ms (00:03.806)

* 8TB Table(visibility map size is 257MB)
HEAD : 37853.891 ms (00:37.854)
Patched : 30153.943 ms (00:30.154)

* 32TB Table(visibility map size is 1GB)
HEAD: 151908.344 ms (02:31.908)
Patched: 120560.037 ms (02:00.560)

Since visibility map pages can be cached in shared buffers or the OS
cache by the first scan, the benefit of this patch seems not to be large.

Here are outputs of VACUUM VERBOSE for 32TB table.

* HEAD
INFO:  vacuuming "public.vm_skip_test"
INFO:  "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
Skipped 4294967294 all-frozen pages according to visibility map.
0 pages are entirely empty.
CPU 1.06s/148.11u sec elapsed 149.20 sec.
VACUUM
Time: 151908.344 ms (02:31.908)

* Patched
INFO:  vacuuming "public.vm_skip_test"
INFO:  "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
Skipped 4294967294 all-frozen pages according to visibility map.
0 pages are entirely empty.
CPU 0.65s/117.15u sec elapsed 117.81 sec.
VACUUM
Time: 120560.037 ms (02:00.560)

Current manual vacuum doesn't output how many all-frozen pages we
skipped according to the visibility map.
That's why I attached the 0001 patch, which makes manual vacuum emit
such information.

>
> Can we produce a test that verifies the result patched/unpatched?
>

I've attached a test shell script, but because I don't have such a
large disk, I measured the performance benefit in a somewhat
unofficial way.

To simulate a situation where the table is extremely large with a
corresponding visibility map, I applied the 0002 patch and made a fake
visibility map.
The attached 0002 patch adds a GUC parameter, cheat_vacuum_table_size,
which artificially defines the size of the table being vacuumed.
For example, if we do
  SET cheat_vacuum_table_size = 4;
  VACUUM vm_test;
then in lazy_scan_heap the vm_test table is processed as an
8TB (MaxBlockNumber / 4) table.

The attached test shell script makes fake visibility map files and
runs the performance tests for the 1TB, 8TB and 32TB tables.
Please confirm it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From bbda8e4144101a41804865d88592a15bf0150340 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada 
Date: Thu, 8 Sep 2016 11:24:21 -0700
Subject: [PATCH 1/2] lazy_scan_heap outputs how many all_frozen pages are
 skipped.

---
 src/backend/commands/vacuumlazy.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index b5fb325..20d8334 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1331,6 +1331,10 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	"Skipped %u pages due to buffer pins.\n",
 	vacrelstats->pinskipped_pages),
 	 vacrelstats->pinskipped_pages);
+	appendStringInfo(&buf, ngettext("Skipped %u all-frozen page according to visibility map.\n",
+	"Skipped %u all-frozen pages according to visibility map.\n",
+	vacrelstats->frozenskipped_pages),
+	 vacrelstats->frozenskipped_pages);
 	appendStringInfo(&buf, ngettext("%u page is entirely empty.\n",
 	"%u pages are entirely empty.\n",
 	empty_pages),
-- 
2.8.1

From 01e493a44ba6c5bb9b59a1016bfce4b9bcf75e95 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada 
Date: Thu, 8 Sep 2016 11:36:28 -0700
Subject: [PATCH 2/2] Add cheat_vacuum_table_size.

---
 contrib/pg_visibility/pg_visibility.c |  6 +-
 src/backend/commands/vacuumlazy.c |  9 +++--
 src/backend/utils/misc/guc.c  | 10 ++
 src/include/commands/vacuum.h |  1 +
 4 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 7034066..37ace99 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -12,6 +12,7 @@
 #include "access/visibilitymap.h"
 #include "catalog/pg_type.h"
 #include "catalog/storage_xlog.h"
+#include "commands/vacuum.h"
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
@@ -242,7 +243,10 @@ pg_visibility_map_summary(PG_FUN

Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-07 Thread Robert Haas
On Tue, Sep 6, 2016 at 11:13 PM, Masahiko Sawada  wrote:
> I understood, thank you.
>
> I've measured the performance benefit of this patch by following steps.
> 1. Create very large table and set all-visible flag to all blocks.
> 2. Measure the execution time of vacuum that skips to scan all heap pages.
>
> * 1TB Table(visibility map size is 32MB)
> HEAD : 11012.274 ms (00:11.012)
> Patched : 6742.481 ms (00:06.742)
>
> * 8TB Table(visibility map size is 64MB)
> HEAD : 69886.605 ms (01:09.887)
> Patched : 35562.131 ms (00:35.562)
>
> * 32TB Table(visibility map size is 258MB)
> HEAD: 265901.096 ms (04:25.901)
> Patched: 131779.082 ms (02:11.779)
>
> Since current HEAD can scan the visibility map twice, the execution
> time with the patch is approximately half that of HEAD.
> But when the table is several hundred gigabytes, the performance
> benefit would not be large.

Wow, those are surprisingly good results.  Interesting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-07 Thread Simon Riggs
On 7 September 2016 at 04:13, Masahiko Sawada  wrote:

> Since current HEAD can scan the visibility map twice, the execution
> time with the patch is approximately half that of HEAD.

Sounds good.

To ensure we are doing exactly the same amount of work as before, did
you see the output of VACUUM VERBOSE?

Can we produce a test that verifies the result patched/unpatched?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-06 Thread Masahiko Sawada
On Wed, Sep 7, 2016 at 1:47 AM, Robert Haas  wrote:
> On Mon, Sep 5, 2016 at 8:57 PM, Masahiko Sawada  wrote:
>>> What performance difference does this make, in a realistic use case?
>>
>> I have yet to measure the performance effect, but it would be
>> effective for a very large frozen table.
>
> I think if you are proposing this patch because you think that the
> existing code won't perform well, you definitely need to submit some
> performance results showing that your approach is better.  If you
> can't show that your approach is practically better (rather than just
> theoretically better), then I'm not sure we should change anything.
> It's very easy to screw up the code in this area and we do not want to
> risk corrupting data for the sake of an optimization that isn't really
> needed in the first place.
>
> Of course, if you can prove that the change has a significant benefit,
> then it's definitely worth considering.
>

I understood, thank you.

I've measured the performance benefit of this patch by following steps.
1. Create very large table and set all-visible flag to all blocks.
2. Measure the execution time of vacuum that skips to scan all heap pages.

* 1TB Table(visibility map size is 32MB)
HEAD : 11012.274 ms (00:11.012)
Patched : 6742.481 ms (00:06.742)

* 8TB Table(visibility map size is 64MB)
HEAD : 69886.605 ms (01:09.887)
Patched : 35562.131 ms (00:35.562)

* 32TB Table(visibility map size is 258MB)
HEAD: 265901.096 ms (04:25.901)
Patched: 131779.082 ms (02:11.779)

Since current HEAD can scan the visibility map twice, the execution
time with the patch is approximately half that of HEAD.
But when the table is several hundred gigabytes, the performance
benefit would not be large.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-06 Thread Robert Haas
On Mon, Sep 5, 2016 at 8:57 PM, Masahiko Sawada  wrote:
>> What performance difference does this make, in a realistic use case?
>
> I have yet to measure the performance effect, but it would be
> effective for a very large frozen table.

I think if you are proposing this patch because you think that the
existing code won't perform well, you definitely need to submit some
performance results showing that your approach is better.  If you
can't show that your approach is practically better (rather than just
theoretically better), then I'm not sure we should change anything.
It's very easy to screw up the code in this area and we do not want to
risk corrupting data for the sake of an optimization that isn't really
needed in the first place.

Of course, if you can prove that the change has a significant benefit,
then it's definitely worth considering.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-05 Thread Masahiko Sawada
On Mon, Sep 5, 2016 at 6:27 PM, Simon Riggs  wrote:
> On 4 August 2016 at 05:57, Masahiko Sawada  wrote:
>
>> While reviewing the freeze map code, Andres pointed out[1] that
>> lazy_scan_heap can access the visibility map twice and its logic
>> seems a bit tricky.
>> As discussed before, that's not nice, especially when a large
>> relation is entirely frozen.
>>
>> I posted the patch for that before but since this is an optimization,
>> not bug fix, I'd like to propose it again.
>> Please give me feedback.
>>
>> [1] 
>> https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de
>
> If we have a freeze map now, surely tables will no longer be entirely frozen?

Well, if the table is completely frozen, freezing is skipped for all pages.

> What performance difference does this make, in a realistic use case?

I have yet to measure the performance effect, but it would be
effective for a very large frozen table.

> How would we test this to check it is exactly correct?

One possible idea is to emit the number of pages skipped according to
the visibility map as a VACUUM VERBOSE message, and check it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-05 Thread Simon Riggs
On 4 August 2016 at 05:57, Masahiko Sawada  wrote:

> While reviewing the freeze map code, Andres pointed out[1] that
> lazy_scan_heap can access the visibility map twice and its logic
> seems a bit tricky.
> As discussed before, that's not nice, especially when a large
> relation is entirely frozen.
>
> I posted the patch for that before but since this is an optimization,
> not bug fix, I'd like to propose it again.
> Please give me feedback.
>
> [1] 
> https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de

If we have a freeze map now, surely tables will no longer be entirely frozen?

What performance difference does this make, in a realistic use case?

How would we test this to check it is exactly correct?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Optimization for lazy_scan_heap

2016-08-24 Thread Masahiko Sawada
On Thu, Aug 25, 2016 at 1:41 AM, Anastasia Lubennikova
 wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:            not tested
>
> Hi,
> I haven't tested the performance yet, but the patch itself looks pretty
> good and is a reasonable improvement.
> I have a question about removing the comment. It seems to be a really
> tricky moment. How do we know that an all-frozen block hasn't changed
> since the moment we checked it?

I think we don't need to check whether an all-frozen block has changed
since we checked it.
Even if the all-frozen block has changed after we saw it as an
all-frozen page, we can still update relfrozenxid, because any new
XIDs added to that page are newer than the GlobalXmin we computed.

Am I missing something?

In this patch, since we count the number of all-frozen pages even
during a non-aggressive scan, I thought that we don't need to check
whether these blocks were all-frozen pages.

> I'm going to test the performance this week.
> I wonder if you could send a test script or describe the steps to test it?

This optimization reduces the number of visibility map scans when the
table is almost all-visible or all-frozen.
So it would be effective for a very large table (more than several
hundred GB) that is almost entirely all-visible or all-frozen.

For example,
1. Create a very large table.
2. Execute VACUUM FREEZE to freeze all pages of the table.
3. Measure the execution time of VACUUM FREEZE.
I hope that the second VACUUM FREEZE becomes a little faster.

I modified the comment, and attached v2 patch.

Regards,

--
Masahiko Sawada
diff --git a/src/backend/commands/vacuumlazy.c 
b/src/backend/commands/vacuumlazy.c
index 231e92d..226dc83 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -471,6 +471,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats 
*vacrelstats,
PROGRESS_VACUUM_MAX_DEAD_TUPLES
};
int64   initprog_val[3];
+   BlockNumber n_skipped;
+   BlockNumber n_all_frozen;
 
pg_rusage_init(&ru0);
 
@@ -501,6 +503,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats 
*vacrelstats,
initprog_val[2] = vacrelstats->max_dead_tuples;
pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+   n_skipped = 0;
+   n_all_frozen = 0;
+
/*
 * Except when aggressive is set, we want to skip pages that are
 * all-visible according to the visibility map, but only when we can 
skip
@@ -558,14 +563,20 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats 
*vacrelstats,
{
if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
break;
+   n_all_frozen++;
}
else
{
if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
break;
+
+   /* Count the number of all-frozen pages */
+   if (vmstatus & VISIBILITYMAP_ALL_FROZEN)
+   n_all_frozen++;
}
vacuum_delay_point();
next_unskippable_block++;
+   n_skipped++;
}
}
 
@@ -574,7 +585,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats 
*vacrelstats,
else
skipping_blocks = false;
 
-   for (blkno = 0; blkno < nblocks; blkno++)
+   blkno = 0;
+   while (blkno < nblocks)
{
Buffer  buf;
Pagepage;
@@ -592,18 +604,22 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats 
*vacrelstats,
TransactionId visibility_cutoff_xid = InvalidTransactionId;
 
/* see note above about forcing scanning of last page */
-#define FORCE_CHECK_PAGE() \
-   (blkno == nblocks - 1 && should_attempt_truncation(vacrelstats))
+#define FORCE_CHECK_PAGE(blk) \
+   ((blk) == nblocks - 1 && should_attempt_truncation(vacrelstats))
 
pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, 
blkno);
 
if (blkno == next_unskippable_block)
{
+   n_skipped = 0;
+   n_all_frozen = 0;
+
/* Time to advance next_unskippable_block */
next_unskippable_block++;
if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
{
-   while (next_unskippable_block < nblocks)
+   while (next_unskippable_block < nblocks &&
+  

Re: [HACKERS] Optimization for lazy_scan_heap

2016-08-24 Thread Anastasia Lubennikova
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

Hi,
I haven't tested the performance yet, but the patch itself looks pretty
good and is a reasonable improvement.
I have a question about removing the comment. It seems to be a really
tricky moment. How do we know that an all-frozen block hasn't changed
since the moment we checked it?

-                * Tricky, tricky.  If this is in aggressive vacuum, the page
-                * must have been all-frozen at the time we checked whether it
-                * was skippable, but it might not be any more.  We must be
-                * careful to count it as a skipped all-frozen page in that
-                * case, or else we'll think we can't update relfrozenxid and
-                * relminmxid.  If it's not an aggressive vacuum, we don't
-                * know whether it was all-frozen, so we have to recheck; but
-                * in this case an approximate answer is OK.
+                * We know that there are n_skipped pages by the visibilitymap scan we
+                * did just before.
                 */

I'm going to test the performance this week.
I wonder if you could send a test script or describe the steps to test it?

The new status of this patch is: Waiting on Author



[HACKERS] Optimization for lazy_scan_heap

2016-08-03 Thread Masahiko Sawada
Hi all,

While reviewing the freeze map code, Andres pointed out[1] that
lazy_scan_heap can access the visibility map twice and its logic
seems a bit tricky.
As discussed before, that's not nice, especially when a large
relation is entirely frozen.

I posted the patch for that before but since this is an optimization,
not bug fix, I'd like to propose it again.
Please give me feedback.

[1] 
https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de

Regards,

--
Masahiko Sawada
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 231e92d..e7cdd2c 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -471,6 +471,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		PROGRESS_VACUUM_MAX_DEAD_TUPLES
 	};
 	int64		initprog_val[3];
+	BlockNumber	n_skipped;
+	BlockNumber n_all_frozen;
 
 	pg_rusage_init(&ru0);
 
@@ -501,6 +503,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	initprog_val[2] = vacrelstats->max_dead_tuples;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	n_skipped = 0;
+	n_all_frozen = 0;
+
 	/*
 	 * Except when aggressive is set, we want to skip pages that are
 	 * all-visible according to the visibility map, but only when we can skip
@@ -558,14 +563,20 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			{
 if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
 	break;
+n_all_frozen++;
 			}
 			else
 			{
 if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
 	break;
+
+/* Count the number of all-frozen pages */
+if (vmstatus & VISIBILITYMAP_ALL_FROZEN)
+	n_all_frozen++;
 			}
 			vacuum_delay_point();
 			next_unskippable_block++;
+			n_skipped++;
 		}
 	}
 
@@ -574,7 +585,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	else
 		skipping_blocks = false;
 
-	for (blkno = 0; blkno < nblocks; blkno++)
+	blkno = 0;
+	while (blkno < nblocks)
 	{
 		Buffer		buf;
 		Page		page;
@@ -592,18 +604,22 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		TransactionId visibility_cutoff_xid = InvalidTransactionId;
 
 		/* see note above about forcing scanning of last page */
-#define FORCE_CHECK_PAGE() \
-		(blkno == nblocks - 1 && should_attempt_truncation(vacrelstats))
+#define FORCE_CHECK_PAGE(blk) \
+		((blk) == nblocks - 1 && should_attempt_truncation(vacrelstats))
 
 		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
 		if (blkno == next_unskippable_block)
 		{
+			n_skipped = 0;
+			n_all_frozen = 0;
+
 			/* Time to advance next_unskippable_block */
 			next_unskippable_block++;
 			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
-while (next_unskippable_block < nblocks)
+while (next_unskippable_block < nblocks &&
+	   !FORCE_CHECK_PAGE(next_unskippable_block))
 {
 	uint8		vmskipflags;
 
@@ -614,14 +630,22 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	{
 		if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
 			break;
+
+		/* Count the number of all-frozen pages */
+		n_all_frozen++;
 	}
 	else
 	{
 		if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
 			break;
+
+		/* Count the number of all-frozen pages */
+		if (vmskipflags & VISIBILITYMAP_ALL_FROZEN)
+			n_all_frozen++;
 	}
 	vacuum_delay_point();
 	next_unskippable_block++;
+	n_skipped++;
 }
 			}
 
@@ -646,26 +670,23 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 		else
 		{
 			/*
-			 * The current block is potentially skippable; if we've seen a
-			 * long enough run of skippable blocks to justify skipping it, and
-			 * we're not forced to check it, then go ahead and skip.
-			 * Otherwise, the page must be at least all-visible if not
-			 * all-frozen, so we can set all_visible_according_to_vm = true.
+			 * The current block is the first block of continuous skippable blocks,
+			 * and we know that how many blocks we can skip to scan. If we've
+			 * seen a long enough run of skippable blocks to justify skipping it,
+			 * then go ahead and skip. Otherwise, the page must be at least all-visible
+			 * if not all-frozen, so we can set all_visible_according_to_vm = true.
 			 */
-			if (skipping_blocks && !FORCE_CHECK_PAGE())
+			if (skipping_blocks)
 			{
 /*
- * Tricky, tricky.  If this is in aggressive vacuum, the page
- * must have been all-frozen at the time we checked whether it
- * was skippable, but it might not be any more.  We must be
- * careful to count it as a skipped all-frozen page in that
- * case, or else we'll think we can't update relfrozenxid and
- * relminmxid.  If it's not an aggressive vacuum, we don't
- * know whether it was all-frozen, so we have to recheck; but
- * in this case an approximate answer is OK.
+ * We know that there