Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Vik Fearing
On 17/05/16 21:32, Alvaro Herrera wrote:
> Is SCAN_ALL really the best we can do here?  The business of having an
> underscore in an option name has no precedent (other than
> CURRENT_DATABASE and the like).

ALTER DATABASE has options for ALLOW_CONNECTIONS, CONNECTION_LIMIT, and
IS_TEMPLATE.

> How about COMPLETE, TOTAL, or WHOLE?

Sure, I'll play this game.  I like EXHAUSTIVE.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Masahiko Sawada
On Tue, May 17, 2016 at 4:34 PM, Joshua D. Drake  wrote:
> On 05/17/2016 12:32 PM, Alvaro Herrera wrote:
>
>>   Syntaxes are;
>>   VACUUM (SCAN_ALL) table_name;
>>   VACUUM (SCAN_ALL);  -- for all tables on database
>>
>> Is SCAN_ALL really the best we can do here?  The business of having an
>> underscore in an option name has no precedent (other than
>> CURRENT_DATABASE and the like).  How about COMPLETE, TOTAL, or WHOLE?
>>
>
> VACUUM (ANALYZE, VERBOSE, WHOLE)
> 
>
> That seems reasonable? I agree that SCAN_ALL doesn't fit. I am not trying to
> pull a left turn but is there a technical reason we don't just make FULL do
> this?
>

FULL option requires AccessExclusiveLock, which could be a problem.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Joshua D. Drake

On 05/17/2016 12:32 PM, Alvaro Herrera wrote:


  Syntaxes are;
  VACUUM (SCAN_ALL) table_name;
  VACUUM (SCAN_ALL);  -- for all tables on database

Is SCAN_ALL really the best we can do here?  The business of having an
underscore in an option name has no precedent (other than
CURRENT_DATABASE and the like).  How about COMPLETE, TOTAL, or WHOLE?



VACUUM (ANALYZE, VERBOSE, WHOLE)


That seems reasonable? I agree that SCAN_ALL doesn't fit. I am not 
trying to pull a left turn but is there a technical reason we don't just 
make FULL do this?


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Masahiko Sawada
On Tue, May 17, 2016 at 3:32 PM, Alvaro Herrera
 wrote:
> Masahiko Sawada wrote:
>> On Mon, May 16, 2016 at 10:49 AM, Robert Haas  wrote:
>
>> > We should support scan_all only with the new-style options syntax for
>> > VACUUM; that is, vacuum (scan_all) rename.  That doesn't require
>> > making scan_all a keyword, which is good: this is a minor feature, and
>> > we don't want to bloat the parsing tables for it.
>>
>> I agree with having new-style options syntax.
>> Isn't it better to have SCAN_ALL option without parentheses?
>>
>> Syntaxes are;
>> VACUUM SCAN_ALL table_name;
>> VACUUM SCAN_ALL;  -- for all tables on database
>
> No, I agree with Robert that we shouldn't add any more such options to
> avoid keyword proliferation.
>
>  Syntaxes are;
>  VACUUM (SCAN_ALL) table_name;
>  VACUUM (SCAN_ALL);  -- for all tables on database

Okay, I agree with this.

> Is SCAN_ALL really the best we can do here?  The business of having an
> underscore in an option name has no precedent (other than
> CURRENT_DATABASE and the like).

Another way is having tool or function that removes _vm file safely for example.

> How about COMPLETE, TOTAL, or WHOLE?

IMHO, I don't have strong opinion about SCAN_ALL as long as we have
document about that option and option name doesn't confuse users.
But ISTM that COMPLETE, TOTAL might make users mislead normal vacuum
as it doesn't do that completely.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Alvaro Herrera
Masahiko Sawada wrote:
> On Mon, May 16, 2016 at 10:49 AM, Robert Haas  wrote:

> > We should support scan_all only with the new-style options syntax for
> > VACUUM; that is, vacuum (scan_all) rename.  That doesn't require
> > making scan_all a keyword, which is good: this is a minor feature, and
> > we don't want to bloat the parsing tables for it.
> 
> I agree with having new-style options syntax.
> Isn't it better to have SCAN_ALL option without parentheses?
> 
> Syntaxes are;
> VACUUM SCAN_ALL table_name;
> VACUUM SCAN_ALL;  -- for all tables on database

No, I agree with Robert that we shouldn't add any more such options to
avoid keyword proliferation.

 Syntaxes are;
 VACUUM (SCAN_ALL) table_name;
 VACUUM (SCAN_ALL);  -- for all tables on database

Is SCAN_ALL really the best we can do here?  The business of having an
underscore in an option name has no precedent (other than
CURRENT_DATABASE and the like).  How about COMPLETE, TOTAL, or WHOLE?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-17 Thread Masahiko Sawada
On Mon, May 16, 2016 at 10:49 AM, Robert Haas  wrote:
> On Tue, May 10, 2016 at 10:40 PM, Masahiko Sawada  
> wrote:
>> Or second way I came up with is having tool to remove particular _vm
>> file safely, which is executed via SQL or client tool like
>> pg_resetxlog.
>>
>> Attached updated VACUUM SCAN_ALL patch.
>> Please find it.
>
> We should support scan_all only with the new-style options syntax for
> VACUUM; that is, vacuum (scan_all) rename.  That doesn't require
> making scan_all a keyword, which is good: this is a minor feature, and
> we don't want to bloat the parsing tables for it.
>

I agree with having new-style options syntax.
Isn't it better to have SCAN_ALL option without parentheses?

Syntaxes are;
VACUUM SCAN_ALL table_name;
VACUUM SCAN_ALL;  -- for all tables on database

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-16 Thread Robert Haas
On Tue, May 10, 2016 at 10:40 PM, Masahiko Sawada  wrote:
> Or second way I came up with is having tool to remove particular _vm
> file safely, which is executed via SQL or client tool like
> pg_resetxlog.
>
> Attached updated VACUUM SCAN_ALL patch.
> Please find it.

We should support scan_all only with the new-style options syntax for
VACUUM; that is, vacuum (scan_all) rename.  That doesn't require
making scan_all a keyword, which is good: this is a minor feature, and
we don't want to bloat the parsing tables for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Jim Nasby

On 5/10/16 11:42 PM, Jim Nasby wrote:

On 5/6/16 4:55 PM, Peter Geoghegan wrote:

On Fri, May 6, 2016 at 2:49 PM, Andres Freund  wrote:

Jeff Janes has done astounding work in these matters.  (I don't think
we credit him enough for that.)


+many.


Agreed. I'm a huge fan of what Jeff has been able to do in this area.
I often say so. It would be even better if Jeff's approach to testing
was followed as an example by other people, but I wouldn't bet on it
ever happening. It requires real persistence and deep understanding to
do well.


It takes deep understanding to *design* the tests, not to write them.
There's a lot of folks out there that will never understand enough to
design tests meant to expose data corruption but who could easily code
someone else's design, especially if we provided tools/ways to tweak a
cluster to make testing easier/faster (such as artificially advancing
XID/MXID).


Speaking of which, another email in the thread made me realize that 
there's a test condition no one has mentioned: verifying we don't lose 
tuples after wraparound.


To test this, you'd want a table that's mostly frozen. Ideally, dirty a 
single tuple on a bunch of frozen pages, with committed updates, 
deletes, and un-committed inserts. Advance XID far enough to get you 
close to wrap-around. Do a vacuum, SELECT count(*), advance XID past 
wraparound, SELECT count(*) again and you should get the same number.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Jim Nasby

On 5/6/16 4:08 PM, Joshua D. Drake wrote:


VACUUM THEWHOLEDAMNTHING


+100

(hahahaha)


You know what? Why not? Seriously? We aren't product. This is supposed
to be a bit fun. Let's have some fun with it? It would be so easy to
turn that into a positive advocacy opportunity.


Honestly, for an option this obscure, I agree. I don't think we'd want 
any normally used stuff named so glibly, but I sure as heck could have 
used some easter-eggs like this when I was doing training.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Jim Nasby

On 5/6/16 4:55 PM, Peter Geoghegan wrote:

On Fri, May 6, 2016 at 2:49 PM, Andres Freund  wrote:

Jeff Janes has done astounding work in these matters.  (I don't think
we credit him enough for that.)


+many.


Agreed. I'm a huge fan of what Jeff has been able to do in this area.
I often say so. It would be even better if Jeff's approach to testing
was followed as an example by other people, but I wouldn't bet on it
ever happening. It requires real persistence and deep understanding to
do well.


It takes deep understanding to *design* the tests, not to write them. 
There's a lot of folks out there that will never understand enough to 
design tests meant to expose data corruption but who could easily code 
someone else's design, especially if we provided tools/ways to tweak a 
cluster to make testing easier/faster (such as artificially advancing 
XID/MXID).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Jim Nasby

On 5/6/16 4:20 PM, Andres Freund wrote:

On 2016-05-06 14:15:47 -0700, Josh berkus wrote:

For the serious testing, does anyone have a good technique for creating
loads which would stress-test vacuum freezing?  It's hard for me to come
up with anything which wouldn't be very time-and-resource intensive
(like running at 10,000 TPS for a week).


I've changed the limits for freezing options a while back, so you can
now set autovacuum_freeze_max as low as 10 (best set
vacuum_freeze_table_age accordingly).  You'll have to come up with a
workload that doesn't overwrite all data continuously (otherwise
there'll never be old rows), but otherwise it should now be fairly easy
to test that kind of scenario.


There's also been a tool for forcibly advancing XID floating around for 
quite some time. Using that could have the added benefit of verifying 
anti-wrap still works correctly. (Might be worth testing mxid wrap too...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Masahiko Sawada
On Tue, May 10, 2016 at 11:30 PM, Robert Haas  wrote:
> On Mon, May 9, 2016 at 7:40 PM, Ants Aasma  wrote:
>> On Mon, May 9, 2016 at 10:53 PM, Robert Haas  wrote:
>>> On Sun, May 8, 2016 at 10:42 PM, Masahiko Sawada  
>>> wrote:
 Attached draft patch adds SCANALL option to VACUUM in order to scan
 all pages forcibly while ignoring visibility map information.
 The option name is SCANALL for now but we could change it after got 
 consensus.
>>>
>>> If we're going to go that way, I'd say it should be scan_all rather
>>> than scanall.  Makes it clearer, at least IMHO.
>>
>> Just to add some diversity to opinions, maybe there should be a
>> separate command for performing integrity checks. Currently the best
>> ways to actually verify database correctness do so as a side effect.
>> The question that I get pretty much every time after I explain why we
>> have data checksums, is "how do I check that they are correct" and we
>> don't have a nice answer for that now. We could also use some ways to
>> sniff out corrupted rows that don't involve crashing the server in a
>> loop. Vacuuming pages that supposedly don't need vacuuming just to
>> verify integrity seems very much in the same vein.
>>
>> I know right now isn't exactly the best time to hastily slap on such a
>> feature, but I just wanted the thought to be out there for
>> consideration.
>
> I think that it's quite reasonable to have ways of performing an
> integrity check that are separate from VACUUM, but this is about
> having a way to force VACUUM to scan all-frozen pages

Or second way I came up with is having tool to remove particular _vm
file safely, which is executed via SQL or client tool like
pg_resetxlog.

Attached updated VACUUM SCAN_ALL patch.
Please find it.

Regards,

--
Masahiko Sawada
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 19fd748..8f63fad 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -21,9 +21,9 @@ PostgreSQL documentation
 
  
 
-VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ table_name [ (column_name [, ...] ) ] ]
-VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name ]
-VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table_name [ (column_name [, ...] ) ] ]
+VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE | SCAN_ALL } [, ...] ) ] [ table_name [ (column_name [, ...] ) ] ]
+VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ SCAN_ALL ] [ table_name ]
+VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ SCAN_ALL ] ANALYZE [ table_name [ (column_name [, ...] ) ] ]
 
  
 
@@ -120,6 +120,17 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ 

 

+SCAN_ALL
+
+ 
+  Selects forcibly full page scanning vacuum while ignoring visibility map.
+  Forcibly full page scanning vacuum is always performed when the table is
+  rewritten so this option is redundant when FULL is specified.
+ 
+
+   
+   
+   
 ANALYZE
 
  
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 426e756..eee93c4 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -138,7 +138,7 @@ static BufferAccessStrategy vac_strategy;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive);
+			   Relation *Irel, int nindexes, bool aggressive, bool scan_all);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
@@ -185,6 +185,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	double		read_rate,
 write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
+	bool		scan_all;		/* should we scan all pages forcibly? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
@@ -233,6 +234,9 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 			  mxactFullScanLimit);
 
+	/* If SCAN_ALL option is specified, we have to scan all pages forcibly */
+	scan_all = options & VACOPT_SCANALL;
+
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
 
 	vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
@@ -246,14 +250,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive, scan_all);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
 
 	/*
-	 * Compute whether we actually scanned the whole relation. If we did, we
-	 

Re: [HACKERS] Reviewing freeze map code

2016-05-10 Thread Robert Haas
On Mon, May 9, 2016 at 7:40 PM, Ants Aasma  wrote:
> On Mon, May 9, 2016 at 10:53 PM, Robert Haas  wrote:
>> On Sun, May 8, 2016 at 10:42 PM, Masahiko Sawada  
>> wrote:
>>> Attached draft patch adds SCANALL option to VACUUM in order to scan
>>> all pages forcibly while ignoring visibility map information.
>>> The option name is SCANALL for now but we could change it after got 
>>> consensus.
>>
>> If we're going to go that way, I'd say it should be scan_all rather
>> than scanall.  Makes it clearer, at least IMHO.
>
> Just to add some diversity to opinions, maybe there should be a
> separate command for performing integrity checks. Currently the best
> ways to actually verify database correctness do so as a side effect.
> The question that I get pretty much every time after I explain why we
> have data checksums, is "how do I check that they are correct" and we
> don't have a nice answer for that now. We could also use some ways to
> sniff out corrupted rows that don't involve crashing the server in a
> loop. Vacuuming pages that supposedly don't need vacuuming just to
> verify integrity seems very much in the same vein.
>
> I know right now isn't exactly the best time to hastily slap on such a
> feature, but I just wanted the thought to be out there for
> consideration.

I think that it's quite reasonable to have ways of performing an
integrity check that are separate from VACUUM, but this is about
having a way to force VACUUM to scan all-frozen pages - and it's hard
to imagine that we want a different command name for that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-09 Thread Ants Aasma
On Mon, May 9, 2016 at 10:53 PM, Robert Haas  wrote:
> On Sun, May 8, 2016 at 10:42 PM, Masahiko Sawada  
> wrote:
>> Attached draft patch adds SCANALL option to VACUUM in order to scan
>> all pages forcibly while ignoring visibility map information.
>> The option name is SCANALL for now but we could change it after got 
>> consensus.
>
> If we're going to go that way, I'd say it should be scan_all rather
> than scanall.  Makes it clearer, at least IMHO.

Just to add some diversity to opinions, maybe there should be a
separate command for performing integrity checks. Currently the best
ways to actually verify database correctness do so as a side effect.
The question that I get pretty much every time after I explain why we
have data checksums, is "how do I check that they are correct" and we
don't have a nice answer for that now. We could also use some ways to
sniff out corrupted rows that don't involve crashing the server in a
loop. Vacuuming pages that supposedly don't need vacuuming just to
verify integrity seems very much in the same vein.

I know right now isn't exactly the best time to hastily slap on such a
feature, but I just wanted the thought to be out there for
consideration.

Regards,
Ants Aasma


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-09 Thread Robert Haas
On Sun, May 8, 2016 at 10:42 PM, Masahiko Sawada  wrote:
> Attached draft patch adds SCANALL option to VACUUM in order to scan
> all pages forcibly while ignoring visibility map information.
> The option name is SCANALL for now but we could change it after got consensus.

If we're going to go that way, I'd say it should be scan_all rather
than scanall.  Makes it clearer, at least IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-08 Thread Masahiko Sawada
On Tue, May 3, 2016 at 6:48 AM, Andres Freund  wrote:
> fd31cd2 Don't vacuum all-frozen pages.

-   appendStringInfo(, _("pages: %u removed,
%u remain, %u skipped due to pins\n"),
+   appendStringInfo(, _("pages: %u removed,
%u remain, %u skipped due to pins, %u skipped frozen\n"),

vacrelstats->pages_removed,
 vacrelstats->rel_pages,
-
vacrelstats->pinskipped_pages);
+
vacrelstats->pinskipped_pages,
+
vacrelstats->frozenskipped_pages);

The verbose information about skipping frozen page is emitted by only
autovacuum.
But I think that this information is also helpful for manual vacuum.

Please find attached patch which fixes that.

Regards,

--
Masahiko Sawada
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 426e756..fa6e5fa 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1316,6 +1316,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	"Skipped %u pages due to buffer pins.\n",
 	vacrelstats->pinskipped_pages),
 	 vacrelstats->pinskipped_pages);
+	appendStringInfo(, _("Skipped %u frozen pages.\n"),
+	 vacrelstats->frozenskipped_pages);
 	appendStringInfo(, ngettext("%u page is entirely empty.\n",
 	"%u pages are entirely empty.\n",
 	empty_pages),

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-08 Thread Masahiko Sawada
On Sun, May 8, 2016 at 3:18 PM, Masahiko Sawada  wrote:
> On Sat, May 7, 2016 at 11:08 PM, Masahiko Sawada  
> wrote:
>> On Sat, May 7, 2016 at 6:00 AM, Joshua D. Drake  
>> wrote:
>>> On 05/06/2016 01:58 PM, Stephen Frost wrote:

 * Joshua D. Drake (j...@commandprompt.com) wrote:
>
> Yeah I thought about that, it is the word "FORCE" that bothers me.
> When you use FORCE there is an assumption that no matter what, it
> plows through (think rm -f). So if we don't use FROZEN, that's cool
> but FORCE doesn't work either.


 Isn't that exactly what this FORCE option being contemplated would do
 though?  Plow through the entire relation, regardless of what the VM
 says is all frozen or not?

 Seems like FORCE is a good word for that to me.
>>>
>>>
>>> Except that we aren't FORCING a vacuum. That is the part I have contention
>>> with. To me, FORCE means:
>>>
>>> No matter what else is happening, we are vacuuming this relation (think
>>> locks).
>>>
>>> But I am also not going to dig in my heals. If that is truly what -hackers
>>> come up with, thank you at least considering what I said.
>>>
>>> Sincerely,
>>>
>>> JD
>>>
>>
>> As Joshua mentioned, FORCE word might imply doing VACUUM while plowing
>> through locks.
>> I guess that it might confuse the users.
>> IMO, since this option will be a way for emergency, SCANALL word works for 
>> me.
>>
>> Or other ideas are,
>> VACUUM IGNOREVM
>> VACUUM RESCURE
>>
>
> Oops, VACUUM RESCUE is correct.
>

Attached draft patch adds SCANALL option to VACUUM in order to scan
all pages forcibly while ignoring visibility map information.
The option name is SCANALL for now but we could change it after got consensus.

Regards,

--
Masahiko Sawada
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 19fd748..130a70e 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -21,9 +21,9 @@ PostgreSQL documentation
 
  
 
-VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ table_name [ (column_name [, ...] ) ] ]
-VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name ]
-VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table_name [ (column_name [, ...] ) ] ]
+VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE | SCANALL } [, ...] ) ] [ table_name [ (column_name [, ...] ) ] ]
+VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ SCANALL ] [ table_name ]
+VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ SCANALL ] ANALYZE [ table_name [ (column_name [, ...] ) ] ]
 
  
 
@@ -120,6 +120,17 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ 

 

+SCANALL
+
+ 
+  Selects forcibly full page scanning vacuum while ignoring visibility map.
+  Forcibly full page scanning vacuum is always performed when the table is
+  rewritten so this option is redundant when FULL is specified.
+ 
+
+   
+   
+   
 ANALYZE
 
  
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 426e756..85e04ac 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -138,7 +138,7 @@ static BufferAccessStrategy vac_strategy;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive);
+			   Relation *Irel, int nindexes, bool aggressive, bool scan_all);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
@@ -185,6 +185,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	double		read_rate,
 write_rate;
 	bool		aggressive;		/* should we scan all unfrozen pages? */
+	bool		scan_all;		/* should we scan all pages forcibly? */
 	bool		scanned_all_unfrozen;	/* actually scanned all such pages? */
 	TransactionId xidFullScanLimit;
 	MultiXactId mxactFullScanLimit;
@@ -233,6 +234,9 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 			  mxactFullScanLimit);
 
+	/* If SCANALL option is specified, we have to scan all pages forcibly */
+	scan_all = options & VACOPT_SCANALL;
+
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
 
 	vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
@@ -246,14 +250,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive, scan_all);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
 
 	/*
-	 * Compute whether we actually scanned the whole relation. If we did, we
-	 * can adjust relfrozenxid and relminmxid.
+	 * Compute whether 

Re: [HACKERS] Reviewing freeze map code

2016-05-08 Thread Masahiko Sawada
On Sat, May 7, 2016 at 11:08 PM, Masahiko Sawada  wrote:
> On Sat, May 7, 2016 at 6:00 AM, Joshua D. Drake  
> wrote:
>> On 05/06/2016 01:58 PM, Stephen Frost wrote:
>>>
>>> * Joshua D. Drake (j...@commandprompt.com) wrote:

 Yeah I thought about that, it is the word "FORCE" that bothers me.
 When you use FORCE there is an assumption that no matter what, it
 plows through (think rm -f). So if we don't use FROZEN, that's cool
 but FORCE doesn't work either.
>>>
>>>
>>> Isn't that exactly what this FORCE option being contemplated would do
>>> though?  Plow through the entire relation, regardless of what the VM
>>> says is all frozen or not?
>>>
>>> Seems like FORCE is a good word for that to me.
>>
>>
>> Except that we aren't FORCING a vacuum. That is the part I have contention
>> with. To me, FORCE means:
>>
>> No matter what else is happening, we are vacuuming this relation (think
>> locks).
>>
>> But I am also not going to dig in my heals. If that is truly what -hackers
>> come up with, thank you at least considering what I said.
>>
>> Sincerely,
>>
>> JD
>>
>
> As Joshua mentioned, FORCE word might imply doing VACUUM while plowing
> through locks.
> I guess that it might confuse the users.
> IMO, since this option will be a way for emergency, SCANALL word works for me.
>
> Or other ideas are,
> VACUUM IGNOREVM
> VACUUM RESCURE
>

Oops, VACUUM RESCUE is correct.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-07 Thread Masahiko Sawada
On Sat, May 7, 2016 at 6:00 AM, Joshua D. Drake  wrote:
> On 05/06/2016 01:58 PM, Stephen Frost wrote:
>>
>> * Joshua D. Drake (j...@commandprompt.com) wrote:
>>>
>>> Yeah I thought about that, it is the word "FORCE" that bothers me.
>>> When you use FORCE there is an assumption that no matter what, it
>>> plows through (think rm -f). So if we don't use FROZEN, that's cool
>>> but FORCE doesn't work either.
>>
>>
>> Isn't that exactly what this FORCE option being contemplated would do
>> though?  Plow through the entire relation, regardless of what the VM
>> says is all frozen or not?
>>
>> Seems like FORCE is a good word for that to me.
>
>
> Except that we aren't FORCING a vacuum. That is the part I have contention
> with. To me, FORCE means:
>
> No matter what else is happening, we are vacuuming this relation (think
> locks).
>
> But I am also not going to dig in my heals. If that is truly what -hackers
> come up with, thank you at least considering what I said.
>
> Sincerely,
>
> JD
>

As Joshua mentioned, FORCE word might imply doing VACUUM while plowing
through locks.
I guess that it might confuse the users.
IMO, since this option will be a way for emergency, SCANALL word works for me.

Or other ideas are,
VACUUM IGNOREVM
VACUUM RESCURE

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Alvaro Herrera
Alvaro Herrera wrote:

> We touched this question in connection with multixact freezing and
> wraparound.  Testers seem to want to be given a script that they can
> install and run, then go for a beer and get back to a bunch of errors to
> report.

Here I spent some time trying to explain what to test to try and find
certain multixact bugs
http://www.postgresql.org/message-id/20150605213832.gz133...@postgresql.org

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-07 10:00:27 +1200, Thomas Munro wrote:
> On Sat, May 7, 2016 at 8:34 AM, Robert Haas  wrote:
> >> Did somebody verify the new contents are correct?
> >
> > I admit that I didn't.  It seemed like an unlikely place for a goof,
> > but I guess we should verify.
> 
> Looks correct.  The tables match the output of the attached script.

Great!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Thomas Munro
On Sat, May 7, 2016 at 8:34 AM, Robert Haas  wrote:
> On Mon, May 2, 2016 at 8:25 PM, Andres Freund  wrote:
>> +static const uint8 number_of_ones_for_visible[256] = {
>> ...
>> +};
>> +static const uint8 number_of_ones_for_frozen[256] = {
>> ...
>>  };
>>
>> Did somebody verify the new contents are correct?
>
> I admit that I didn't.  It seemed like an unlikely place for a goof,
> but I guess we should verify.

Looks correct.  The tables match the output of the attached script.

-- 
Thomas Munro
http://www.enterprisedb.com
def popcount(value):
  return bin(value).count("1")

def make_popcount_table(mask):
  for high_nibble in range(0, 16):
line = "\t"
for low_nibble in range(0, 16):
  value = (high_nibble << 4) | low_nibble
  line += str(popcount(value & mask))
  if low_nibble != 15:
line += ", "
  elif high_nibble != 15:
line += ","
print line

if __name__ == "__main__":
  print "static const uint8 number_of_ones_for_visible[256] = {"
  make_popcount_table(0b01010101) # for each bit pair, the lower bit
  print "};"
  print "static const uint8 number_of_ones_for_frozen[256] = {"
  make_popcount_table(0b10101010) # for each bit pair, the higher bit
  print "};"

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Peter Geoghegan
On Fri, May 6, 2016 at 2:49 PM, Andres Freund  wrote:
>> Jeff Janes has done astounding work in these matters.  (I don't think
>> we credit him enough for that.)
>
> +many.

Agreed. I'm a huge fan of what Jeff has been able to do in this area.
I often say so. It would be even better if Jeff's approach to testing
was followed as an example by other people, but I wouldn't bet on it
ever happening. It requires real persistence and deep understanding to
do well.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 18:31:03 -0300, Alvaro Herrera wrote:
> I don't know what happens when the freeze_table_age threshold is
> reached.

We scan all non-frozen pages, whereas we earlier had to scan all pages.

That's really both the significant benefit, and the danger. Because if
we screw up the all-frozen bits in the visibilitymap, we'll be screwed
soon after.

> Do we scan the whole table when that happens?

No, there's atm no way to force a whole-table vacuum, besides manually
rm'ing the _vm fork.


> Another question on this feature is what happens with the table age
> (relfrozenxid, relminmxid) when the table is not wholly scanned by
> vacuum.

Basically we increase the horizons whenever scanning all pages that are
not known to be frozen (+ potentially some frozen ones due to the
skipping logic). Without that there'd really not be a point in the
freeze map feature, as we'd continue to have the expensive anti
wraparound vacuums.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 02:48 PM, Andres Freund wrote:

On 2016-05-06 14:39:57 -0700, Joshua D. Drake wrote:



Yes, this is true but with a proper testing framework, I don't need a 15
minute break. I need 1 hour to configure, the rest just "happens" and
reports back.


That only works if somebody writes such tests.


Agreed.


And in that case the
tester having run will often suffice (until related changes are being
made). I'm not arguing against introducing more tests into the codebase
- I rather fervently for that. But that really isn't what's going to
avoid issues like this feature (or multixact) causing problems, because
those tests will just test what the author thought of.



Good point. I am not sure how to address the alternative though.




You want me (or people like me) to test more? Give us an easy way to
do it.


Useful additional testing and easy just don't go well together. By the
time I have made it easy I've done the testing that's needed.


I don't know that I can agree with this. A proper harness allows you to
execute: go.sh and boom... 2, 4, even 8 hours later you get a report. I will
not argue that it isn't easy to implement but I know it can be done.


The problem is that the contents of go.sh are the much more relevant
part than the 8 hours.


True.

Please don't misunderstand, I am not saying this is "easy". I just hope 
that it is something we work for.


Sincerely,

JD



--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 18:36:52 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> 
> > On 2016-05-06 14:17:13 -0700, Joshua D. Drake wrote:
> > > How do I test?
> > > 
> > > Is there a script I can run?
> > 
> > Unfortunately there's few interesting things to test with pre-made
> > scripts. There's no relevant OS dependency here, so each already
> > existing test doesn't really lead to significantly increased coverage
> > being run by other people.  Generally, when testing for correctness
> > issues, it's often of limited benefit to run tests written by the author
> > of reviewer - such scripts will usually just test things either has
> > thought of.  The dangerous areas are the ones neither author or reviewer
> > has considered.
> 
> We touched this question in connection with multixact freezing and
> wraparound.  Testers seem to want to be given a script that they can
> install and run, then go for a beer and get back to a bunch of errors to
> report.  But it doesn't work that way; writing a useful test script
> requires a lot of effort.

Right. And once written, often enough running it on a lot more instances
only marginally increases the coverage.


> Jeff Janes has done astounding work in these matters.  (I don't think
> we credit him enough for that.)

+many.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 14:39:57 -0700, Joshua D. Drake wrote:
> > > What are we looking for exactly?
> > 
> > Data corruption, efficiency problems.
> > 
> 
> I am really not trying to be difficult here but Data Corruption is an easy
> one... what is the metric we accept as an efficiency problem?

That's indeed not easy to define.  In this case I'd say vacuums taking
longer, index only scans being slower, more WAL being generated would
count?


> > I think tests without reading the code are quite sensible and
> > important. And it perfectly makes sense to ask for information about
> > what to test.  But fundamentally testing is a lot of work, as is writing
> > and reviewing code; unless you're really really good at destructive
> > testing, you won't find much in a 15 minute break.
> > 
> 
> Yes, this is true but with a proper testing framework, I don't need a 15
> minute break. I need 1 hour to configure, the rest just "happens" and
> reports back.

That only works if somebody writes such tests. And in that case the
tester having run will often suffice (until related changes are being
made). I'm not arguing against introducing more tests into the codebase
- I rather fervently for that. But that really isn't what's going to
avoid issues like this feature (or multixact) causing problems, because
those tests will just test what the author thought of.


> > > You want me (or people like me) to test more? Give us an easy way to
> > > do it.
> > 
> > Useful additional testing and easy just don't go well together. By the
> > time I have made it easy I've done the testing that's needed.
> 
> I don't know that I can agree with this. A proper harness allows you to
> execute: go.sh and boom... 2, 4, even 8 hours later you get a report. I will
> not argue that it isn't easy to implement but I know it can be done.

The problem is that the contents of go.sh are the much more relevant
part than the 8 hours.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 02:29 PM, Andres Freund wrote:

Hi,


On 2016-05-06 14:17:13 -0700, Joshua D. Drake wrote:

How do I test?

Is there a script I can run?


Unfortunately there's few interesting things to test with pre-made
scripts. There's no relevant OS dependency here, so each already
existing test doesn't really lead to significantly increased coverage
being run by other people.  Generally, when testing for correctness
issues, it's often of limited benefit to run tests written by the author
of reviewer - such scripts will usually just test things either has
thought of.  The dangerous areas are the ones neither author or reviewer
has considered.


I can't argue with that.





Are there specific things I can do to try and break it?


Upgrade clusters using pg_upgrade and make sure things like index only
scans still work and yield correct data.  Set up workloads that involve
freezing, and check that less WAL (and not more!) is generated with 9.6
than with 9.5.  Make sure queries still work.



What are we looking for exactly?


Data corruption, efficiency problems.



I am really not trying to be difficult here but Data Corruption is an 
easy one... what is the metric we accept as an efficiency problem?





A lot of -hackers seem to forget that although we have 100 -hackers, we have
1 "consultant/practitioners". Could I read the code and with a weekend
of WTF and -hackers questions figure out what is going on, yes but a lot of
people couldn't and I don't have the time.


I think tests without reading the code are quite sensible and
important. And it perfectly makes sense to ask for information about
what to test.  But fundamentally testing is a lot of work, as is writing
and reviewing code; unless you're really really good at destructive
testing, you won't find much in a 15 minute break.



Yes, this is true but with a proper testing framework, I don't need a 15 
minute break. I need 1 hour to configure, the rest just "happens" and 
reports back.


I have cycles to test, I have team members to help test (as do *lots* of 
other people) but sometimes we just get lost in how to help.





You want me (or people like me) to test more? Give us an easy way to
do it.


Useful additional testing and easy just don't go well together. By the
time I have made it easy I've done the testing that's needed.


I don't know that I can agree with this. A proper harness allows you to 
execute: go.sh and boom... 2, 4, even 8 hours later you get a report. I 
will not argue that it isn't easy to implement but I know it can be done.



Otherwise, we do what we can, which is try and interface on the things that
will directly and immediately affect us (like keywords and syntax).


The amount of bikeshedding on -hackers steals energy and time for
actually working on stuff, including testing. So I have little sympathy
for the amount of bike shedding done.


Insuring a reasonable and thought out interface for users is not bike 
shedding, it is at least as important and possibly more important than 
any feature we add.


Sincerely,

JD



Greetings,

Andres Freund




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Alvaro Herrera
Andres Freund wrote:

> On 2016-05-06 14:17:13 -0700, Joshua D. Drake wrote:
> > How do I test?
> > 
> > Is there a script I can run?
> 
> Unfortunately there's few interesting things to test with pre-made
> scripts. There's no relevant OS dependency here, so each already
> existing test doesn't really lead to significantly increased coverage
> being run by other people.  Generally, when testing for correctness
> issues, it's often of limited benefit to run tests written by the author
> of reviewer - such scripts will usually just test things either has
> thought of.  The dangerous areas are the ones neither author or reviewer
> has considered.

We touched this question in connection with multixact freezing and
wraparound.  Testers seem to want to be given a script that they can
install and run, then go for a beer and get back to a bunch of errors to
report.  But it doesn't work that way; writing a useful test script
requires a lot of effort.  Jeff Janes has done astounding work in these
matters.  (I don't think we credit him enough for that.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Alvaro Herrera
Joshua D. Drake wrote:
> On 05/06/2016 01:40 PM, Robert Haas wrote:
> >On Wed, May 4, 2016 at 8:08 PM, Andres Freund  wrote:
> >>On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> >>>77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.
> >>
> >>Nothing to say here.
> >>
> >>>fd31cd2 Don't vacuum all-frozen pages.
> >>
> >>Hm. I do wonder if it's going to bite us that we don't have a way to
> >>actually force vacuuming of the whole table (besides manually rm'ing the
> >>VM). I've more than once seen VACUUM used to try to do some integrity
> >>checking of the database.  How are we actually going to test that the
> >>feature works correctly? They'd have to write checks ontop of
> >>pg_visibility to see whether things are borked.
> >
> >Let's add VACUUM (FORCE) or something like that.
> 
> This is actually inverted. Vacuum by default should vacuum the entire
> relation, however if we are going to keep the existing behavior of this
> patch, VACUUM (FROZEN) seems to be better than (FORCE)?

Prior to some 7.x release, VACUUM actually did what we ripped out in
9.0 release as VACUUM FULL.  We actually changed the mode of operation
quite heavily into the "lazy" mode which didn't acquire access exclusive
lock, and it was a huge relief.  I think that changing the mode of
operation to be the lightest possible thing that gets the job done
is convenient for users, because their existing scripts continue to
clean their tables only they take less time.  No need to tweak the
maintenance scripts.

I don't know what happens when the freeze_table_age threshold is
reached.  Do we scan the whole table when that happens?  Because if we
do, then we don't need a new keyword: just invoke the command after
lowering the setting.

Another question on this feature is what happens with the table age
(relfrozenxid, relminmxid) when the table is not wholly scanned by
vacuum.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
Hi,


On 2016-05-06 14:17:13 -0700, Joshua D. Drake wrote:
> How do I test?
> 
> Is there a script I can run?

Unfortunately there's few interesting things to test with pre-made
scripts. There's no relevant OS dependency here, so each already
existing test doesn't really lead to significantly increased coverage
being run by other people.  Generally, when testing for correctness
issues, it's often of limited benefit to run tests written by the author
of reviewer - such scripts will usually just test things either has
thought of.  The dangerous areas are the ones neither author or reviewer
has considered.


> Are there specific things I can do to try and break it?

Upgrade clusters using pg_upgrade and make sure things like index only
scans still work and yield correct data.  Set up workloads that involve
freezing, and check that less WAL (and not more!) is generated with 9.6
than with 9.5.  Make sure queries still work.


> What are we looking for exactly?

Data corruption, efficiency problems.


> A lot of -hackers seem to forget that although we have 100 -hackers, we have
> 1 "consultant/practitioners". Could I read the code and with a weekend
> of WTF and -hackers questions figure out what is going on, yes but a lot of
> people couldn't and I don't have the time.

I think tests without reading the code are quite sensible and
important. And it perfectly makes sense to ask for information about
what to test.  But fundamentally testing is a lot of work, as is writing
and reviewing code; unless you're really really good at destructive
testing, you won't find much in a 15 minute break.


> You want me (or people like me) to test more? Give us an easy way to
> do it.

Useful additional testing and easy just don't go well together. By the
time I have made it easy I've done the testing that's needed.


> Otherwise, we do what we can, which is try and interface on the things that
> will directly and immediately affect us (like keywords and syntax).

The amount of bikeshedding on -hackers steals energy and time for
actually working on stuff, including testing. So I have little sympathy
for the amount of bike shedding done.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 14:15:47 -0700, Josh berkus wrote:
> For the serious testing, does anyone have a good technique for creating
> loads which would stress-test vacuum freezing?  It's hard for me to come
> up with anything which wouldn't be very time-and-resource intensive
> (like running at 10,000 TPS for a week).

I've changed the limits for freezing options a while back, so you can
now set autovacuum_freeze_max as low as 10 (best set
vacuum_freeze_table_age accordingly).  You'll have to come up with a
workload that doesn't overwrite all data continuously (otherwise
there'll never be old rows), but otherwise it should now be fairly easy
to test that kind of scenario.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 02:08 PM, Andres Freund wrote:


VACUUM THEWHOLEDAMNTHING



I know that would never fly but damn if that wouldn't be an awesome keyword
for VACUUM.


It bothers me more than it probably should: Nobdy tests, reviews,
whatever a complex patch with significant data-loss potential. But as
soon somebody dares to mention an option name...


That is a fair complaint but let me ask you something:

How do I test?

Is there a script I can run? Are there specific things I can do to try 
and break it? What are we looking for exactly?


A lot of -hackers seem to forget that although we have 100 -hackers, we 
have 1 "consultant/practitioners". Could I read the code and with a 
weekend of WTF and -hackers questions figure out what is going on, yes 
but a lot of people couldn't and I don't have the time.


You want me (or people like me) to test more? Give us an easy way to do 
it. Otherwise, we do what we can, which is try and interface on the 
things that will directly and immediately affect us (like keywords and 
syntax).


Sincerely,


JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 02:12 PM, Andres Freund wrote:
> On 2016-05-06 14:10:04 -0700, Josh berkus wrote:
>> On 05/06/2016 02:08 PM, Andres Freund wrote:
>>
>>> It bothers me more than it probably should: Nobdy tests, reviews,
>>> whatever a complex patch with significant data-loss potential. But as
>>> soon somebody dares to mention an option name...
>>
>> Definitely more than it should, because it's gonna happen *every* time.
>>
>> https://en.wikipedia.org/wiki/Law_of_triviality
> 
> Doesn't mean it should not be frowned upon.

Or made light of, hence my post.  Personally I don't care what the
option is called, as long as we have docs for it.

For the serious testing, does anyone have a good technique for creating
loads which would stress-test vacuum freezing?  It's hard for me to come
up with anything which wouldn't be very time-and-resource intensive
(like running at 10,000 TPS for a week).

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 14:10:04 -0700, Josh berkus wrote:
> On 05/06/2016 02:08 PM, Andres Freund wrote:
> 
> > It bothers me more than it probably should: Nobdy tests, reviews,
> > whatever a complex patch with significant data-loss potential. But as
> > soon somebody dares to mention an option name...
> 
> Definitely more than it should, because it's gonna happen *every* time.
> 
> https://en.wikipedia.org/wiki/Law_of_triviality

Doesn't mean it should not be frowned upon.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 02:08 PM, Andres Freund wrote:

> It bothers me more than it probably should: Nobdy tests, reviews,
> whatever a complex patch with significant data-loss potential. But as
> soon somebody dares to mention an option name...

Definitely more than it should, because it's gonna happen *every* time.

https://en.wikipedia.org/wiki/Law_of_triviality

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 02:03 PM, Stephen Frost wrote:



VACUUM THEWHOLEDAMNTHING


+100

(hahahaha)


You know what? Why not? Seriously? We aren't product. This is supposed 
to be a bit fun. Let's have some fun with it? It would be so easy to 
turn that into a positive advocacy opportunity.




JD




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 14:03:11 -0700, Joshua D. Drake wrote:
> On 05/06/2016 02:01 PM, Josh berkus wrote:
> > On 05/06/2016 01:58 PM, Andres Freund wrote:
> > > On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:
> > > > On 05/06/2016 01:50 PM, Andres Freund wrote:
> > 
> > > > > There already is FREEZE - meaning something different - so I doubt it.
> > > > 
> > > > Yeah I thought about that, it is the word "FORCE" that bothers me. When 
> > > > you
> > > > use FORCE there is an assumption that no matter what, it plows through
> > > > (think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't 
> > > > work
> > > > either.
> > > 
> > > SCANALL?
> > > 
> > 
> > VACUUM THEWHOLEDAMNTHING
> > 
> 
> I know that would never fly but damn if that wouldn't be an awesome keyword
> for VACUUM.

It bothers me more than it probably should: Nobdy tests, reviews,
whatever a complex patch with significant data-loss potential. But as
soon somebody dares to mention an option name...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Stephen Frost
* Josh berkus (j...@agliodbs.com) wrote:
> On 05/06/2016 01:58 PM, Andres Freund wrote:
> > On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:
> >> On 05/06/2016 01:50 PM, Andres Freund wrote:
> 
> >>> There already is FREEZE - meaning something different - so I doubt it.
> >>
> >> Yeah I thought about that, it is the word "FORCE" that bothers me. When you
> >> use FORCE there is an assumption that no matter what, it plows through
> >> (think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't 
> >> work
> >> either.
> > 
> > SCANALL?
> > 
> 
> VACUUM THEWHOLEDAMNTHING

+100

(hahahaha)

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 02:01 PM, Josh berkus wrote:

On 05/06/2016 01:58 PM, Andres Freund wrote:

On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:

On 05/06/2016 01:50 PM, Andres Freund wrote:



There already is FREEZE - meaning something different - so I doubt it.


Yeah I thought about that, it is the word "FORCE" that bothers me. When you
use FORCE there is an assumption that no matter what, it plows through
(think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't work
either.


SCANALL?



VACUUM THEWHOLEDAMNTHING



I know that would never fly but damn if that wouldn't be an awesome 
keyword for VACUUM.


JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 01:58 PM, Andres Freund wrote:
> On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:
>> On 05/06/2016 01:50 PM, Andres Freund wrote:

>>> There already is FREEZE - meaning something different - so I doubt it.
>>
>> Yeah I thought about that, it is the word "FORCE" that bothers me. When you
>> use FORCE there is an assumption that no matter what, it plows through
>> (think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't work
>> either.
> 
> SCANALL?
> 

VACUUM THEWHOLEDAMNTHING


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 01:58 PM, Stephen Frost wrote:

* Joshua D. Drake (j...@commandprompt.com) wrote:

Yeah I thought about that, it is the word "FORCE" that bothers me.
When you use FORCE there is an assumption that no matter what, it
plows through (think rm -f). So if we don't use FROZEN, that's cool
but FORCE doesn't work either.


Isn't that exactly what this FORCE option being contemplated would do
though?  Plow through the entire relation, regardless of what the VM
says is all frozen or not?

Seems like FORCE is a good word for that to me.


Except that we aren't FORCING a vacuum. That is the part I have 
contention with. To me, FORCE means:


No matter what else is happening, we are vacuuming this relation (think 
locks).


But I am also not going to dig in my heals. If that is truly what 
-hackers come up with, thank you at least considering what I said.


Sincerely,

JD



Thanks!

Stephen




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Stephen Frost
* Joshua D. Drake (j...@commandprompt.com) wrote:
> Yeah I thought about that, it is the word "FORCE" that bothers me.
> When you use FORCE there is an assumption that no matter what, it
> plows through (think rm -f). So if we don't use FROZEN, that's cool
> but FORCE doesn't work either.

Isn't that exactly what this FORCE option being contemplated would do
though?  Plow through the entire relation, regardless of what the VM
says is all frozen or not?

Seems like FORCE is a good word for that to me.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:
> On 05/06/2016 01:50 PM, Andres Freund wrote:
> 
> > > > Let's add VACUUM (FORCE) or something like that.
> > 
> > Yes, that makes sense.
> > 
> > 
> > > This is actually inverted. Vacuum by default should vacuum the entire
> > > relation
> > 
> > What? Why on earth would that be a good idea? Not to speak of hte fact
> > that that's not been the case since ~8.4?
> 
> Sorry, I just meant the default behavior shouldn't change but I do agree
> that we need the ability to keep the same behavior.

Which default behaviour shouldn't change? The one in master where we
skip known frozen pages? Or the released branches where can't skip those?

> > > ,however if we are going to keep the existing behavior of this
> > > patch, VACUUM (FROZEN) seems to be better than (FORCE)?
> > 
> > There already is FREEZE - meaning something different - so I doubt it.
> 
> Yeah I thought about that, it is the word "FORCE" that bothers me. When you
> use FORCE there is an assumption that no matter what, it plows through
> (think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't work
> either.

SCANALL?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 01:50 PM, Andres Freund wrote:


Let's add VACUUM (FORCE) or something like that.


Yes, that makes sense.



This is actually inverted. Vacuum by default should vacuum the entire
relation


What? Why on earth would that be a good idea? Not to speak of hte fact
that that's not been the case since ~8.4?


Sorry, I just meant the default behavior shouldn't change but I do agree 
that we need the ability to keep the same behavior.



,however if we are going to keep the existing behavior of this
patch, VACUUM (FROZEN) seems to be better than (FORCE)?


There already is FREEZE - meaning something different - so I doubt it.


Yeah I thought about that, it is the word "FORCE" that bothers me. When 
you use FORCE there is an assumption that no matter what, it plows 
through (think rm -f). So if we don't use FROZEN, that's cool but FORCE 
doesn't work either.


Sincerely,

JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Andres Freund
On 2016-05-06 13:48:09 -0700, Joshua D. Drake wrote:
> On 05/06/2016 01:40 PM, Robert Haas wrote:
> > On Wed, May 4, 2016 at 8:08 PM, Andres Freund  wrote:
> > > On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> > > > 77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.
> > >
> > > Nothing to say here.
> > >
> > > > fd31cd2 Don't vacuum all-frozen pages.
> > >
> > > Hm. I do wonder if it's going to bite us that we don't have a way to
> > > actually force vacuuming of the whole table (besides manually rm'ing the
> > > VM). I've more than once seen VACUUM used to try to do some integrity
> > > checking of the database.  How are we actually going to test that the
> > > feature works correctly? They'd have to write checks ontop of
> > > pg_visibility to see whether things are borked.
> >
> > Let's add VACUUM (FORCE) or something like that.

Yes, that makes sense.


> This is actually inverted. Vacuum by default should vacuum the entire
> relation

What? Why on earth would that be a good idea? Not to speak of hte fact
that that's not been the case since ~8.4?


>,however if we are going to keep the existing behavior of this
> patch, VACUUM (FROZEN) seems to be better than (FORCE)?

There already is FREEZE - meaning something different - so I doubt it.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Joshua D. Drake

On 05/06/2016 01:40 PM, Robert Haas wrote:

On Wed, May 4, 2016 at 8:08 PM, Andres Freund  wrote:

On 2016-05-02 14:48:18 -0700, Andres Freund wrote:

77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.


Nothing to say here.


fd31cd2 Don't vacuum all-frozen pages.


Hm. I do wonder if it's going to bite us that we don't have a way to
actually force vacuuming of the whole table (besides manually rm'ing the
VM). I've more than once seen VACUUM used to try to do some integrity
checking of the database.  How are we actually going to test that the
feature works correctly? They'd have to write checks ontop of
pg_visibility to see whether things are borked.


Let's add VACUUM (FORCE) or something like that.


This is actually inverted. Vacuum by default should vacuum the entire 
relation, however if we are going to keep the existing behavior of this 
patch, VACUUM (FROZEN) seems to be better than (FORCE)?


Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Robert Haas
On Thu, May 5, 2016 at 2:20 PM, Andres Freund  wrote:
> On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
>> 7087166 pg_upgrade: Convert old visibility map format to new format.
>
> +const char *
> +rewriteVisibilityMap(const char *fromfile, const char *tofile, bool force)
> ...
>
> +   while ((bytesRead = read(src_fd, buffer, BLCKSZ)) == BLCKSZ)
> +   {
> ..
>
> Uh, shouldn't we actually fail if we read incompletely? Rather than
> silently ignoring the problem? Ok, this causes no corruption, but it
> indicates that something went significantly wrong.

Sure, that's reasonable.

> +   charnew_vmbuf[BLCKSZ];
> +   char   *new_cur = new_vmbuf;
> +   boolempty = true;
> +   boolold_lastpart;
> +
> +   /* Copy page header in advance */
> +   memcpy(new_vmbuf, , SizeOfPageHeaderData);
>
> Shouldn't we zero out new_vmbuf? Afaics we're not necessarily zeroing it
> with old_lastpart && !empty, right?

Oh, dear.  That seems like a possible data corruption bug.  Maybe we'd
better fix that right away (although I don't actually have time before
the wrap).

> +   if ((dst_fd = open(tofile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), 
> S_IRUSR | S_IWUSR)) < 0)
> +   {
> +   close(src_fd);
> +   return getErrorText();
> +   }
>
> I know you guys copied this, but what's the force thing about?
> Expecially as it's always set to true by the callers (i.e. what is the
> parameter even about?)?  Wouldn't we at least have to specify O_TRUNC in
> the force case?

I just work here.

> +   old_cur += BITS_PER_HEAPBLOCK_OLD;
> +   new_cur += BITS_PER_HEAPBLOCK;
>
> I'm not sure I'm understanding the point of the BITS_PER_HEAPBLOCK_OLD
> stuff - as long as it's hardcoded into rewriteVisibilityMap() we'll not
> be able to have differing ones anyway, should we decide to add a third
> bit?

I think that's just a matter of style.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Robert Haas
On Wed, May 4, 2016 at 8:08 PM, Andres Freund  wrote:
> On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
>> 77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.
>
> Nothing to say here.
>
>> fd31cd2 Don't vacuum all-frozen pages.
>
> Hm. I do wonder if it's going to bite us that we don't have a way to
> actually force vacuuming of the whole table (besides manually rm'ing the
> VM). I've more than once seen VACUUM used to try to do some integrity
> checking of the database.  How are we actually going to test that the
> feature works correctly? They'd have to write checks ontop of
> pg_visibility to see whether things are borked.

Let's add VACUUM (FORCE) or something like that.

> /*
>  * Compute whether we actually scanned the whole relation. If we did, 
> we
>  * can adjust relfrozenxid and relminmxid.
>  *
>  * NB: We need to check this before truncating the relation, because 
> that
>  * will change ->rel_pages.
>  */
>
> Comment is out-of-date now.

OK.

> -   if (blkno == next_not_all_visible_block)
> +   if (blkno == next_unskippable_block)
> {
> -   /* Time to advance next_not_all_visible_block */
> -   for (next_not_all_visible_block++;
> -next_not_all_visible_block < nblocks;
> -next_not_all_visible_block++)
> +   /* Time to advance next_unskippable_block */
> +   for (next_unskippable_block++;
> +next_unskippable_block < nblocks;
> +next_unskippable_block++)
>
> Hm. So we continue with the course of re-processing pages, even if
> they're marked all-frozen. For all-visible there at least can be a
> benefit by freezing earlier, but for all-frozen pages there's really no
> point.  I don't really buy the arguments for the skipping logic. But
> even disregarding that, maybe we should skip processing a block if it's
> all-frozen (without preventing the page from being read?); as there's no
> possible benefit?  Acquring the exclusive/content lock and stuff is far
> from free.

I wanted to tinker with this logic as little as possible in the
interest of ending up with something that worked.  I would not have
written it this way.

> Not really related to this patch, but the FORCE_CHECK_PAGE is rather
> ugly.

+1.

> +   /*
> +* The current block is potentially skippable; if 
> we've seen a
> +* long enough run of skippable blocks to justify 
> skipping it, and
> +* we're not forced to check it, then go ahead and 
> skip.
> +* Otherwise, the page must be at least all-visible 
> if not
> +* all-frozen, so we can set 
> all_visible_according_to_vm = true.
> +*/
> +   if (skipping_blocks && !FORCE_CHECK_PAGE())
> +   {
> +   /*
> +* Tricky, tricky.  If this is in aggressive 
> vacuum, the page
> +* must have been all-frozen at the time we 
> checked whether it
> +* was skippable, but it might not be any 
> more.  We must be
> +* careful to count it as a skipped 
> all-frozen page in that
> +* case, or else we'll think we can't update 
> relfrozenxid and
> +* relminmxid.  If it's not an aggressive 
> vacuum, we don't
> +* know whether it was all-frozen, so we have 
> to recheck; but
> +* in this case an approximate answer is OK.
> +*/
> +   if (aggressive || VM_ALL_FROZEN(onerel, 
> blkno, ))
> +   vacrelstats->frozenskipped_pages++;
> continue;
> +   }
>
> Hm. This indeed seems a bit tricky.  Not sure how to make it easier
> though without just ripping out the SKIP_PAGES_THRESHOLD stuff.

Yep, I had the same problem.

> Hm. This also doubles the number of VM accesses. While I guess that's
> not noticeable most of the time, it's still not nice; especially when a
> large relation is entirely frozen, because it'll mean we'll sequentially
> go through the visibilityma twice.

Compared to what we're saving, that's obviously a trivial cost.
That's not to say that we might not want to improve it, but it's
hardly a disaster.

In short: wah, wah, wah.

> I wondered for a minute whether #14057 could cause really bad issues
> here
> http://www.postgresql.org/message-id/20160331103739.8956.94...@wrigleys.postgresql.org
> but I don't see it being more relevant here.

I 

Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Robert Haas
On Mon, May 2, 2016 at 8:25 PM, Andres Freund  wrote:
> + * heap_tuple_needs_eventual_freeze
> + *
> + * Check to see whether any of the XID fields of a tuple (xmin, xmax, xvac)
> + * will eventually require freezing.  Similar to heap_tuple_needs_freeze,
> + * but there's no cutoff, since we're trying to figure out whether freezing
> + * will ever be needed, not whether it's needed now.
> + */
> +bool
> +heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)
>
> Wouldn't redefining this to heap_tuple_is_frozen() and then inverting the
> checks be easier to understand?

I thought it much safer to keep this as close to a copy of
heap_tuple_needs_freeze() as possible.  Copying a function and
inverting all of the return values is much more likely to introduce
bugs, IME.

> +   /*
> +* If xmax is a valid xact or multixact, this tuple is also not 
> frozen.
> +*/
> +   if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
> +   {
> +   MultiXactId multi;
> +
> +   multi = HeapTupleHeaderGetRawXmax(tuple);
> +   if (MultiXactIdIsValid(multi))
> +   return true;
> +   }
>
> Hm. What's the test inside the if() for? There shouldn't be any case
> where xmax is invalid if HEAP_XMAX_IS_MULTI is set.   Now there's a
> check like that outside of this commit, but it seems strange to me
> (Alvaro, perhaps you could comment on this?).

Here again I was copying existing code, with appropriate simplifications.

> + *
> + * Clearing both visibility map bits is not separately WAL-logged.  The 
> callers
>   * must make sure that whenever a bit is cleared, the bit is cleared on WAL
>   * replay of the updating operation as well.
>
> I think including "both" here makes things less clear, because it
> differentiates clearing one bit from clearing both. There's no practical
> differentce atm, but still.

I agree.

>   *
>   * VACUUM will normally skip pages for which the visibility map bit is set;
>   * such pages can't contain any dead tuples and therefore don't need 
> vacuuming.
> - * The visibility map is not used for anti-wraparound vacuums, because
> - * an anti-wraparound vacuum needs to freeze tuples and observe the latest 
> xid
> - * present in the table, even on pages that don't have any dead tuples.
>   *
>
> I think the remaining sentence isn't entirely accurate, there's now more
> than one bit, and they're different with regard to scan_all/!scan_all
> vacuums (or will be - maybe this updated further in a later commit? But
> if so, that sentence shouldn't yet be removed...).

We can adjust the language, but I don't really see a big problem here.

> -/* Number of heap blocks we can represent in one byte. */
> -#define HEAPBLOCKS_PER_BYTE 8
> -
> Hm, why was this moved to the header? Sounds like something the outside
> shouldn't care about.

Oh... yeah.  Let's undo that.

> #define HEAPBLK_TO_MAPBIT(x) (((x) % HEAPBLOCKS_PER_BYTE) * 
> BITS_PER_HEAPBLOCK)
>
> Hm. This isn't really a mapping to an individual bit anymore - but I
> don't really have a better name in mind. Maybe TO_OFFSET?

Well, it sorta is... but we could change it, I suppose.

> +static const uint8 number_of_ones_for_visible[256] = {
> ...
> +};
> +static const uint8 number_of_ones_for_frozen[256] = {
> ...
>  };
>
> Did somebody verify the new contents are correct?

I admit that I didn't.  It seemed like an unlikely place for a goof,
but I guess we should verify.

> /*
> - * visibilitymap_clear - clear a bit in visibility map
> + * visibilitymap_clear - clear all bits in visibility map
>   *
>
> This seems rather easy to misunderstand, as this really only clears all
> the bits for one page, not actually all the bits.

We could change "in" to "for one page in the".

>   * the bit for heapBlk, or InvalidBuffer. The caller is responsible for
> - * releasing *buf after it's done testing and setting bits.
> + * releasing *buf after it's done testing and setting bits, and must pass 
> flags
> + * for which it needs to check the value in visibility map.
>   *
>   * NOTE: This function is typically called without a lock on the heap page,
>   * so somebody else could change the bit just after we look at it.  In fact,
> @@ -327,17 +351,16 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, 
> Buffer heapBuf,
>
> I'm not seing what flags the above comment change is referring to?

Ugh.  I think that's leftover cruft from an earlier patch version that
should have been excised from what got committed.

> /*
> -* A single-bit read is atomic.  There could be memory-ordering 
> effects
> +* A single byte read is atomic.  There could be memory-ordering 
> effects
>  * here, but for performance reasons we make it the caller's job to 
> worry
>  * about that.
>  */
> -   result = (map[mapByte] & (1 << mapBit)) ? true : false;
> -
> -   return result;
> +   return ((map[mapByte] >> mapBit) & 

Re: [HACKERS] Reviewing freeze map code

2016-05-05 Thread Andres Freund
Hi,

On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> 7087166 pg_upgrade: Convert old visibility map format to new format.

+const char *
+rewriteVisibilityMap(const char *fromfile, const char *tofile, bool force)
...

+   while ((bytesRead = read(src_fd, buffer, BLCKSZ)) == BLCKSZ)
+   {
..

Uh, shouldn't we actually fail if we read incompletely? Rather than
silently ignoring the problem? Ok, this causes no corruption, but it
indicates that something went significantly wrong.

+   charnew_vmbuf[BLCKSZ];
+   char   *new_cur = new_vmbuf;
+   boolempty = true;
+   boolold_lastpart;
+
+   /* Copy page header in advance */
+   memcpy(new_vmbuf, , SizeOfPageHeaderData);

Shouldn't we zero out new_vmbuf? Afaics we're not necessarily zeroing it
with old_lastpart && !empty, right?


+   if ((dst_fd = open(tofile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), 
S_IRUSR | S_IWUSR)) < 0)
+   {
+   close(src_fd);
+   return getErrorText();
+   }

I know you guys copied this, but what's the force thing about?
Expecially as it's always set to true by the callers (i.e. what is the
parameter even about?)?  Wouldn't we at least have to specify O_TRUNC in
the force case?


+   old_cur += BITS_PER_HEAPBLOCK_OLD;
+   new_cur += BITS_PER_HEAPBLOCK;

I'm not sure I'm understanding the point of the BITS_PER_HEAPBLOCK_OLD
stuff - as long as it's hardcoded into rewriteVisibilityMap() we'll not
be able to have differing ones anyway, should we decide to add a third
bit?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-04 Thread Andres Freund
On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> 77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.

Nothing to say here.


> fd31cd2 Don't vacuum all-frozen pages.

Hm. I do wonder if it's going to bite us that we don't have a way to
actually force vacuuming of the whole table (besides manually rm'ing the
VM). I've more than once seen VACUUM used to try to do some integrity
checking of the database.  How are we actually going to test that the
feature works correctly? They'd have to write checks ontop of
pg_visibility to see whether things are borked.


/*
 * Compute whether we actually scanned the whole relation. If we did, we
 * can adjust relfrozenxid and relminmxid.
 *
 * NB: We need to check this before truncating the relation, because 
that
 * will change ->rel_pages.
 */

Comment is out-of-date now.


-   if (blkno == next_not_all_visible_block)
+   if (blkno == next_unskippable_block)
{
-   /* Time to advance next_not_all_visible_block */
-   for (next_not_all_visible_block++;
-next_not_all_visible_block < nblocks;
-next_not_all_visible_block++)
+   /* Time to advance next_unskippable_block */
+   for (next_unskippable_block++;
+next_unskippable_block < nblocks;
+next_unskippable_block++)

Hm. So we continue with the course of re-processing pages, even if
they're marked all-frozen. For all-visible there at least can be a
benefit by freezing earlier, but for all-frozen pages there's really no
point.  I don't really buy the arguments for the skipping logic. But
even disregarding that, maybe we should skip processing a block if it's
all-frozen (without preventing the page from being read?); as there's no
possible benefit?  Acquring the exclusive/content lock and stuff is far
from free.


Not really related to this patch, but the FORCE_CHECK_PAGE is rather
ugly.


+   /*
+* The current block is potentially skippable; if we've 
seen a
+* long enough run of skippable blocks to justify 
skipping it, and
+* we're not forced to check it, then go ahead and skip.
+* Otherwise, the page must be at least all-visible if 
not
+* all-frozen, so we can set 
all_visible_according_to_vm = true.
+*/
+   if (skipping_blocks && !FORCE_CHECK_PAGE())
+   {
+   /*
+* Tricky, tricky.  If this is in aggressive 
vacuum, the page
+* must have been all-frozen at the time we 
checked whether it
+* was skippable, but it might not be any more. 
 We must be
+* careful to count it as a skipped all-frozen 
page in that
+* case, or else we'll think we can't update 
relfrozenxid and
+* relminmxid.  If it's not an aggressive 
vacuum, we don't
+* know whether it was all-frozen, so we have 
to recheck; but
+* in this case an approximate answer is OK.
+*/
+   if (aggressive || VM_ALL_FROZEN(onerel, blkno, 
))
+   vacrelstats->frozenskipped_pages++;
continue;
+   }

Hm. This indeed seems a bit tricky.  Not sure how to make it easier
though without just ripping out the SKIP_PAGES_THRESHOLD stuff.

Hm. This also doubles the number of VM accesses. While I guess that's
not noticeable most of the time, it's still not nice; especially when a
large relation is entirely frozen, because it'll mean we'll sequentially
go through the visibilityma twice.


I wondered for a minute whether #14057 could cause really bad issues
here
http://www.postgresql.org/message-id/20160331103739.8956.94...@wrigleys.postgresql.org
but I don't see it being more relevant here.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-03 Thread Masahiko Sawada
On Tue, May 3, 2016 at 6:48 AM, Andres Freund  wrote:
> Hi,
>
> The freeze map changes, besides being very important, seem to be one of
> the patches with a high risk profile in 9.6.  Robert had asked whether
> I'd take a look.  I thought it'd be a good idea to review that while
> running tests for
> http://www.postgresql.org/message-id/CAMkU=1w85Dqt766AUrCnyqCXfZ+rsk1witAc_=v5+pce93s...@mail.gmail.com

Thank you for reviewing.

> For starters, I'm just going through the commits. It seems the relevant
> pieces are:
>
> a892234 Change the format of the VM fork to add a second bit per page.
> 77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.
> fd31cd2 Don't vacuum all-frozen pages.
> 7087166 pg_upgrade: Convert old visibility map format to new format.
> ba0a198 Add pg_visibility contrib module.
>
> did I miss anything important?
>

That's all.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-05-02 Thread Andres Freund
Hi,

some of the review items here are mere matters of style/preference. Feel
entirely free to discard them, but I thought if I'm going through the
change anyway...


On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> a892234 Change the format of the VM fork to add a second bit per page.

TL;DR: fairly minor stuff.


+ * heap_tuple_needs_eventual_freeze
+ *
+ * Check to see whether any of the XID fields of a tuple (xmin, xmax, xvac)
+ * will eventually require freezing.  Similar to heap_tuple_needs_freeze,
+ * but there's no cutoff, since we're trying to figure out whether freezing
+ * will ever be needed, not whether it's needed now.
+ */
+bool
+heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)

Wouldn't redefining this to heap_tuple_is_frozen() and then inverting the
checks be easier to understand?


+   /*
+* If xmax is a valid xact or multixact, this tuple is also not frozen.
+*/
+   if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
+   {
+   MultiXactId multi;
+
+   multi = HeapTupleHeaderGetRawXmax(tuple);
+   if (MultiXactIdIsValid(multi))
+   return true;
+   }

Hm. What's the test inside the if() for? There shouldn't be any case
where xmax is invalid if HEAP_XMAX_IS_MULTI is set.   Now there's a
check like that outside of this commit, but it seems strange to me
(Alvaro, perhaps you could comment on this?).


+ *
+ * Clearing both visibility map bits is not separately WAL-logged.  The callers
  * must make sure that whenever a bit is cleared, the bit is cleared on WAL
  * replay of the updating operation as well.

I think including "both" here makes things less clear, because it
differentiates clearing one bit from clearing both. There's no practical
differentce atm, but still.

  *
  * VACUUM will normally skip pages for which the visibility map bit is set;
  * such pages can't contain any dead tuples and therefore don't need vacuuming.
- * The visibility map is not used for anti-wraparound vacuums, because
- * an anti-wraparound vacuum needs to freeze tuples and observe the latest xid
- * present in the table, even on pages that don't have any dead tuples.
  *

I think the remaining sentence isn't entirely accurate, there's now more
than one bit, and they're different with regard to scan_all/!scan_all
vacuums (or will be - maybe this updated further in a later commit? But
if so, that sentence shouldn't yet be removed...).


-
-/* Number of heap blocks we can represent in one byte. */
-#define HEAPBLOCKS_PER_BYTE 8
-

Hm, why was this moved to the header? Sounds like something the outside
shouldn't care about.


#define HEAPBLK_TO_MAPBIT(x) (((x) % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK)

Hm. This isn't really a mapping to an individual bit anymore - but I
don't really have a better name in mind. Maybe TO_OFFSET?


+static const uint8 number_of_ones_for_visible[256] = {
...
+};
+static const uint8 number_of_ones_for_frozen[256] = {
...
 };

Did somebody verify the new contents are correct?


/*
- * visibilitymap_clear - clear a bit in visibility map
+ * visibilitymap_clear - clear all bits in visibility map
  *

This seems rather easy to misunderstand, as this really only clears all
the bits for one page, not actually all the bits.



  * the bit for heapBlk, or InvalidBuffer. The caller is responsible for
- * releasing *buf after it's done testing and setting bits.
+ * releasing *buf after it's done testing and setting bits, and must pass flags
+ * for which it needs to check the value in visibility map.
  *
  * NOTE: This function is typically called without a lock on the heap page,
  * so somebody else could change the bit just after we look at it.  In fact,
@@ -327,17 +351,16 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, 
Buffer heapBuf,

I'm not seing what flags the above comment change is referring to?


/*
-* A single-bit read is atomic.  There could be memory-ordering effects
+* A single byte read is atomic.  There could be memory-ordering effects
 * here, but for performance reasons we make it the caller's job to 
worry
 * about that.
 */
-   result = (map[mapByte] & (1 << mapBit)) ? true : false;
-
-   return result;
+   return ((map[mapByte] >> mapBit) & VISIBILITYMAP_VALID_BITS);
 }

Not a new issue, and *very* likely to be irrelevant in practice (given
the value is only referenced once): But there's really no guarantee
map[mapByte] is only read once here.


-BlockNumber
-visibilitymap_count(Relation rel)
+void
+visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber 
*all_frozen)

Not really a new issue again: The parameter types (previously return
type) to this function seem wrong to me.



@@ -1934,5 +1992,14 @@ heap_page_is_all_visible(Relation rel, Buffer buf, 
TransactionId *visibility_cut
} 
+   /*
+* We don't bother clearing *all_frozen when the page is discovered 

[HACKERS] Reviewing freeze map code

2016-05-02 Thread Andres Freund
Hi,

The freeze map changes, besides being very important, seem to be one of
the patches with a high risk profile in 9.6.  Robert had asked whether
I'd take a look.  I thought it'd be a good idea to review that while
running tests for
http://www.postgresql.org/message-id/CAMkU=1w85Dqt766AUrCnyqCXfZ+rsk1witAc_=v5+pce93s...@mail.gmail.com

For starters, I'm just going through the commits. It seems the relevant
pieces are:

a892234 Change the format of the VM fork to add a second bit per page.
77a1d1e Department of second thoughts: remove PD_ALL_FROZEN.
fd31cd2 Don't vacuum all-frozen pages.
7087166 pg_upgrade: Convert old visibility map format to new format.
ba0a198 Add pg_visibility contrib module.

did I miss anything important?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


<    1   2   3