Re: [PATCHES] HOT documentation README

2007-09-12 Thread Pavan Deolasee
On 9/12/07, Tom Lane [EMAIL PROTECTED] wrote:



 VACUUM
 --

 There is little change to regular vacuum. It removes dead HOT tuples,
 like pruning does, and cleans up any redirected dead line pointers.

 In lazy vacuum, we must not freeze a tuple that is in the middle of an
 update chain. That can happen when a tuple has xmin  xmax; it is the
 same scenario that requires hard pruning in VACUUM FULL. Freezing such
 tuples will break the check that xmin and xmax matches when following
 the chain. It is not a problem without HOT, because the preceding tuple
 in the chain must be dead as well so no one will try to follow the
 chain, but with HOT the preceding tuple would be DEAD_CHAIN, and someone
 might still need to follow the chain to find the live tuple. We avoid
 that by just not freezing such tuples. They can be frozen eventually,
 when the xmax of the preceding tuple is  OldestXmin as well.

 XXX doesn't the above completely break the anti-wraparound guarantees?
 And why couldn't we avoid the problem by pruning the previous tuple,
 which is surely dead?



We preserve anti-wraparound guarantees by not letting the relfrozenxid
advance past the minimum of cut-off xid and xmin/xmax of not-yet-frozen
tuples. Given that this is required to address corner case of DEAD tuple
following a RD tuple, the final relfrozenxid would be very close to the
cut-off xid. Isn't it ?

We could have actually pruned the preceding RD tuples (as we do in
vacuum full), but we were worried about missing some corner case
where someone may still want to follow the chain from the RD tuple.
We don't have any such concern with vacuum full because we run
with exclusive lock on the table. But if we agree that there is no
problem with pruning RD tuple preceding a DEAD tuple, we can
actually prune that tuple as well.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [PATCHES] HOT documentation README

2007-09-12 Thread Pavan Deolasee
On 9/12/07, Tom Lane [EMAIL PROTECTED] wrote:


 VACUUM FULL
 ---

 To make vacuum full work, any DEAD tuples in the middle of an update
 chain need to be removed (see comments at the top of
 heap_prune_hotchain_hard() for details). Vacuum full performs a more
 aggressive pruning that not only removes dead tuples at the beginning of
 an update chain, but scans the whole chain and removes any intermediate
 dead tuples as well.  It also removes redirected line pointers by making
 them directly point to the first tuple in the HOT chain. This causes
 a user-visible change in the tuple's CTID, but since VACUUM FULL has
 always moved tuple CTIDs, that should not break anything.

 XXX any extra complexity here needs justification --- a lot of it.



We hard prune the chains and also clear up redirect line pointers
because doing so is safe within VACUUM FULL and it reduces addition
complexity in the actual VACUUM FULL work.

When we move tuples and tuple chains, we don't try to preserve
their HOT properties. So when tuples in a HOT chain are moved,
we reset their HEAP_ONLY_TUPLE and HEAP_HOT_UPDATED flags
and each tuple has its own index entry. This requires us to some
more book keeping work in terms on number of indexed tuples expected
etc because they are checked at the end of the index scan.


Statistics
 --

 XXX: How do HOT-updates affect statistics? How often do we need to run
 autovacuum and autoanalyze?



Auotovacuum needs to be run much less frequently with HOT. This is because
defragmentation reclaims dead space in a page, thus reducing total dead
space in a table. Right now we don't update FSM information about the page
after defragmenting it, so a UPDATE on a different page can still cause
relation extension even though there is free space in some other page.
The rational for not updating FSM is to let subsequent UPDATEs on the page
to use the freed up space. But one can argue that we should let the free
space to be used for other UPDATEs/INSERTs after leaving fillfactor worth
of space.

Another significant change regarding autovacuum is that we now track
the total dead space in the table instead of number of dead tuples. This
seems like a better approach because it takes into account varying tuple
sizes
into account. The tracked dead space is increased whenever we update/delete
a tuple (or insert is aborted) and reduced when a page is defragmented.
autovacuum_vacuum_scale_factor considers the percentage of dead space
to the size of the relation whereas autovacuum_vacuum_threshold
considers the absolute amount of dead space in terms of blocks.

Every UPDATE (HOT or COLD) contributes to the autoanalyze stats and
defragmentation/pruning has no effect on autoanalyze. IOW autoanalyze
would work just the way it does today. One change that is worh mentioning
and discussing is that we don't follow HOT chains while fetching tuples
during
autoanalyze and autoanalyze would consider all such tuples as DEAD.
In the worst case when all the tuples in the table are reachable  via
redirected line pointers, this would confuse autoanalyze since it would
consider all tuples in the table as DEAD.

I think we should change this to follow HOT chain in analyze. Since we
fetch using SnapshotNow, if there is a live tuple at the end of the
chain, analyze would use that. Otherwise the tuple is considered as
DEAD.

Thanks,
Pavan



-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [PATCHES] HOT patch - version 15

2007-09-12 Thread Pavan Deolasee
On 9/11/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:



 The crucial design
 decision for HOT is when to prune and when to defragment the page, so
 that when we're doing the UPDATE, there's room in the page.


Right.

For defragmentation, I am still inclined towards doing it when we see
that the free space is less than (or slightly more than) the average tuple
size of the table - unless we have a better solution.

For pruning, we can do the quick pruning. But I won't feel comfortable
doing it without an exclusive lock on the buffer. Also I would avoid
line pointer redirection during quick prune. A simpler solution would
be to flag the page whenever HOT chain becomes longer than, say
5, (which can easily be detected in heap_hot_fetch) and prune it in
the next lookup. Hopefully we would only rarely have long HOT chains
and if at all we have them, we will have some mechanism to recover
from it.

Any other ideas ?

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


[PATCHES] Rename latestCompletedXid to latestCommittedXid

2007-09-12 Thread Florian G. Pflug

Hi

Per my post to the hackers list, here is a patch that renamed
latestCompletedXid to latestCommittedXid, and updates it only
on commits.

greetings, Florian Pflug


latestcommitted.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Rename latestCompletedXid to latestCommittedXid

2007-09-12 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 Per my post to the hackers list, here is a patch that renamed
 latestCompletedXid to latestCommittedXid, and updates it only
 on commits.

This is wrong --- it will break early detection of aborted
subtransactions.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] Rename latestCompletedXid to latestCommittedXid

2007-09-12 Thread Florian G. Pflug

Tom Lane wrote:

Florian G. Pflug [EMAIL PROTECTED] writes:

Per my post to the hackers list, here is a patch that renamed
latestCompletedXid to latestCommittedXid, and updates it only
on commits.


This is wrong --- it will break early detection of aborted
subtransactions.


I don't see how - TransactionIdIsInProgress doesn't even touch 
latestCompletedXid. And I didn't mess with XidCacheRemoveRunningXids,

other than removing the latestCompletedXid updating.

Am I missing something?

greetings, Florian Pflug


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] Rename latestCompletedXid to latestCommittedXid

2007-09-12 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 This is wrong --- it will break early detection of aborted
 subtransactions.

 I don't see how - TransactionIdIsInProgress doesn't even touch 
 latestCompletedXid.

Nah, I take that back --- I was worried that XidInMVCCSnapshot might
say it was still running when it wasn't, but we never apply
XidInMVCCSnapshot to XIDs that we haven't already determined to be
committed, so it doesn't matter.

It's still a bad idea though.  Let me go respond to your -hackers
post (which arrived here after the other one).

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] prevent invalidly encoded input

2007-09-12 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
   
 addlitchar(unescape_single_char(yytext[1]));
 + if 
 (IS_HIGHBIT_SET(literalbuf[literallen]))
 + saw_high_bit = true;

Isn't that array subscript off-by-one?  Probably better to put the test
inside unescape_single_char(), anyway.

Otherwise it looks sane, though maybe shy a comment or so.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] HOT documentation README

2007-09-12 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 On 9/12/07, Tom Lane [EMAIL PROTECTED] wrote:
 XXX doesn't the above completely break the anti-wraparound guarantees?
 And why couldn't we avoid the problem by pruning the previous tuple,
 which is surely dead?

 We preserve anti-wraparound guarantees by not letting the relfrozenxid
 advance past the minimum of cut-off xid and xmin/xmax of not-yet-frozen
 tuples. Given that this is required to address corner case of DEAD tuple
 following a RD tuple, the final relfrozenxid would be very close to the
 cut-off xid. Isn't it ?
 We could have actually pruned the preceding RD tuples (as we do in
 vacuum full), but we were worried about missing some corner case
 where someone may still want to follow the chain from the RD tuple.

This seems all wrong to me.  We'd only be considering freezing a tuple
whose xmin precedes the global xmin.  If it has a predecessor, that
predecessor has xmax equal to this one's xmin, therefore also preceding
global xmin, therefore it would be seen as DEAD not RECENTLY_DEAD.
So we should never need to freeze a tuple that isn't the start of its
HOT chain.

Also, if you find a DEAD tuple after a RECENTLY_DEAD one, you can
certainly prune both, because what this tells you is that both tuples
are in fact dead to all observers.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] actualised forgotten Magnus's patch for plpgsql MOVE statement

2007-09-12 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Pavel Stehule wrote:
 
 I would argue that we should likewise not allow them in plpgsql's MOVE,
 although this is more of a judgment call than is the case for FETCH.
 I just don't think it's a good idea to provide two redundant ways to do
 the same thing, when we might want to make one of the ways mean
 something else later.  There's no upside and there might be a downside.
 
 
 It's question. There are lot of links to FETCH in doc, and we support from 
 FETCH direction only subset. It needs at least notice in documentation. When 
 I testeid MOVE I found an form
 MOVE FORWARD 10 ... more natural than MOVE RELATIVE 10 and if we support 
 MOVE FORWARD ... then is logic support MOVE FORWARD n ,
 
 else FORWARD, BACKWARD are nonstandard and MOVE statement too.
 
 Regards
 Pavel Stehule
 
 _
 Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. 
 http://www.msn.cz/
 
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] New Zealand - TZ change

2007-09-12 Thread Bruce Momjian
Zdenek Kotala wrote:
 I would like to inform, that New Zealand changed DST rules and new 
 timezone files are available.  See 
 http://www.dia.govt.nz/diawebsite.nsf/wpg_URL/Services-Daylight-Saving-Daylight-saving-to-be-extended
 
 Patch for head attached. I kept zic.c untouched, but I think it would be 
 nice to update it as well.
 
 Are there any updated release scheduled 8.0-8.2?

FYI, we will have a release containing the New Zealand time zone changes
early next week.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] HOT documentation README

2007-09-12 Thread Pavan Deolasee
On 9/12/07, Tom Lane [EMAIL PROTECTED] wrote:



 This seems all wrong to me.  We'd only be considering freezing a tuple
 whose xmin precedes the global xmin.  If it has a predecessor, that
 predecessor has xmax equal to this one's xmin, therefore also preceding
 global xmin, therefore it would be seen as DEAD not RECENTLY_DEAD.
 So we should never need to freeze a tuple that isn't the start of its
 HOT chain.


hm.. What you are saying is right. I fail to recollect any other scenario
that
had forced me to think freezing is a problem with HOT.


Also, if you find a DEAD tuple after a RECENTLY_DEAD one, you can
 certainly prune both, because what this tells you is that both tuples
 are in fact dead to all observers.


I agree. I ran a long test with this change and there doesn't seem to be
any issue. So lets prune everything including the latest DEAD tuple. That
would let us take out the changes related to freezing. I also think that
should let us remove the DEAD_CHAIN concept, but let me check.

Thanks,
Pavan



-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com