Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Simon Riggs
On Mon, 2007-03-19 at 10:29 -0500, Merlin Moncure wrote:
> On 3/17/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> > I'm very comfortable with the idea that HOT can be turned on/off for a
> > table. That gives us a workaround to bugs. Previously, changing things
> > like WITHOUT OIDS was done over two releases, so I'd suggest the same
> > thing here. Add the option now, disabled, then look to make it the
> > default option in the next release. We can override that with the
> > default_use_hot parameter for those that feel bold, at least initially.
> > I know Bruce has been long opposed to the idea of a table-level switch,
> > which is why we've been trying so hard to avoid it. So we should add his
> > -1 to this idea from the start.
> 
> Is fear of bugs a justification of guc setting?  

Probably not on its own, but the inspiration was that we currently have
user-visible behaviour in the recent proposals, hence the GUC. 

> Or is there a trade-off involved with HOT?

At the moment, there is no downside to HOT in normal operation that I'm
aware of, but its a great question. 

The problem we have is with normal CREATE INDEX because there are two
sources of race conditions that complicate this: concurrent index scans
and crash safety. Currently there are no perfect solutions to this. We
have two main options:
1. additional locking, either within CIDX or as a separate DDL
2. additional complexity and possible limitation in the number of
indexes to just 3 before we stop doing HOT updates.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Heikki Linnakangas

Hannu Krosing wrote:

Ühel kenal päeval, E, 2007-03-19 kell 12:05, kirjutas Simon Riggs:

On Mon, 2007-03-19 at 10:51 +, Heikki Linnakangas wrote:

Pavan Deolasee wrote:

We've already used three of those, two for tracking HEAP_ONLY
and HOT_UPDATED tuples and one for tracking fragmented tuple.
HEAP_ONLY_TUPLE would go away in favor of the per-index bits. So we have 
bits available for three indexes.


But you probably have to do some kind of SUPERFULL VACUUM if you want to
DROP and CREATE the third index. You will probably have to touch all
tuples, regardless of weather they are live or not, or if will be moved
or not, just to kclean ot bits for the just-deleted index.


DROP INDEX wouldn't do anything extra. CREATE INDEX would have to clear 
the bit assigned to the new index, which would mean dirtying every heap 
page in the worst case. As a further optimization, CREATE INDEX could 
skip index inserts for HOT-updated tuples, if the key for new index 
wasn't changed, and leave the flag set.


Vacuum isn't needed.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: CREATE INDEX and HOT (was [HACKERS]Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Simon Riggs
On Mon, 2007-03-19 at 16:06 +0530, Pavan Deolasee wrote:
> Simon Riggs wrote:
>  >
>  > This problem is solved by moving the wait (for all transactions in
>  > reference snapshot to finish) so that it is now between the first and
>  > second scans, as described.
>  >
>  > During the second Vscan we would prune each block, so the only remaining
>  > tuple in the block when the second scan sees it would be (10,30) and it
>  > would no longer be a HOT tuple - the index would have a pointer to it,
>  > so no new index pointer would be added. The pointer to (10,30) is the
>  > same pointer that was added in the first phase for the tuple (10,20).
>  >
> 
> The problem is that in the first phase, the pointer was inserted
> with key=20 whereas now its changed to 30. So we need to delete the old
> index entry and add a new one.

So don't index HOT tuples in the first phase, wait until the second.
That should be just a single if() test in IndexBuildHeapScan().

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Pavan Deolasee

Simon Riggs wrote:
>
> This problem is solved by moving the wait (for all transactions in
> reference snapshot to finish) so that it is now between the first and
> second scans, as described.
>
> During the second Vscan we would prune each block, so the only remaining
> tuple in the block when the second scan sees it would be (10,30) and it
> would no longer be a HOT tuple - the index would have a pointer to it,
> so no new index pointer would be added. The pointer to (10,30) is the
> same pointer that was added in the first phase for the tuple (10,20).
>

The problem is that in the first phase, the pointer was inserted
with key=20 whereas now its changed to 30. So we need to delete the old
index entry and add a new one.

Thanks,
Pavan


--


EnterpriseDBhttp://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: CREATE INDEX and HOT (was [HACKERS]Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Simon Riggs
On Mon, 2007-03-19 at 09:28 +, Simon Riggs wrote:
> On Mon, 2007-03-19 at 14:33 +0530, Pavan Deolasee wrote:
> > Simon Riggs wrote:
> >  >
> >  >
> >  > We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is
> >  > I think we can without significant difficulty.
> >  >
> > 
> > Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though
> > I am not completely convinced that we can do that without much changes
> > to CREATE INDEX CONCURRENTLY logic. For example, I believe we still
> > need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY.
> > Otherwise we might end up creating two paths to the same tuple in
> > the new index.
> > 
> > Say, we have a table with two columns (int a, int b). We have an
> > index on 'a' and building another index on 'b'. We got a tuple
> > (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY,
> > this tuple would be indexed. If the tuple is HOT-updated to (10, 30)
> > before the first phase ends, the updated tuple would again get
> > indexed in the second phase. This would lead to two paths to the
> > latest visible tuple from the new index.
> > 
> > Am I missing something in your design that stops this from
> > happening ?
> 
> This problem is solved by moving the wait (for all transactions in
> reference snapshot to finish) so that it is now between the first and
> second scans, as described.
> 
> During the second scan we would prune each block, so the only remaining
> tuple in the block when the second scan sees it would be (10,30) and it
> would no longer be a HOT tuple - the index would have a pointer to it,
> so no new index pointer would be added. The pointer to (10,30) is the
> same pointer that was added in the first phase for the tuple (10,20).
> 
> The wait and subsequent prune ensures that all HOT tuples are now the
> root of their HOT chain. The index created in the fist phase ensures
> that the HOT chains are never added to.

AFAICS this is all you need to make CREATE INDEX CONCURRENTLY work with
HOT, which is even simpler than my original post. [This presumes that we
do pruning automatically on a heap scan, not sure what the current state
of that is, but it could be a scan option].

Index: src/backend/commands/indexcmds.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/commands/indexcmds.c,v
retrieving revision 1.157
diff -c -r1.157 indexcmds.c
*** src/backend/commands/indexcmds.c13 Mar 2007 00:33:39 -
1.157
--- src/backend/commands/indexcmds.c19 Mar 2007 09:49:56 -
***
*** 497,507 
ActiveSnapshot = snapshot;

/*
-* Scan the index and the heap, insert any missing index
entries.
-*/
-   validate_index(relationId, indexRelationId, snapshot);
-
-   /*
 * The index is now valid in the sense that it contains all
currently
 * interesting tuples.  But since it might not contain tuples
deleted just
 * before the reference snap was taken, we have to wait out any
--- 497,502 
***
*** 514,519 
--- 509,519 
for (ixcnt = 0; ixcnt < snapshot->xcnt; ixcnt++)
XactLockTableWait(snapshot->xip[ixcnt]);

+   /*
+* Scan the index and the heap, insert any missing index
entries.
+*/
+   validate_index(relationId, indexRelationId, snapshot);
+
/* Index can now be marked valid -- update its pg_index entry */
pg_index = heap_open(IndexRelationId, RowExclusiveLock);



-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)

2007-03-19 Thread Simon Riggs
On Mon, 2007-03-19 at 14:33 +0530, Pavan Deolasee wrote:
> Simon Riggs wrote:
>  >
>  >
>  > We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is
>  > I think we can without significant difficulty.
>  >
> 
> Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though
> I am not completely convinced that we can do that without much changes
> to CREATE INDEX CONCURRENTLY logic. For example, I believe we still
> need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY.
> Otherwise we might end up creating two paths to the same tuple in
> the new index.
> 
> Say, we have a table with two columns (int a, int b). We have an
> index on 'a' and building another index on 'b'. We got a tuple
> (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY,
> this tuple would be indexed. If the tuple is HOT-updated to (10, 30)
> before the first phase ends, the updated tuple would again get
> indexed in the second phase. This would lead to two paths to the
> latest visible tuple from the new index.
> 
> Am I missing something in your design that stops this from
> happening ?

This problem is solved by moving the wait (for all transactions in
reference snapshot to finish) so that it is now between the first and
second scans, as described.

During the second scan we would prune each block, so the only remaining
tuple in the block when the second scan sees it would be (10,30) and it
would no longer be a HOT tuple - the index would have a pointer to it,
so no new index pointer would be added. The pointer to (10,30) is the
same pointer that was added in the first phase for the tuple (10,20).

The wait and subsequent prune ensures that all HOT tuples are now the
root of their HOT chain. The index created in the fist phase ensures
that the HOT chains are never added to.


-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)

2007-03-17 Thread Simon Riggs
On Sun, 2007-03-18 at 00:44 +0530, Pavan Deolasee wrote:
> Simon Riggs wrote:
>  >
>  > We need to be clear that we already have a solution to CREATE INDEX
>  > CONCURRENTLY. Do you agree that we do? Does anyone see a problem with
>  > the posted design for that?

If we have solved CREATE INDEX CONCURRENTLY, then I would propose that
this becomes the default option for creating an index, when the
statement is issued outside of a statement block.

That seems better than reminding everybody to run with the CONCURRENTLY
option, or advise them of different performance characteristics or
behaviour of the normal CREATE INDEX route.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly