Re: [PATCHES] GiST: memory allocation, cleanup

2004-11-07 Thread Neil Conway
Tom Lane wrote:
I think you can assume that noplace else in the backend will invoke the
sk_func, period.
Great, thanks for the info.
- mark the array that indicates NULLs that is passed to
index_formtuple() as 'const', and fix the resulting fallout
I'm a bit dubious about this, mainly because you did not likewise
const-ify the other input arguments; it seems confusing to do a partial
const-ification.
Well, partial const-ification is the rule rather than the exception in 
the backend right now. I'll take a look at adding more const qualifiers, 
but I don't really see why partial const-ification is confusing.

The only thing I don't like about this is that it's not apparent that
the function would close the heap  index relations, so the calling code
will now look like it's leaking the relation references.  Maybe call it
IndexCloseAndUpdateStats or something like that?
Good point -- I'll make that change.
-Neil
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PATCHES] GiST: memory allocation, cleanup

2004-11-07 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm a bit dubious about this, mainly because you did not likewise
 const-ify the other input arguments; it seems confusing to do a partial
 const-ification.

 Well, partial const-ification is the rule rather than the exception in 
 the backend right now. I'll take a look at adding more const qualifiers, 
 but I don't really see why partial const-ification is confusing.

In this particular case I think it's confusing because the Datum and
nulls arrays are really two halves of a single data structure.
Const-ifying just one of them obscures that fact.  I'd be happy if you
marked both of them const.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] GiST: memory allocation, cleanup

2004-11-05 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Attached is a patch that makes some cleanups and improvements to GiST,
 as well as a few semi-related cleanups. Changes:
 ...

 QUESTION: given a ScanKey for an index scan, GiST overwrites the
 ScanKey's sk_func to point to the user-defined Consistent method
 (gistrescan() in gistscan.c). When doing a GiST index scan we can ensure
 that the sk_func is invoked in a short-lived memory context
 (gistfindnext() in gistget.c). Is it safe to assume that anywhere else
 in the backend that invokes the ScanKey's sk_func, it is done in a
 short-lived memory context?

I think you can assume that noplace else in the backend will invoke the
sk_func, period.  If it did, it'd be passing the wrong parameters ---
the Consistent method doesn't take the same args as a plain indexable
operator would.

 - mark the array that indicates NULLs that is passed to
 index_formtuple() as 'const', and fix the resulting fallout

I'm a bit dubious about this, mainly because you did not likewise
const-ify the other input arguments; it seems confusing to do a partial
const-ification.

 - after doing an index build, there was previously a long comment and a
 few lines of code that used the # of index tuples and heap tuples
 (computed by the index build) to update the optimizer's statistics. This
 code was duplicated 4 times (once for each type of index); I added a
 function called IndexUpdateStats() in catalog/index.c and changed each
 index's build function to call that function.

The only thing I don't like about this is that it's not apparent that
the function would close the heap  index relations, so the calling code
will now look like it's leaking the relation references.  Maybe call it
IndexCloseAndUpdateStats or something like that?

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] GiST: memory allocation, cleanup

2004-11-05 Thread Neil Conway
Alvaro Herrera wrote:
What are you going to base your concurrency work on?  I wonder because I
skimmed through Marcel Kornacker's theses, and for example it mentioned
use of predicate locking (among other interesting things).  Do you have
any ideas yet on how to handle that?
I've read a paper[1] on concurrency and recovery by Kornacker, C. Mohan 
and Hellerstein, which is presumably pretty close to Kornacker's thesis 
work (which I'll definitely read later). My understanding is that 
there's no need to implement predicate locking for GiST any more than 
for B+-trees -- it's a nice feature to prevent phantoms, but not a 
requirement for implementing page-level concurrency. So I haven't given 
any thought to implementing it. If I've misunderstood, please let me know.

-Neil
[1] http://db.cs.berkeley.edu/papers/sigmod97-gist.pdf
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PATCHES] GiST: memory allocation, cleanup

2004-11-04 Thread Neil Conway
Attached is a patch that makes some cleanups and improvements to GiST,
as well as a few semi-related cleanups. Changes:

- previously, GiST did not make any use of the backend's memory context
infrastructure. This made implementing indexes using GiST unnecessarily
difficult and fragile: if your implementation of a GiST method leaked
memory, this could result in a potentially large memory leak in
PostgreSQL itself. This patch changes GiST so that all invocations of
used-defined GiST methods are done within a short-lived memory context.
In practice, this means GiST implementors need not use pfree, unless
they are doing a _lot_ of allocations in their method implementations.

As part of this work, memory contexts were also used to cleanup a lot of
ugly and difficult to maintain memory management code in gist.c itself.

QUESTION: given a ScanKey for an index scan, GiST overwrites the
ScanKey's sk_func to point to the user-defined Consistent method
(gistrescan() in gistscan.c). When doing a GiST index scan we can ensure
that the sk_func is invoked in a short-lived memory context
(gistfindnext() in gistget.c). Is it safe to assume that anywhere else
in the backend that invokes the ScanKey's sk_func, it is done in a
short-lived memory context?

- previously, src/include/access/gist.h included both the public API
that is intended for use by GiST implementors as well as declarations
for the GiST implementation itself. I've created a new header file
gist_private.h and moved the latter declarations there.

- remove some extern keywords from function definitions in
contrib/btree_gist, and do some more minor code cleanup there

- minor GiST documentation cleanup

- mark the array that indicates NULLs that is passed to
index_formtuple() as 'const', and fix the resulting fallout

- after doing an index build, there was previously a long comment and a
few lines of code that used the # of index tuples and heap tuples
(computed by the index build) to update the optimizer's statistics. This
code was duplicated 4 times (once for each type of index); I added a
function called IndexUpdateStats() in catalog/index.c and changed each
index's build function to call that function.

- in GiST, move the definition of a bunch of scope-local variables to
the most tightly-enclosing scope. In other words, code like:

{
int a, b;

a = ...;
if (a)
{
b = ...;
}
}

was changed to move the definition of b into the inner scope. This is
per good style, IMHO (and it is consistent with the rest of the
PostgreSQL source).

- moved gistfreestack() to gistscan.c and made it static, since it is
only used there

- cleaned up various other things in gist.c: added some comments,
basically tried to impose some sanity

Comments welcome. I would like to apply this to 8.1 at some point after
we branch for 8.0. I thought I would post this patch to get some
feedback before starting on further GiST work (speculatively, adding
support for concurrency and making it WAL-safe).

-Neil



gist_work-1.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match