Re: [PATCHES] Relation forks & FSM rewrite patches

Tom Lane Sat, 12 Jul 2008 11:28:06 -0700

"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Here's an updated version of the "relation forks" patch, and an 
> incremental FSM rewrite patch on top of that. The relation forks patch 
> is ready for review. The FSM implementation is more work-in-progress 
> still, but I'd like to get some review on that as well, with the goal of 
> doing more performance testing and committing it after the commit fest.


I looked through the "relation forks" patch, and think it's close to
committable, but I have a few minor gripes.

> The one part that I'm not totally satisfied in the relation forks patch 
> is the smgrcreate() function. The question problem is: which piece of 
> code decides which forks to create for a relation, and when to create 
> them? I settled on the concept that all forks that a relation will need 
> are created at once, in one smgrcreate() call. There's no function to 
> create additional forks later on.

I think that's an extremely bad idea.  You need look no further than
Zdenek's hopes of in-place-upgrade to see a need for adding a fork
to an existing relation; but even without that I can see possibly
wanting to not create a fork right away.  I think smgrcreate() should
just create one fork as it's told (ie, same API you gave mdcreate).

> Currently, heap_create() decides which forks to create. That's fine at 
> the moment, but will become problematic in the future, as it's 
> indexam-specific which forks an index requires. That decision should 
> really be done in the indexam. One possibility would be to only create 
> the main fork in heap_create(), and let indexam create any additional 
> forks it needs in ambuild function.

Problem goes away if you redo smgrcreate API as above.

Other nits:

relpath() is now in danger of buffer overrun: you need to increase
the palloc() sizes.  Seems like a #define for the max number of digits
in a ForkNumber wouldn't be out of place (though I sure hope it never
gets past 1 ...).

Also, I strongly suggest *not* appending _0 to the name of the main fork's
file.  This'd break on-disk compatibility and people's expectations,
for no particular reason.

Don't like the name NUM_FORKS; it seems to suggest that's the actual
number of forks in existence.  MAX_NUM_FORKS would be better.

I think that setting InvalidForkNumber to -1 is unportable: there is
nothing compelling the compiler to represent enum ForkNumber as a signed
type, so the places where you assume a ForkNumber variable can hold
InvalidForkNumber all look like trouble.  One solution is to include
InvalidForkNumber in the enum:

        enum ForkNumber {
                InvalidForkNumber = -1,
                MAIN_FORKNUM = 0
        };

This would be a bit annoying if someone wrote a switch statement listing
different possible fork numbers, as the compiler would complain if
there's no case for InvalidForkNumber; but I can't see a reason for code
like that to be common.

Another possibility is to set InvalidForkNumber = 0, MAIN_FORKNUM = 1;
which is still a bit of a type cheat if InvalidForkNumber isn't listed
in the enum, but one that's very unlikely to cause trouble.  A bigger
problem is it would mess up a lot of your loops.

BTW, don't put a comma at end of enum ForkNumber, I think some compilers
will barf on that.

Last, don't forget to update the sgml docs chapter on database physical
storage.

                        regards, tom lane

-- 
Sent via pgsql-patches mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [PATCHES] Relation forks & FSM rewrite patches

Reply via email to