Re: [HACKERS] Bootstrap DATA is a pita

Tom Lane Sat, 12 Dec 2015 10:29:27 -0800

Mark Dilger <[email protected]> writes:
>> On Dec 11, 2015, at 2:54 PM, Caleb Welton <[email protected]> wrote:
>> Compare:
>> CREATE FUNCTION lo_export(oid, text) RETURNS integer LANGUAGE internal 
>> STRICT AS 'lo_export' WITH (OID=765);  
>> 
>> DATA(insert OID = 765 (  lo_export              PGNSP PGUID 12 1 0 0 0 f f f 
>> f t f v u 2 0 23 "26 25" _null_ _null_ _null_ _null_ _null_ lo_export _null_ 
>> _null_ _null_ ));


> I would like to hear more about this idea.  Are you proposing that we use 
> something
> like the above CREATE FUNCTION format to express what is currently being 
> expressed
> with DATA statements?

Yes, that sort of idea has been kicked around some already, see the
archives.

> That is an interesting idea, though I don't know what exactly
> that would look like.  If you want to forward this idea, I'd be eager to hear 
> your thoughts.
> If not, I'll try to make progress with my idea of tab delimited files and 
> such (or really,
> Alvaro's idea of csv files that I only slightly corrupted).

Personally I would like to see both approaches explored.  Installing as
much as we can via SQL commands is attractive for a number of reasons;
but there is going to be an irreducible minimum amount of stuff that
has to be inserted by something close to the current bootstrapping
process.  (And I'm not convinced that that "minimum amount" is going
to be very small...)  So it's not impossible that we'd end up accepting
*both* types of patches, one to do more in the post-bootstrap SQL world
and one to make the bootstrap data notation less cumbersome.  In any
case it would be useful to push both approaches forward some more before
we make any decisions between them.

BTW, there's another thing I'd like to see improved in this area, which is
a problem already but will get a lot worse if we push more work into the
post-bootstrap phase of initdb.  That is that the post-bootstrap phase is
both inefficient and impossible to debug.  If you've ever had a failure
there, you'll have seen that the backend spits out an entire SQL script
and says there's an error in it somewhere; that's because it gets the
whole per-stage script as one submission.  (Try introducing a syntax error
somewhere in information_schema.sql, and you'll see what I mean.)
Breaking the stage scripts down further would help, but that is
unattractive because each one requires a fresh backend startup/shutdown,
including a full checkpoint.  I'd like to see things rejiggered so that
there's only one post-bootstrap standalone backend session that performs
all the steps, but initdb feeds it just one SQL command at a time so that
errors are better localized.  That should both speed up initdb noticeably
and make debugging easier.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bootstrap DATA is a pita

Reply via email to