Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-12-09 Thread Peter Eisentraut
On 11/13/16 12:19 PM, Tom Lane wrote:
>> It'd also be very pg_proc specific, which isn't where I think this
>> should go..
> 
> The presumption is that we have a CREATE command for every type of
> object that we need to put into the system catalogs.  But yes, the
> other problem with this approach is that you need to do a lot more
> work per-catalog to build the converter script.  I'm not sure how
> much of that could be imported from gram.y, but I'm afraid the
> answer would be "not enough".

I'd think about converting about 75% of what is currently in the catalog
headers into some sort of built-in extension that is loaded via an SQL
script.  There are surely some details about that that would need to be
worked out, but I think that's a more sensible direction than inventing
another custom format.

I wonder how big the essential bootstrap set of pg_proc.h would be and
how manageable the file would be if it were to be reduced like that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-15 Thread Robert Haas
On Sun, Nov 13, 2016 at 9:48 AM, Andrew Dunstan  wrote:
> I'm not convinced the line prefix part is necessary, though. What I'm
> thinking of is something like this:
>
> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
> rettype=bool argtypes="cstring" src=boolin );

I liked Tom's format a lot better.  If we put this in a separate file
rather than in the header, which I favor, the PROCDATA stuff is just
noise.  On the other hand, having the name as the first thing on the
line seems *excellent* for readability.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-14 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/13/2016 11:11 AM, Tom Lane wrote:
>> 1. Are we going to try to keep these things in the .h files, or split
>> them out?  I'd like to get them out, as that eliminates both the need
>> to keep the things looking like macro calls, and the need for the data
>> within the macro call to be at least minimally parsable as C.

> That would work fine for pg_proc.h, less so for pg_type.h where we have 
> a whole lot of
> #define FOOOID nn
> directives in among the data lines. Moving these somewhere remote from 
> the catalog lines they relate to seems like quite a bad idea.

We certainly don't want multiple files to be sources of truth for that.
What I was anticipating is that those #define's would also be generated
from the same input files, much as fmgroids.h is handled today.  We
could imagine driving the creation of a macro off an additional, optional
field in the data entries, say "macro=FOOOID", if we want only selected
entries to have #defines.  Or we could do like we do with pg_proc.h and
generate macros for everything according to some fixed naming rule.
I could see approaching pg_type that way, but am less excited about
pg_operator, pg_opclass, etc, where we only need macros for a small
fraction of the entries.
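
A minimal sketch of the optional "macro=FOOOID" idea, written in Python purely for illustration (the build tooling discussed in this thread is Perl). It assumes each data entry has already been parsed into a dict; the function name `oid_macros` and the third sample entry are hypothetical.

```python
def oid_macros(entries):
    """Return '#define NAME oid' lines for entries that carry a 'macro' field."""
    lines = []
    for entry in entries:
        macro = entry.get("macro")
        if macro:  # entries without the optional field get no #define
            lines.append(f"#define {macro} {entry['oid']}")
    return lines


entries = [
    {"oid": 16, "typname": "bool", "macro": "BOOLOID"},
    {"oid": 25, "typname": "text", "macro": "TEXTOID"},
    {"oid": 9999, "typname": "example_without_macro"},  # hypothetical entry
]

for line in oid_macros(entries):
    print(line)
```

Generating macros for every entry by a fixed naming rule, as is done for fmgroids.h, would replace the `macro` lookup with a rule applied to the entry's name.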

>> 2. Andrew's example above implies some sort of mapping between the
>> keywords and the actual column names (or at least column positions).
>> Where and how is that specified?

> There are several possibilities. The one I was leaning towards was to 
> parse out the Anum_pg_foo_* definitions.

I'm okay with that if the field labels used in the data entries are to be
exactly the same as the column names.  Your example showed abbreviated
names (omitting "pro"), which is something I doubt we want to try to
hard-wire a rule for.  Also, if we are going to abbreviate at all,
I think it might be useful to abbreviate *a lot*, say like "v" for
"provolatile", and that would be something that ought to be set up with
some explicit manually-provided declarations.

>> 3. Also where are we going to provide the per-column default values?
>> How does the converter script know which columns to convert to type oids,
>> proc oids, etc?  Is it going to do any data validation beyond that, and
>> if so on what basis?

> a) something like DATA_DEFAULTS( foo=bar );
> b) something like DATA_TYPECONV ( rettype argtypes allargtypes );

I'm thinking a bit about per-column declarations in the input file,
along the line of this for provolatile:

declare v col=15 type=char default='v'

Some of those items could be gotten out of pg_proc.h, but not all.
I guess a second alternative would be to add the missing info to
pg_proc.h and have the conversion script parse it out of there.
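
A rough sketch of how such per-column declarations might be consumed, in Python for illustration only. The `declare` line is the one shown above; the function names and the idea of filling omitted fields from the declared default are assumptions, not an agreed design.

```python
import shlex


def parse_declarations(text):
    """Map each abbreviated label to its column number, type, and default."""
    decls = {}
    for line in text.splitlines():
        words = shlex.split(line)  # also strips the quotes around 'v'
        if not words or words[0] != "declare":
            continue
        label = words[1]
        attrs = dict(word.split("=", 1) for word in words[2:])
        attrs["col"] = int(attrs["col"])
        decls[label] = attrs
    return decls


def apply_defaults(entry, decls):
    """Fill in every declared field that the entry leaves out."""
    full = {label: d["default"] for label, d in decls.items() if "default" in d}
    full.update(entry)  # explicit values win over defaults
    return full


decls = parse_declarations("declare v col=15 type=char default='v'")
print(apply_defaults({"name": "boolin", "v": "i"}, decls))
print(apply_defaults({"name": "some_volatile_func"}, decls))  # hypothetical entry
```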

>> I think we want to do them all.  pg_proc.h is actually one of the easier
>> catalogs to work on presently, IMO, because the only kind of
>> cross-references it has are type OIDs.  Things like pg_amop are a mess.
>> And I really don't want to be dealing with multiple notations for catalog
>> data.  Also I think this will be subject to Polya's paradox: designing a
>> general solution will be easier and cleaner than a hack that works only
>> for one catalog.

> I don't know that we need to handle everything at once, as long as the 
> solution is sufficiently general.

Well, we could convert the catalogs one at a time if that seems useful,
but I don't want to be rewriting the bki-generation script repeatedly.

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-14 Thread Andrew Dunstan



On 11/13/2016 11:11 AM, Tom Lane wrote:
> Andrew Dunstan writes:
>> I'm not convinced the line prefix part is necessary, though. What I'm
>> thinking of is something like this:
>> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>>   rettype=bool argtypes="cstring" src=boolin );
>
> We could go in that direction too, but the apparent flexibility to split
> entries into multiple lines is an illusion, at least if you try to go
> beyond a few lines; you'd end up with duplicated line sequences in
> different entries and thus ambiguity for patch(1).  I don't have any
> big objection to the above, but it's not obviously better either.

Yeah, I looked and there are too many cases where the name would be
outside the normal 3 lines of context.

> Some things we should try to resolve before settling definitively on
> a data representation:
>
> 1. Are we going to try to keep these things in the .h files, or split
> them out?  I'd like to get them out, as that eliminates both the need
> to keep the things looking like macro calls, and the need for the data
> within the macro call to be at least minimally parsable as C.

That would work fine for pg_proc.h, less so for pg_type.h where we have
a whole lot of

   #define FOOOID nn

directives in among the data lines. Moving these somewhere remote from
the catalog lines they relate to seems like quite a bad idea.

> 2. Andrew's example above implies some sort of mapping between the
> keywords and the actual column names (or at least column positions).
> Where and how is that specified?

There are several possibilities. The one I was leaning towards was to
parse out the Anum_pg_foo_* definitions.

> 3. Also where are we going to provide the per-column default values?
> How does the converter script know which columns to convert to type oids,
> proc oids, etc?  Is it going to do any data validation beyond that, and
> if so on what basis?

a) something like DATA_DEFAULTS( foo=bar );
b) something like DATA_TYPECONV ( rettype argtypes allargtypes );

Hadn't thought about procoids, but something similar.

> 4. What will we do about the #define's that some of the .h files provide
> for (some of) their object OIDs?  I assume that we want to move in the
> direction of autogenerating those macros a la fmgroids.h, but this needs
> a concrete spec as well.  If we don't want this change to result in a big
> hit to the source code, we're probably going to need to be able to specify
> the macro names to generate in the data files.

Yeah, as I noted above it's a bit messy.

> 5. One of the requirements that was mentioned in previous discussions
> was to make it easier to add new columns to catalogs.  This format
> does that only to the extent that you don't have to touch entries that
> can use the default value for such a column.  Is that good enough, and
> if not, what might we be able to do to make it better?

I think it is good enough, at least for a first cut.

>> I'd actually like to roll up the DESCR lines in pg_proc.h into this too,
>> they strike me as a bit of a wart. But I'm flexible on that.
>
> +1, if we can come up with a better syntax.  This together with the
> OID-macro issue suggests that there will be items in each data entry that
> correspond to something other than columns of the target catalog.  But
> that seems fine.
>
>> If we can generalize this to other catalogs, then that will be good, but
>> my inclination is to handle the elephant in the room (pg_proc.h) and
>> worry about the gnats later.
>
> I think we want to do them all.  pg_proc.h is actually one of the easier
> catalogs to work on presently, IMO, because the only kind of
> cross-references it has are type OIDs.  Things like pg_amop are a mess.
> And I really don't want to be dealing with multiple notations for catalog
> data.  Also I think this will be subject to Polya's paradox: designing a
> general solution will be easier and cleaner than a hack that works only
> for one catalog.

I don't know that we need to handle everything at once, as long as the
solution is sufficiently general.




cheers

andrew




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
>> We can't use CREATE FUNCTION as the representation in the .bki file,
>> because of the circularities involved (you can't fill pg_proc before
>> pg_type nor vice versa).  But I think Peter was suggesting that the
>> input to the bki-generator script could look like CREATE commands.
>> That's true, but I fear it would greatly increase the complexity
>> of the script for not much benefit.

> It'd also be very pg_proc specific, which isn't where I think this
> should go..

The presumption is that we have a CREATE command for every type of
object that we need to put into the system catalogs.  But yes, the
other problem with this approach is that you need to do a lot more
work per-catalog to build the converter script.  I'm not sure how
much of that could be imported from gram.y, but I'm afraid the
answer would be "not enough".

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> >> Then we're not very far away from just using CREATE FUNCTION SQL commands.
>
> > Well, those do a lot of syscache lookups, which in turn do lookups for
> > functions...
>
> We can't use CREATE FUNCTION as the representation in the .bki file,
> because of the circularities involved (you can't fill pg_proc before
> pg_type nor vice versa).  But I think Peter was suggesting that the
> input to the bki-generator script could look like CREATE commands.
> That's true, but I fear it would greatly increase the complexity
> of the script for not much benefit.

It'd also be very pg_proc specific, which isn't where I think this
should go..

Greetings,

Andres Freund




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 11:11:37 -0500, Tom Lane wrote:
> 1. Are we going to try to keep these things in the .h files, or split
> them out?  I'd like to get them out, as that eliminates both the need
> to keep the things looking like macro calls, and the need for the data
> within the macro call to be at least minimally parsable as C.

I vote for splitting them out.


> 2. Andrew's example above implies some sort of mapping between the
> keywords and the actual column names (or at least column positions).
> Where and how is that specified?

I don't know what Andrew was planning, but before I stopped I had a 1:1
mapping between column names and keywords. Catalog.pm parses the
pg_*.h headers and thus knows the table definition via the CATALOG()
stuff.


> 3. Also where are we going to provide the per-column default values?

That's a good question, I suspect we should move that knowledge to the
headers as well. Possibly using something like BKI_DEFAULT(...)?


> How does the converter script know which columns to convert to type oids,
> proc oids, etc?

I simply had that based on the underlying reg* type. I.e. if a column
was regtype the script would map it to type oids and so on.  That
required some type changes, which does have some compatibility concerns.
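
A small sketch of that reg*-driven conversion, in Python for illustration only. The lookup tables are hand-written stand-ins for what a real script would collect from the catalog data, and `column_types` is a hypothetical description of a catalog in which prorettype has been changed to regtype as described above.

```python
# Stand-in lookup tables, keyed by the reg* type of the column.
LOOKUPS = {
    "regtype": {"bool": 16, "text": 25, "cstring": 2275},
    "regproc": {"boolin": 1242},
}


def resolve_row(row, column_types):
    """Replace symbolic names with OIDs in columns declared with a reg* type."""
    resolved = {}
    for column, value in row.items():
        lookup = LOOKUPS.get(column_types.get(column))
        # Columns of any other type are passed through unchanged.
        resolved[column] = lookup[value] if lookup is not None else value
    return resolved


column_types = {"proname": "name", "prorettype": "regtype", "prosrc": "text"}
row = {"proname": "boolin", "prorettype": "bool", "prosrc": "boolin"}
print(resolve_row(row, column_types))
```

The appeal of this design is that the script needs no per-catalog list of columns to convert; the column's declared type alone decides which lookup applies.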


> Is it going to do any data validation beyond that, and if so on what basis?

Hm, not sure if we really need something.


> 4. What will we do about the #define's that some of the .h files provide
> for (some of) their object OIDs?  I assume that we want to move in the
> direction of autogenerating those macros a la fmgroids.h, but this needs
> a concrete spec as well.

I suspect at least type oids we'll continue to have to maintain
manually. A good number of things rely on the builtin type oids being
essentially stable.


> > If we can generalize this to other catalogs, then that will be good, but 
> > my inclination is to handle the elephant in the room (pg_proc.h) and 
> > worry about the gnats later.
> 
> I think we want to do them all.

+1


Greetings,

Andres Freund




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
>> Then we're not very far away from just using CREATE FUNCTION SQL commands.

> Well, those do a lot of syscache lookups, which in turn do lookups for
> functions...

We can't use CREATE FUNCTION as the representation in the .bki file,
because of the circularities involved (you can't fill pg_proc before
pg_type nor vice versa).  But I think Peter was suggesting that the
input to the bki-generator script could look like CREATE commands.
That's true, but I fear it would greatly increase the complexity
of the script for not much benefit.  It does little for the question of
"how do you update the data when adding a new pg_proc column", for
instance.  And you'd still need some non-SQL warts, like how to specify
manually-assigned OIDs for types and functions.  (I'm not sure whether
we could get away with dropping fixed assignments of function OIDs,
but we absolutely can't do so for types.  Lots of client code knows
that text is oid 25, for example.)

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Tom Lane
Andrew Dunstan  writes:
> I'm not convinced the line prefix part is necessary, though. What I'm 
> thinking of is something like this:

> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>  rettype=bool argtypes="cstring" src=boolin );

We could go in that direction too, but the apparent flexibility to split
entries into multiple lines is an illusion, at least if you try to go
beyond a few lines; you'd end up with duplicated line sequences in
different entries and thus ambiguity for patch(1).  I don't have any
big objection to the above, but it's not obviously better either.
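
The ambiguity concern can be checked mechanically. The sketch below, in Python for illustration only, reports every run of three consecutive lines that occurs more than once in a data file. It is a simplified proxy for what patch(1) does with hunk context, and the two multi-line entries are hypothetical.

```python
from collections import defaultdict


def repeated_windows(lines, size=3):
    """Return each run of `size` consecutive lines that occurs more than once."""
    seen = defaultdict(list)
    for start in range(len(lines) - size + 1):
        seen[tuple(lines[start:start + size])].append(start)
    return {window: starts for window, starts in seen.items() if len(starts) > 1}


# Hypothetical entries split over several lines, sharing boilerplate fields.
data = [
    "PROCDATA( oid=1 name=f",
    "  isstrict=t",
    "  volatile=i",
    "  parallel=s );",
    "PROCDATA( oid=2 name=g",
    "  isstrict=t",
    "  volatile=i",
    "  parallel=s );",
]

for window, starts in repeated_windows(data).items():
    print(starts, window)
```

Any window reported here is a place where three lines of context alone would not say which entry a change belongs to.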

Some things we should try to resolve before settling definitively on
a data representation:

1. Are we going to try to keep these things in the .h files, or split
them out?  I'd like to get them out, as that eliminates both the need
to keep the things looking like macro calls, and the need for the data
within the macro call to be at least minimally parsable as C.

2. Andrew's example above implies some sort of mapping between the
keywords and the actual column names (or at least column positions).
Where and how is that specified?

3. Also where are we going to provide the per-column default values?
How does the converter script know which columns to convert to type oids,
proc oids, etc?  Is it going to do any data validation beyond that, and
if so on what basis?

4. What will we do about the #define's that some of the .h files provide
for (some of) their object OIDs?  I assume that we want to move in the
direction of autogenerating those macros a la fmgroids.h, but this needs
a concrete spec as well.  If we don't want this change to result in a big
hit to the source code, we're probably going to need to be able to specify
the macro names to generate in the data files.

5. One of the requirements that was mentioned in previous discussions
was to make it easier to add new columns to catalogs.  This format
does that only to the extent that you don't have to touch entries that
can use the default value for such a column.  Is that good enough, and
if not, what might we be able to do to make it better?


> I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
> they strike me as a bit of a wart. But I'm flexible on that.

+1, if we can come up with a better syntax.  This together with the
OID-macro issue suggests that there will be items in each data entry that
correspond to something other than columns of the target catalog.  But
that seems fine.

> If we can generalize this to other catalogs, then that will be good, but 
> my inclination is to handle the elephant in the room (pg_proc.h) and 
> worry about the gnats later.

I think we want to do them all.  pg_proc.h is actually one of the easier
catalogs to work on presently, IMO, because the only kind of
cross-references it has are type OIDs.  Things like pg_amop are a mess.
And I really don't want to be dealing with multiple notations for catalog
data.  Also I think this will be subject to Polya's paradox: designing a
general solution will be easier and cleaner than a hack that works only
for one catalog.

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andrew Dunstan



On 11/13/2016 04:54 AM, Andres Freund wrote:
> On 2016-11-12 12:30:45 -0500, Andrew Dunstan wrote:
>> On 11/11/2016 11:10 AM, Tom Lane wrote:
>>> boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
>>> boolin: prosrc=boolin provolatile=i proparallel=s
>>
>> I have written a little perl script to turn the pg_proc DATA lines into
>> something like the format suggested. In order to keep the space used as
>> small as possible, I used a prefix based on the OID. See attached result.
>>
>> Still plenty of work to go, e.g. grabbing the DESCR lines, and turning this
>> all back into DATA/DESCR lines, but I wanted to get this out there before
>> going much further.
>>
>> The defaults I used are below (commented out keys are not defaulted, they
>> are just there for completeness).
>
> In the referenced thread I'd started to work on something like this,
> until other people also said they'd be working on it.  I chose a
> different output format (plain Data::Dumper), but I'd added the parsing
> of DATA/DESCR and such to genbki.
>
> Note that I found that initdb performance is greatly increased *and*
> legibility is improved, if types and such in the data files are
> expanded, and converted to their oids when creating postgres.bki.

Yeah, I have the type name piece, it was close to trivial. I just read
in pg_type.h and stored the names/oids in a hash.

Data::Dumper is too wasteful of space. The thing I like about Tom's
format is that it's nice and concise.


I'm not convinced the line prefix part is necessary, though. What I'm 
thinking of is something like this:


PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
rettype=bool argtypes="cstring" src=boolin );

Teaching Catalog.pm how to parse that and turn the type names back into 
oids won't be difficult. I already have code for the prefix version, and 
this would be easier since there is an end marker.
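
A minimal sketch of parsing that form and turning type names back into OIDs, written in Python for illustration rather than as the Catalog.pm change itself. `TYPE_OIDS` is a hand-written stand-in for the name/OID hash read from pg_type.h, and `TYPE_KEYS` is an assumed list of fields that hold type names.

```python
import re
import shlex

# Stand-in for the name -> OID hash built from pg_type.h.
TYPE_OIDS = {"bool": 16, "text": 25, "cstring": 2275}

# Fields whose values are type names (assumed for this sketch).
TYPE_KEYS = {"rettype", "argtypes"}

# The closing ");" is the end marker, so an entry may span several lines.
ENTRY_RE = re.compile(r"PROCDATA\((.*?)\);", re.DOTALL)


def parse_procdata(text):
    """Return one dict per PROCDATA(...) entry, with type names as OIDs."""
    entries = []
    for body in ENTRY_RE.findall(text):
        entry = {}
        for token in shlex.split(body):  # honours the quotes around "cstring"
            key, _, value = token.partition("=")
            if key in TYPE_KEYS:
                # argtypes may list several space-separated type names
                value = " ".join(str(TYPE_OIDS[name]) for name in value.split())
            entry[key] = value
        entries.append(entry)
    return entries


sample = """
PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
rettype=bool argtypes="cstring" src=boolin );
"""
print(parse_procdata(sample))
```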


I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
they strike me as a bit of a wart. But I'm flexible on that.


If we can generalize this to other catalogs, then that will be good, but 
my inclination is to handle the elephant in the room (pg_proc.h) and 
worry about the gnats later.




> I basically made genbki/catalog.pm accept text whenever a column is of
> type regtype/regprocedure/. To then make use of that I converted a bunch
> of plain oid columns to their reg* equivalent. That's also nice
> for just plain querying of the catalogs ;)
>
> I don't think the code is going to be much use for you directly, but it
> might be worthwhile to crib some stuff from the 0002 of the attached
> patches (based on 74811c4050921959d54d42e2c15bb79f0e2c37f3).



Thanks, I will take a look.

cheers

andrew






Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-13 Thread Andres Freund
On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> On 11/11/16 11:10 AM, Tom Lane wrote:
> > boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> > boolin: prosrc=boolin provolatile=i proparallel=s
> 
> Then we're not very far away from just using CREATE FUNCTION SQL commands.

Well, those do a lot of syscache lookups, which in turn do lookups for
functions...

Andres




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-12 Thread Peter Eisentraut
On 11/11/16 11:10 AM, Tom Lane wrote:
> boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> boolin: prosrc=boolin provolatile=i proparallel=s

Then we're not very far away from just using CREATE FUNCTION SQL commands.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-11 Thread Tom Lane
Andrew Dunstan  writes:
> +1. If we come up with an agreed format I will undertake to produce the 
> requisite perl script. So let's reopen the debate on the data format. I 
> want something that doesn't consume large numbers of lines per entry. If 
> we remove defaults in most cases we should be able to fit a set of 
> key/value pairs on just a handful of lines.

The other reason for keeping the entries short is to prevent patch
misapplications: you want three or fewer lines of context to be enough
to uniquely identify which line you're changing.  So something with,
say, a bunch of  overhead, with that markup split onto
separate lines, would be a disaster.  This may mean that we can't
get too far away from the DATA-line approach :-(.

Or maybe what we need to do is ensure that there's identification info on
every line, something like (from the first entry in pg_proc.h)

boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s

(I'm imagining the prefix as having no particular semantic significance,
except that identical values on successive lines denote fields for a
single catalog row.)

With this approach, even if you had blocks of boilerplate-y lines
that were the same for many successive functions, the prefixes would
keep them looking unique to "patch".

On the other hand, Andrew might be right that with reasonable defaults
available, the entries would mostly be short enough that there wouldn't
be much of a problem anyway.  This example certainly looks that way.
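
A minimal sketch of reading the prefixed form, in Python for illustration only (the project's tooling is Perl). It assumes each line is "prefix: key=value ...", that values may be double-quoted, and that consecutive lines with the same prefix make up one catalog row; the function name `parse_prefixed` is hypothetical.

```python
import shlex
from itertools import groupby


def parse_prefixed(text):
    """Group consecutive lines sharing a prefix into one dict per catalog row."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        prefix, _, rest = line.partition(":")
        pairs.append((prefix.strip(), rest))

    rows = []
    # groupby only merges *successive* lines, matching the rule described above.
    for _prefix, group in groupby(pairs, key=lambda pair: pair[0]):
        row = {}
        for _, rest in group:
            for token in shlex.split(rest):  # strips quotes around "cstring"
                key, _, value = token.partition("=")
                row[key] = value
        rows.append(row)
    return rows


sample = """
boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s
"""
print(parse_prefixed(sample))
```

The prefix is used only for grouping and is then discarded, in line with it having no semantic significance of its own.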

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-11 Thread Andrew Dunstan



On 11/11/2016 03:03 AM, Magnus Hagander wrote:
> On Nov 11, 2016 00:53, "Jan de Visser" wrote:
>> On 2016-11-09 10:47 AM, Tom Lane wrote:
>>> Amit Langote writes:
>>>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane wrote:
>>>>> Hmm, that's from 2009.  I thought I remembered something much more
>>>>> recent, like last year or so.
>>>>
>>>> This perhaps:
>>>> * Re: Bootstrap DATA is a pita *
>>>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
>>>
>>> Yeah, that's the thread I remembered.  I think the basic conclusion was
>>> that we needed a Perl script that would suck up a bunch of data from some
>>> representation that's more edit-friendly than the DATA lines, expand
>>> symbolic representations (regprocedure etc) into numeric OIDs, and write
>>> out the .bki script from that.  I thought some people had volunteered to
>>> work on that, but we've seen no results ...
>>>
>>> regards, tom lane
>>
>> Would a python script converting something like json or yaml be
>> acceptable? I think right now only perl is used, so it would be a new
>> build chain tool, albeit one that's in my (very humble) opinion much
>> better suited to the task.
>
> Python or perl is not what matters here really. For something as
> simple as this (for the script) it doesn't make a real difference. I
> personally prefer python over perl in most cases, but our standard is
> perl so we should stick to that.
>
> The issue is coming up with a format that people like and think is an
> improvement.
>
> If we have that and a python script for our, someone would surely
> volunteer to convert that part. But we need to start by solving the
> actual problem.






+1. If we come up with an agreed format I will undertake to produce the 
requisite perl script. So let's reopen the debate on the data format. I 
want something that doesn't consume large numbers of lines per entry. If 
we remove defaults in most cases we should be able to fit a set of 
key/value pairs on just a handful of lines.


cheers

andrew





Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-11 Thread Magnus Hagander
On Nov 11, 2016 00:53, "Jan de Visser" wrote:
> On 2016-11-09 10:47 AM, Tom Lane wrote:
>> Amit Langote writes:
>>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane wrote:
>>>> Hmm, that's from 2009.  I thought I remembered something much more
>>>> recent, like last year or so.
>>>
>>> This perhaps:
>>> * Re: Bootstrap DATA is a pita *
>>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
>>
>> Yeah, that's the thread I remembered.  I think the basic conclusion was
>> that we needed a Perl script that would suck up a bunch of data from some
>> representation that's more edit-friendly than the DATA lines, expand
>> symbolic representations (regprocedure etc) into numeric OIDs, and write
>> out the .bki script from that.  I thought some people had volunteered to
>> work on that, but we've seen no results ...
>>
>> regards, tom lane
>
> Would a python script converting something like json or yaml be
> acceptable? I think right now only perl is used, so it would be a new build
> chain tool, albeit one that's in my (very humble) opinion much better
> suited to the task.

Python or perl is not what matters here really. For something as simple as
this (for the script) it doesn't make a real difference. I personally
prefer python over perl in most cases, but our standard is perl so we
should stick to that.

The issue is coming up with a format that people like and think is an
improvement.

If we have that and a python script for our, someone would surely volunteer
to convert that part. But we need to start by solving the actual problem.

/Magnus


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-10 Thread Corey Huinker
On Thu, Nov 10, 2016 at 6:41 PM, Tom Lane  wrote:

> I think you've fundamentally missed the point here.  A data dump from a
> table would be semantically indistinguishable from the lots-o-DATA-lines
> representation we have now.  What we want is something that isn't that.
> In particular I don't see how that would let us have any extra level of
> abstraction that's not present in the finished form of the catalog tables.
>

I was thinking several tables, with the central table having column values
which we find semantically descriptive, and having lookup tables to map
those semantically descriptive keys to the value we actually want in the
pg_proc column. It'd be a tradeoff of macros for entries in lookup tables.


> I'm not very impressed with the suggestion of making a competing product
> part of our build dependencies, either.
>

I don't see the products as competing, nor did the presenter of
https://www.pgcon.org/2014/schedule/events/736.en.html (title: SQLite:
Protégé of PostgreSQL). That talk made the case that SQLite's goal is to be
the foundation of file formats, not an RDBMS. I do understand wanting to
minimize build dependencies.


> If we wanted to get into build
> dependency circularities, we could consider using a PG database in this
> way ... but I prefer to leave such headaches to compiler authors for whom
> it comes with the territory.
>

Agreed, bootstrapping builds aren't fun. This suggestion was a way to have
a self-contained format that uses concepts (joining a central table to
lookup tables) already well understood in our community.


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-10 Thread Jan de Visser

On 2016-11-09 10:47 AM, Tom Lane wrote:
> Amit Langote  writes:
>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane  wrote:
>>> Hmm, that's from 2009.  I thought I remembered something much more recent,
>>> like last year or so.
>>
>> This perhaps:
>> * Re: Bootstrap DATA is a pita *
>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
>
> Yeah, that's the thread I remembered.  I think the basic conclusion was
> that we needed a Perl script that would suck up a bunch of data from some
> representation that's more edit-friendly than the DATA lines, expand
> symbolic representations (regprocedure etc) into numeric OIDs, and write
> out the .bki script from that.  I thought some people had volunteered to
> work on that, but we've seen no results ...
>
> regards, tom lane

Would a Python script converting something like JSON or YAML be
acceptable? I think only Perl is used right now, so it would be a new
build-chain tool, albeit one that is, in my (very humble) opinion, much
better suited to the task.
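
To make the question concrete, here is a rough sketch of what such a
converter could look like. The JSON field names and the output line format
are made up for illustration and are not an agreed design; the small
TYPE_OIDS table stands in for a lookup that would really be built from the
pg_type data.

```python
import json

# Hypothetical edit-friendly input; the field names are illustrative only.
SOURCE = """
[
  {"oid": 1242, "name": "boolin", "rettype": "bool", "argtypes": ["cstring"]}
]
"""

# Tiny stand-in for a lookup built from the pg_type data.
TYPE_OIDS = {"bool": 16, "cstring": 2275}


def to_line(entry):
    """Render one entry as a positional line with type names replaced by OIDs."""
    args = " ".join(str(TYPE_OIDS[t]) for t in entry["argtypes"])
    return 'insert OID = {} ( {} {} "{}" )'.format(
        entry["oid"], entry["name"], TYPE_OIDS[entry["rettype"]], args)


lines = [to_line(e) for e in json.loads(SOURCE)]
```

Only the standard json module is needed here; YAML would add a third-party
dependency on top of Python itself.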






Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-10 Thread Tom Lane
Corey Huinker  writes:
> On Wed, Nov 9, 2016 at 10:47 AM, Tom Lane  wrote:
>> Yeah, that's the thread I remembered.  I think the basic conclusion was
>> that we needed a Perl script that would suck up a bunch of data from some
>> representation that's more edit-friendly than the DATA lines, expand
>> symbolic representations (regprocedure etc) into numeric OIDs, and write
>> out the .bki script from that.  I thought some people had volunteered to
>> work on that, but we've seen no results ...

> If there are no barriers to adding it to our toolchain, could that
> more-edit-friendly representation be a SQLite database?

I think you've fundamentally missed the point here.  A data dump from a
table would be semantically indistinguishable from the lots-o-DATA-lines
representation we have now.  What we want is something that isn't that.
In particular I don't see how that would let us have any extra level of
abstraction that's not present in the finished form of the catalog tables.
(An example that's already there is FLOAT8PASSBYVAL for the value of
typbyval appropriate to float8 and allied types.)
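
A small sketch of that kind of abstraction: the editable form carries a
symbolic token, and the concrete value is only chosen when the output is
generated. The row layout and the function name are hypothetical, and this
does not claim to show where the real substitution happens in the build.

```python
def expand_symbols(row, float8_by_value):
    """Replace the build-dependent placeholder with a concrete 't' or 'f'."""
    concrete = "t" if float8_by_value else "f"
    return " ".join(concrete if tok == "FLOAT8PASSBYVAL" else tok
                    for tok in row.split())


# Hypothetical, heavily abbreviated row: name, length, pass-by-value flag.
SYMBOLIC_ROW = "float8 8 FLOAT8PASSBYVAL"

by_value_row = expand_symbols(SYMBOLIC_ROW, True)
by_reference_row = expand_symbols(SYMBOLIC_ROW, False)
```

A dump of the finished table would contain only the 't' or the 'f', so the
information that the value depends on the build would be lost.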

I'm not very impressed with the suggestion of making a competing product
part of our build dependencies, either.  If we wanted to get into build
dependency circularities, we could consider using a PG database in this
way ... but I prefer to leave such headaches to compiler authors for whom
it comes with the territory.

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-10 Thread Corey Huinker
On Wed, Nov 9, 2016 at 10:47 AM, Tom Lane  wrote:

> Yeah, that's the thread I remembered.  I think the basic conclusion was
> that we needed a Perl script that would suck up a bunch of data from some
> representation that's more edit-friendly than the DATA lines, expand
> symbolic representations (regprocedure etc) into numeric OIDs, and write
> out the .bki script from that.  I thought some people had volunteered to
> work on that, but we've seen no results ...
>

If there are no barriers to adding it to our toolchain, could that
more-edit-friendly representation be a SQLite database?

I'm not suggesting we store a .sqlite file in our repo. I'm suggesting that
we store the dump-restore script in our repo, and the program that
generates the .bki script would query the generated SQLite db.

From that initial dump, any changes to pg_proc.h would be appended to the
dumped script:

...

/* add new frombozulation feature */

ALTER TABLE pg_proc_template ADD frombozulator text;
/* bubbly frombozulation is the default for volatile functions */
UPDATE pg_proc_template SET frombozulator = 'bubbly' WHERE provolatile = 'v';

/* proposed new function (string literals take single quotes in SQL) */

INSERT INTO pg_proc_template(proname,proleakproof) VALUES ('new_func','f');



That'd communicate the meaning of our changes rather nicely. A way to eat
our own conceptual dogfood.
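
Roughly, the generator side could look like the sketch below. It uses
Python's sqlite3 module only as an example driver; the base dump contents
and the build_rows helper are hypothetical, and the appended statements are
the ones from this mail.

```python
import sqlite3

# Hypothetical starting point, standing in for the initial ".dump" output.
BASE_DUMP = """
CREATE TABLE pg_proc_template (proname TEXT, provolatile TEXT, proleakproof TEXT);
INSERT INTO pg_proc_template VALUES ('boolin', 'i', 'f');
INSERT INTO pg_proc_template VALUES ('random', 'v', 'f');
"""

# Changes appended to the script over time.
APPENDED_CHANGES = """
ALTER TABLE pg_proc_template ADD frombozulator text;
UPDATE pg_proc_template SET frombozulator = 'bubbly' WHERE provolatile = 'v';
INSERT INTO pg_proc_template(proname, proleakproof) VALUES ('new_func', 'f');
"""


def build_rows(*scripts):
    """Replay the scripts into a throwaway database and read the result back."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn.execute(
        "SELECT proname, provolatile, frombozulator "
        "FROM pg_proc_template ORDER BY proname"
    ).fetchall()


rows = build_rows(BASE_DUMP, APPENDED_CHANGES)
```

The database only ever exists in memory during the build, so nothing binary
has to be stored in the repo.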

Eventually it'd get cluttered and we'd replace the populate script with a
fresh ".dump". Maybe we do that as often as we reformat our C code.

I think Stephen Frost suggested something like this a while back, but I
couldn't find it after a short search.


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-09 Thread Tom Lane
Amit Langote  writes:
> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane  wrote:
>> Hmm, that's from 2009.  I thought I remembered something much more recent,
>> like last year or so.

> This perhaps:

> * Re: Bootstrap DATA is a pita *
> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com

Yeah, that's the thread I remembered.  I think the basic conclusion was
that we needed a Perl script that would suck up a bunch of data from some
representation that's more edit-friendly than the DATA lines, expand
symbolic representations (regprocedure etc) into numeric OIDs, and write
out the .bki script from that.  I thought some people had volunteered to
work on that, but we've seen no results ...
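
As a rough illustration of the "expand symbolic representations" step,
written in Python rather than Perl only to keep it short: the row
dictionaries, the helper names and the output format are all hypothetical,
and a real tool would have to deal with overloaded function names (which is
what regprocedure, as opposed to regproc, is for).

```python
# Hypothetical symbolic rows: the type entry refers to its I/O functions by
# name rather than by OID.
PROCS = [{"oid": 1242, "proname": "boolin"}, {"oid": 1243, "proname": "boolout"}]
TYPES = [{"oid": 16, "typname": "bool", "typinput": "boolin", "typoutput": "boolout"}]


def expand_regproc(types, procs, columns=("typinput", "typoutput")):
    """Replace function names in the given columns with numeric OIDs."""
    by_name = {p["proname"]: p["oid"] for p in procs}
    expanded = []
    for row in types:
        new_row = dict(row)
        for col in columns:
            new_row[col] = by_name[row[col]]  # KeyError flags an unknown name
        expanded.append(new_row)
    return expanded


def write_rows(rows, column_order):
    """Emit one whitespace-separated line per row, in a made-up output format."""
    return "\n".join(" ".join(str(r[c]) for c in column_order) for r in rows)


output = write_rows(expand_regproc(TYPES, PROCS),
                    ("oid", "typname", "typinput", "typoutput"))
```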

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-09 Thread Amit Langote
On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane  wrote:
>>> I don't think we need "named constants", especially not
>>> manually-maintained ones.  The thing that would help in pg_proc.h is for
>>> numeric type OIDs to be replaced by type names.  We talked awhile back
>>> about introducing some sort of preprocessing step that would allow doing
>>> that --- ie, it would look into some precursor file for pg_type.h and
>>> extract the appropriate OID automatically.  I'm too tired to go find the
>>> thread right now, but it was mostly about building the long-DATA-lines
>>> representation from something easier to edit.
>
>> You mean that I guess:
>> https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed...@mail.gmail.com
>
> Hmm, that's from 2009.  I thought I remembered something much more recent,
> like last year or so.

This perhaps:

* Re: Bootstrap DATA is a pita *
https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com

Thanks,
Amit




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-09 Thread Tom Lane
Michael Paquier  writes:
> On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane  wrote:
>> I don't think we need "named constants", especially not
>> manually-maintained ones.  The thing that would help in pg_proc.h is for
>> numeric type OIDs to be replaced by type names.  We talked awhile back
>> about introducing some sort of preprocessing step that would allow doing
>> that --- ie, it would look into some precursor file for pg_type.h and
>> extract the appropriate OID automatically.  I'm too tired to go find the
>> thread right now, but it was mostly about building the long-DATA-lines
>> representation from something easier to edit.

> You mean that I guess:
> https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed...@mail.gmail.com

Hmm, that's from 2009.  I thought I remembered something much more recent,
like last year or so.

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-08 Thread Michael Paquier
On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane  wrote:
> I don't think we need "named constants", especially not
> manually-maintained ones.  The thing that would help in pg_proc.h is for
> numeric type OIDs to be replaced by type names.  We talked awhile back
> about introducing some sort of preprocessing step that would allow doing
> that --- ie, it would look into some precursor file for pg_type.h and
> extract the appropriate OID automatically.  I'm too tired to go find the
> thread right now, but it was mostly about building the long-DATA-lines
> representation from something easier to edit.

You mean that I guess:
https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed...@mail.gmail.com
-- 
Michael




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-08 Thread Tom Lane
Robert Haas  writes:
> Most of these files don't have that many entries, and they're not
> modified that often.  The elephant in the room is pg_proc.h, which is
> huge, frequently-modified, and hard to decipher.  But I think that's
> going to need more surgery than just introducing named constants -
> which would also have the downside of making the already-long lines
> even longer.

I don't think we need "named constants", especially not
manually-maintained ones.  The thing that would help in pg_proc.h is for
numeric type OIDs to be replaced by type names.  We talked awhile back
about introducing some sort of preprocessing step that would allow doing
that --- ie, it would look into some precursor file for pg_type.h and
extract the appropriate OID automatically.  I'm too tired to go find the
thread right now, but it was mostly about building the long-DATA-lines
representation from something easier to edit.
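
A sketch of the lookup half of such a preprocessing step, in Python for
brevity. The sample lines are abbreviated stand-ins for pg_type.h DATA lines
(the real ones have many more columns), and the regular expression and
function names are assumptions for illustration.

```python
import re

# Abbreviated, hypothetical excerpts in the style of pg_type.h DATA lines.
PG_TYPE_SAMPLE = """\
DATA(insert OID = 16 (  bool PGNSP PGUID 1 t b ));
DATA(insert OID = 2275 (  cstring PGNSP PGUID -2 f p ));
"""

# Captures the OID and the first column (the type name) of each DATA line.
ROW_RE = re.compile(r"DATA\(insert OID = (\d+) \(\s*(\w+)")


def type_oids(text):
    """Build a type-name -> OID map from DATA lines."""
    return {name: int(oid) for oid, name in ROW_RE.findall(text)}


def resolve(names, oids):
    """Translate a list of type names into their numeric OIDs."""
    return [oids[name] for name in names]


oids = type_oids(PG_TYPE_SAMPLE)
boolin_signature = resolve(["bool", "cstring"], oids)
```

With a map like this, a pg_proc entry could say "bool" and "cstring" and the
numbers would be filled in when the long DATA lines are generated.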

regards, tom lane




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-08 Thread Craig Ringer
On 9 November 2016 at 10:20, Hao Lee  wrote:
> yes, i agree with you. These catalogs are not modified often. As your said,
> the pg_proc modified often, therefore, there are another issues, the
> dependency between these system catalogs and system views. it's hard to gain
> maintenance the consistency between these catalogs and views. It's need more
> cares when do modifying. So that i think that whether there are some more
> smarter approaches to make it smarter or not.
>
> On Wed, Nov 9, 2016 at 6:33 AM, Robert Haas  wrote:
>>
>> On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
>>  wrote:
>> > On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee  wrote:
>> >> It's a tedious work to figure out these numbers real meaning. for example,
>> >> if i want to know the value of '71'  represent what it is. I should go back
>> >> to refer to definition of pg_class struct. It's a tedious work and it's not
>> >> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
>> >> instead of '71'. For Example:
>> >>
>> >> #define PG_TYPE_RELTYPE 71
>> >
>> > You'd need to make genbki.pl smarter regarding the way to associate
>> > those variables with the defined variables, greatly increasing the
>> > amount of work it is doing as well as its maintenance (see for PGUID
>> > handling for example). I am not saying that this is undoable, just
>> > that the complexity may not be worth the potential readability gains.
>>
>> Most of these files don't have that many entries, and they're not
>> modified that often.  The elephant in the room is pg_proc.h, which is
>> huge, frequently-modified, and hard to decipher.  But I think that's
>> going to need more surgery than just introducing named constants -
>> which would also have the downside of making the already-long lines
>> even longer.

I'd be pretty happy to see pg_proc.h in particular replaced with some
pg_proc.h.in with something sane doing the preprocessing. It's a
massive pain right now.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-08 Thread Hao Lee
Yes, I agree with you. These catalogs are not modified often. As you said,
pg_proc is modified often, so there is another issue: the dependency between
these system catalogs and the system views. It is hard to maintain
consistency between these catalogs and views, and modifying them requires
extra care. So I wonder whether there is some smarter approach.

On Wed, Nov 9, 2016 at 6:33 AM, Robert Haas  wrote:

> On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
>  wrote:
> > On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee  wrote:
> >> It's a tedious work to figure out these numbers real meaning. for example,
> >> if i want to know the value of '71'  represent what it is. I should go back
> >> to refer to definition of pg_class struct. It's a tedious work and it's not
> >> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
> >> instead of '71'. For Example:
> >>
> >> #define PG_TYPE_RELTYPE 71
> >
> > You'd need to make genbki.pl smarter regarding the way to associate
> > those variables with the defined variables, greatly increasing the
> > amount of work it is doing as well as its maintenance (see for PGUID
> > handling for example). I am not saying that this is undoable, just
> > that the complexity may not be worth the potential readability gains.
>
> Most of these files don't have that many entries, and they're not
> modified that often.  The elephant in the room is pg_proc.h, which is
> huge, frequently-modified, and hard to decipher.  But I think that's
> going to need more surgery than just introducing named constants -
> which would also have the downside of making the already-long lines
> even longer.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-08 Thread Robert Haas
On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
 wrote:
> On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee  wrote:
>> It's a tedious work to figure out these numbers real meaning. for example,
>> if i want to know the value of '71'  represent what it is. I should go back
>> to refer to definition of pg_class struct. It's a tedious work and it's not
>> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
>> instead of '71'. For Example:
>>
>> #define PG_TYPE_RELTYPE 71
>
> You'd need to make genbki.pl smarter regarding the way to associate
> those variables with the defined variables, greatly increasing the
> amount of work it is doing as well as its maintenance (see for PGUID
> handling for example). I am not saying that this is undoable, just
> that the complexity may not be worth the potential readability gains.

Most of these files don't have that many entries, and they're not
modified that often.  The elephant in the room is pg_proc.h, which is
huge, frequently-modified, and hard to decipher.  But I think that's
going to need more surgery than just introducing named constants -
which would also have the downside of making the already-long lines
even longer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-07 Thread Michael Paquier
On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee  wrote:
> It's a tedious work to figure out these numbers real meaning. for example,
> if i want to know the value of '71'  represent what it is. I should go back
> to refer to definition of pg_class struct. It's a tedious work and it's not
> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
> instead of '71'. For Example:
>
> #define PG_TYPE_RELTYPE 71

You'd need to make genbki.pl smarter regarding the way to associate
those variables with the defined variables, greatly increasing the
amount of work it is doing as well as its maintenance (see for PGUID
handling for example). I am not saying that this is undoable, just
that the complexity may not be worth the potential readability gains.
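
To give an idea of the extra work involved, here is a sketch in Python (not
what genbki.pl does today): it collects numeric #define lines and
substitutes them into DATA lines. The header excerpt is abbreviated and
hypothetical, and the function name is made up.

```python
import re

# Hypothetical, abbreviated header excerpt using the proposed constant.
HEADER = """\
#define PG_TYPE_RELTYPE 71
DATA(insert OID = 1247 (  pg_type PGNSP PG_TYPE_RELTYPE 0 PGUID ));
"""

DEFINE_RE = re.compile(r"^#define\s+(\w+)\s+(\d+)\s*$", re.MULTILINE)
IDENT_RE = re.compile(r"\b[A-Za-z_]\w*\b")


def substitute_defines(text):
    """Return the DATA lines with every #define'd name replaced by its value."""
    defines = dict(DEFINE_RE.findall(text))
    data_lines = [line for line in text.splitlines() if line.startswith("DATA(")]

    def replace(match):
        # Identifiers without a numeric #define (PGNSP, PGUID, ...) are kept.
        return defines.get(match.group(0), match.group(0))

    return [IDENT_RE.sub(replace, line) for line in data_lines]


expanded = substitute_defines(HEADER)
```

Even this toy version has to decide which identifiers are macros and which
are plain values, and every such macro would still be maintained by hand.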
-- 
Michael




[HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

2016-11-07 Thread Hao Lee
Hi guys,
Although we do not usually change the system catalogs, modify the catalog
schema, or add a new system catalog, I think we should use more meaningful
variables in the system catalog header files, such as pg_xxx.h. As we know,
the pg_xxx.h files insert some initial values into the system catalogs, as
shown in the following lines from pg_class.h.

DATA(insert OID = 1247 (  pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f f t n 3 1 _null_ _null_ ));
DESCR("");
DATA(insert OID = 1249 (  pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 21 0 f f f f f f f t n 3 1 _null_ _null_ ));
DESCR("");

It is tedious to figure out what these numbers really mean. For example, if
I want to know what the value '71' represents, I have to go back and refer
to the definition of the pg_class struct. That is tedious, and it is neither
maintainable nor readable. I think we should use a meaningful variable
instead of '71'. For example:

#define PG_TYPE_RELTYPE 71
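
To show the lookup a reader has to do by hand today, here is a small Python
sketch that pairs the leading values of the pg_type line above with column
names. Only the first five columns are named, and that list is my assumption
about the order of the pg_class struct; the helper name is made up.

```python
DATA_LINE = ("DATA(insert OID = 1247 (  pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 "
             "f f p r 30 0 t f f f f f f t n 3 1 _null_ _null_ ));")

# Assumed to match the leading fields of the pg_class struct.
LEADING_COLUMNS = ["relname", "relnamespace", "reltype", "reloftype", "relowner"]


def label_fields(line, columns):
    """Pair the first len(columns) values of a DATA line with column names."""
    second_paren = line.index("(", line.index("(") + 1)
    inner = line[second_paren + 1:line.rindex("));")]
    return dict(zip(columns, inner.split()))


labeled = label_fields(DATA_LINE, LEADING_COLUMNS)
```

Here labeled["reltype"] is the '71' in question.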



Regards,

Hom.