Re: [PATCHES] Database owner installable modules patch

2008-05-11 Thread Tom Dunstan
On Sat, May 10, 2008 at 11:02 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
>
> Where are we on this?

I haven't had time to do any work since the original patch. That patch
was fairly basic - it just ran install / uninstall scripts and created
catalog entries, and introduced some slightly exotic syntax to do it
(INSTALL/UNINSTALL vs CREATE/DROP). The next version is intended to
handle dependencies, which should make uninstallation straight forward
for most cases. I was intending to revert the syntax creativity and
make the commands CREATE/DROP too.

I'll get a bit of time to look at both this and the enum patch this week.

Cheers

Tom

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Database owner installable modules patch

2008-04-07 Thread Tom Dunstan
On Mon, Apr 7, 2008 at 7:55 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Tom Dunstan" <[EMAIL PROTECTED]> writes:
>  > OK, I found an example that does NOT fit the "just drop all
>  > dependencies" scenario, but that I would still like to support. I just
>  > had a look at the postgis pl/java support, and its install does stuff
>  > like "SELECT sqlj.install_jar('file://${PWD}/postgis_pljava.jar',
>  > 'postgis_pljava_jar',  false);" and "SELECT
>  > sqlj.add_type_mapping('geometry', 'org.postgis.pljava.PLJGeometry');".
>  > There's no way we can deal with that sort of thing automatically, so
>  > we'll have to support uninstall scripts regardless.
>
>  Well, that just begs the question of what those commands actually *do*.
>  It seems not unlikely that they'd be inserting data into tables that
>  would belong to the module, in which case an uninstall that dropped
>  the table would be fine.

Those tables belong to a *different* module, though. I'm picturing
three modules here: pljava, postgis, and a postgis-pljava support
module that requires the first two, since it should be possible to
install postgis without requiring pljava. The above stuff was from the
install script of the postgis-pljava code, but inserted data into
tables owned by pljava.

Cheers

Tom

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Database owner installable modules patch

2008-04-07 Thread Tom Dunstan
Sorry to keep replying to myself, but part of the point of doing a
patch was to force myself (and whoever else is interested to examine
stuff that comes up...

On Mon, Apr 7, 2008 at 11:46 AM, Tom Dunstan <[EMAIL PROTECTED]> wrote:
>  None of that suggests that an uninstaller script would be needed if we
>  understood the deps well enough, but only allowing creates for
>  installs seems a bit restrictive.

OK, I found an example that does NOT fit the "just drop all
dependencies" scenario, but that I would still like to support. I just
had a look at the postgis pl/java support, and its install does stuff
like "SELECT sqlj.install_jar('file://${PWD}/postgis_pljava.jar',
'postgis_pljava_jar',  false);" and "SELECT
sqlj.add_type_mapping('geometry', 'org.postgis.pljava.PLJGeometry');".
There's no way we can deal with that sort of thing automatically, so
we'll have to support uninstall scripts regardless.

The question then becomes: is it worth trying to do stuff
automatically if we provide a manual method anyway? I think the answer
is probably yes, because having pgsql clean up automatically for the
vast majority of cases is a good thing. If it's only exotic cases that
need a manual uninstall script, why force one on everyone else?

Cheers

Tom

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Database owner installable modules patch

2008-04-07 Thread Tom Dunstan
On Mon, Apr 7, 2008 at 11:46 AM, Tom Dunstan <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 7, 2008 at 3:59 AM, Gregory Stark <[EMAIL PROTECTED]> wrote:
>  >  I wonder if there's much of a use case for any statements aside from 
> CREATE
>  >  statements. If we restrict it to CREATE statements we could hack things to
>  >  create pg_depend entries automatically. In which case we wouldn't need an
>  >  uninstall script at all.

>  >  The hacks to do this seem pretty dirty but on the other hand the idea of
>  >  having modules consist of a bunch of objects rather than arbitrary SQL
>  >  actually seems cleaner and more robust.
>
>  It *does* seem cleaner for the examples that I've looked at. Are they
>  totally representative though? Not sure. It also implies a bunch more
>  work to create stuff, as we need to understand what's going on so as
>  to create those pg_depend entries.

This has been bouncing around in my head a bit. I was picturing the
module code itself having to understand all the CREATE statements in
order to set up the dependencies... but perhaps an easier way would
simply be to have the create statements themselves insert a pg_depend
entry when they're done, if they detect that we're currently
installing a module. There's already a flag for that that the
superuser code looks at in the patch. Maybe you were ahead of me, and
this was the hack that you were referring to. :) I tend to hate global
flags like that because they leave weird non-obvious dependencies
across the codebase, but perhaps it's the best way to do it in this
case. It would mean hacking every create command in the system to
understand it, though. :(

Cheers

Tom

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] [HACKERS] Database owner installable modules patch

2008-04-06 Thread Tom Dunstan
On Mon, Apr 7, 2008 at 3:59 AM, Gregory Stark <[EMAIL PROTECTED]> wrote:
>  I wonder if there's much of a use case for any statements aside from CREATE
>  statements. If we restrict it to CREATE statements we could hack things to
>  create pg_depend entries automatically. In which case we wouldn't need an
>  uninstall script at all.

Well, the example that got me interested in this stuff originally was
trying to make pl/java easier to install. It does a bunch of
CREATEs... and some GRANTs. Plus ISTM that a pretty common case might
be to create a table for some reference data and then fill it with
default values. Also, I just had a look at the postgis install script,
which at the very least seems to update an opclass entry after
creating it.

None of that suggests that an uninstaller script would be needed if we
understood the deps well enough, but only allowing creates for
installs seems a bit restrictive.

One thing that's nice about arbitrary sql for install / uninstall is
that module authors can test it outside the context of doing an actual
module installation - they just execute their scripts.

>  The hacks to do this seem pretty dirty but on the other hand the idea of
>  having modules consist of a bunch of objects rather than arbitrary SQL
>  actually seems cleaner and more robust.

It *does* seem cleaner for the examples that I've looked at. Are they
totally representative though? Not sure. It also implies a bunch more
work to create stuff, as we need to understand what's going on so as
to create those pg_depend entries. I'm receptive to the idea of
uninstall simply attempting to drop anything related to the module in
pg_depend in the correct order. I can't think of anything created by a
module that we couldn't represent there, and it's a nice way of
ensuring that an uninstall script cleans up properly.

Cheers

Tom

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[PATCHES] Database owner installable modules patch

2008-04-06 Thread Tom Dunstan
Hi all

Here is a patch that provides an initial implementation of the module
idea that was kicked around over the last few days. While there
certainly wasn't consensus on list, enough people seemed interested in
the idea of database-owner-installable modules that I thought it was
worth having a play with.

The general idea, to recap, is to have modules, whether included in
the distribution a la contrib or installed separately, installed under
a directory such as $pkglib_dir/modules/foo. A typical module
directory might contain:
 - foo.so/foo.dll
 - install.sql
 - uninstall.sql
 - foo.conf
 - some-other-file-needed-by-foo-module.dat
The module would be installed on the system, but the necessary scripts
to install it in a particular database have not been run. In
particular, the modules would not usually be install in template1.
Database owners themselves can then opt to enable a particular
installed module in their own database - they do not have to hassle a
sysadmin to do it for them.


Features of the patch:
 - A database owner can issue the command "INSTALL MODULE foo", and
pgsql will look for a $pkglib_dir/modules/foo/install.sql file to run,
and run it.

 - The install script can do pretty much anything - the user is
treated as the superuser for the duration of the script. The main and
obvious usage is to create C language functions required by the
module.

 - An entry is created in a new pg_module catalog. This is mainly to
guard against someone trying to install a module twice at this point,
but it may have other uses in the future (see below).

 - "UNINSTALL MODULE foo" looks for and executes
$pkglib_dir/modules/foo/uninstall.sql and cleans up the catalog.



Here is a list of things that are either still to do before I'd
consider it worthy of inclusion (should the general approach be
considered acceptable), or which I'd like some guidance on:

 - Currently the script is executed in one big SPI_execute call, and
so errors and NOTICEs print out the entire script as context. I'm not
sure how to break it up without writing a full parser - we must have
something available in the backend to break a string up into multiple
statements to execute, but I'm not sure where to look. Also, is there
a better way to do this than SPI?

 - I've hacked in a bit of an end-run around permissions checks to
make the current user look like a super-user while a module script is
running. Is there a better way to do this?

 - I can't create C language functions from dlls under the modules
dir. I'd like to be able to specify 'modules/foo/foo' as the library
name, but the backend sees a slash and decides that must mean the path
is absolute. I see two ways to fix this: change the existing code in
dfmgr.c to *really* check for absolute/relative paths rather than the
current hack, or I could stick in a special case for when it starts
with "modules/". I thought I'd get some guidance on-list. Do people
think that sticking the dll in with other resources for the module
under $pkglib_dir is a bad thing? (I think having everything in one
place is a very good thing myself).Is the existing check written the
way it is for a particular reason, or is it just "good enough"?

 - It would be nice to create the empty modules dir when we install
pgsql, but while I suppose hacking a Makefile to install it is the way
to go, I'm not sure which one would be appropriate.

 - Hack pgxs to install stuff into a modules dir if we give it some
appropriate flag.

 - I'd like to add pg_depend entries for stuff installed by the module
on the pd_module entry, so that you can't drop stuff required by the
module without uninstalling the module itself. There would have to be
either a function or more syntax to allow a script to do that, or some
sort of module descriptor that let the backend do it itself.

 - Once the issue of loading a dll from inside the module's directory
is done, I'd like to look for an e.g. module_install() function inside
there, and execute that rather than the install.sql if found. Ditto
for uninstall.

 - Maybe a basic mechanism to allow a module to require another one.
Even just a "SELECT require_module('bar')" call at the top of a
script.

 - It would be nice to suppress NOTICEs when installing stuff - the
user almost certainly doesn't care.

 - Pick up config files in module directories, so that a module can
install and pick up config for itself rather than getting the sysadmin
to hack the global custom_variable_classes setting.

 - Should plperl etc be done as modules so that their config can live
independently as well? And to allow modules to "require" them?


Some other nice to haves for some point in the future:

 - Have some sort of install module privilege, rather than just a
check for database ownership
 - Allow looking for modules under some module path for e.g.
/usr/local module installs
 - Convert existing contrib to modules where appropriate :)
 - I really have no idea what happens if non-ascii characters are in
an install

Re: [PATCHES] Current enums patch

2007-04-02 Thread Tom Dunstan

Tom Lane wrote:

But probably making them act like identifiers is not a good idea,
because I doubt we want automatic downcasing in enum_in.  People
wouldn't be happy if they had to write WHERE color = '"Red"' or

> something like that to get at a mixed-case enum label.

Ah, that's what you had in mind. Yeah, that's a bit verbose.

> Let's

just throw the error instead.  (I agree that enum_in can just fail
with "no such label", but CREATE TYPE ought to give a specific
error about it.)


Agreed.

Andrew, you said you had a mostly-working patch already?

Cheers

Tom


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Current enums patch

2007-04-02 Thread Tom Dunstan

Tom Lane wrote:

While all this reasoning is perfectly OK on its own terms, it ignores
the precedents of SQL identifier handling.  Maybe we should revisit the
question of whether the labels are identifiers.


Implying that they shouldn't be quoted like text (or should be 
double-quoted if required)? Originally when discussing the patch I 
thought that this was a good idea, but now I'm not so sure. How does one 
set an enum value from e.g. a JDBC PreparedStatement in that scenario?


Cheers

Tom

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] Current enums patch

2007-04-02 Thread Tom Dunstan

Hm, I suppose we should apply truncate_identifier rather than letting
the strings be blindly truncated (perhaps in mid-character).  Should we
have it throw the truncation NOTICE, or not?  First thought is to do so
during CREATE TYPE but not during plain enum_in().
  


I don't see much point in truncating.

The patch I have so far gives the regression output shown below (yes, I 
should make the messages more consistent).
 
In fact the truncation and associated NOTICE just strike me as confusing.


[snip]

> + ERROR:  input value too long (74) for enum:
> "abcdefghijklmnopqrsatuvwxyz012345 etc etc

I was about to suggest that we just truncate the value in the input 
function and look it up on the basis that if it's too long, the lookup 
will fail and we can just tell the user that it wasn't a valid value. 
But if there's a valid value that has the same 63 bytes as the first 63 
of the too-long input string, we'll end up looking up the valid one 
wrongly. So we do need to test for length and die at that point. Can we 
just fail with the same error message as trying to input a smaller, but 
similarly invalid string?


At create time, we should definitely throw an error... if we just 
truncate the value at byte 64 (with a notice or not) we might truncate 
in the middle of a multi-byte character. Yuk.


Cheers

Tom

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] Current enums patch

2007-04-02 Thread Tom Dunstan
> Looks like we need to check the length on type creation
> too.
> 
> I'll fix later if not beaten to it.

It works for me (ie fails with an appropriate error) locally
on Linux FC6 x86-64. Perhaps this platform initializes
memory to zero on allocation? I dunno. Anyway, if you can
reproduce it, please have a go, I'm at work and won't be
able to do anything more in depth for some time.

I could have sworn that I had some bounds checking in there
at one point, but it doesn't seem to be in the latest
version of the code. *shrug*. It should be both in
cstring_enum() in enum.c and in DefineEnum() in typecmds.c.

Cheers

Tom

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] Current enums patch

2007-04-01 Thread Tom Dunstan

Tom Lane wrote:

OK, I give up. :) Why?


Because the whole point is that the type has to be known at parse time.


Oh, duh. :)


I've got it working with get_fn_expr_argtype and it seems fairly
reasonable.


Cool!

Thanks

Tom

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PATCHES] Current enums patch

2007-04-01 Thread Tom Dunstan

Tom Lane wrote:

The null Datum itself obviously doesn't carry that info, but the
expression tree does, and there are provisions for letting functions
retrieve that info --- see get_fn_expr_rettype and get_fn_expr_argtype.


Hmm. I vaguely remember that there was some feeling that the PLs 
wouldn't always fill out the FmgrInfo struct, but perhaps that was just 
the case with I/O functions.


... could we 
have a special rule that would look for e.g. a regtype as the first 
parameter if the return type is generic and there are no generic parameters?


I thought about that too but don't like it much.  The problem is mainly
that it can only work for a constant regtype parameter.


OK, I give up. :) Why?

Thanks

Tom

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Current enums patch

2007-04-01 Thread Tom Dunstan

text_enum: same problem as above, plus not acceptable to assume that it
gets the right enum OID, plus very definitely not acceptable to assume
that OID and typmod are the same size, plus I don't like the
undocumented kluge in getBaseTypeAndTypmod that is pretending to supply
the type OID for some small fraction of possible invocation cases.

I think text_enum is probably toast.


This was the ugliest part of the patch, and I mentioned it explicitly in 
the original submission because I was uncomfortable about it. The proper 
solution seemed to be to have another allowed cast function signature 
that would take the regtype rather than the typmod. That got just a 
little beyond what I wanted to do in the patch, and ugly as the 
getBaseTypeAndTypmod hack was, it seemed less intrusive.


Another way to skirt the issue was to simply set the typmod of enum 
types to have their own OID, but a) I wasn't sure what other things the 
system might be inferring from the presence of a typmod, and b) you just 
stated that we can't assume that typmod is big enough to hold an OID anyway.


I'm less concerned about the generic return type in this case, since the 
parser should know the return type when an explicit cast has been 
called.  Hmm, ok, maybe not. Looks like it's using the 
return type of the cast function and throwing away the explicit cast 
information. That's not very nice, but can probably be fixed in the parser.



Thoughts?

Tom

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] Current enums patch

2007-04-01 Thread Tom Dunstan

Andrew Dunstan wrote:

Tom Lane wrote:

enum_first, enum_last: these return ANYENUM but violate the rule that
a polymorphic-result function must have a polymorphic input argument
from which the parser may deduce the actual output type.


Is this a tragedy when the supplied parameter gives the result type 
directly?


I've been having a play with this. If you create a function taking an 
enum type as a parameter, you can feed the output from enum_first into 
it, regardless of the type given to enum_first. I doubt that it would be 
a problem in practice, but it's certainly not great.


> One rather ugly possibility is that the argument is a
> value of the target type --- which you can always have as null, so
>
>enum_first(null::rainbow) = 'red'
>
> where enum_first is declared as taking and returning anyenum.

I don't think that'll fly. If it's passed a null value, how does 
enum_first know which enum type it's dealing with? Can you get the type 
from the null value in some way?


> This
> seems a bit yucky as opposed to the regtype approach in the patch,
> and yet there are cases where it would be more handy --- eg, if
> you are working with a table column "col" of some enum type, you
> can do enum_first(col) instead of hardwiring the enum name.

That's ok, as long as nulls work.

> There might be other better ways, though.  Thoughts?

*Ponder*. In java, you can tie a generic return value to a particular 
class by passing the class in as a bound generic parameter... could we 
have a special rule that would look for e.g. a regtype as the first 
parameter if the return type is generic and there are no generic parameters?



As a side note, the helper functions were intended for people writing 
functions in plpgsql or whatever, allowing them to not hardcode the 
values of the enum in their function. I consider them nice-to-have 
rather than definitely required. If we can't come up with a nice way to 
do them for 8.3, I'm not absolutely wedded to them. It would be *nice*, 
though.


I really would like the cast from text, though, but I'll deal with that 
in another email.


Regards

Tom

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Current enums patch

2007-03-31 Thread Tom Dunstan

Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:
If you want to review or test the feature, the attached patch can be 
used as a replacement for the portion that affects parse_coerce.c, and 
with this it compiles and passes regression. I think it's correct but it 
should still be OKed by at least one Tom. :-)


> Barring objection from Tom D, I'll start with this version.

OK, I've now had a chance to look at Andrew's update of the patch, which 
just seems to pass through the new arrayCoerce parameter to the 
find_coercion_pathway calls. It almost doesn't matter what gets passed 
in: find_coercion_pathway should never set that to true in our calls to 
it in the enum code, as we're passing ANYENUMOID through to the recursed 
call and that'll never hit the array coercion branch.


In summary: looks good to me!

Cheers

Tom

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PATCHES] Current enums patch

2007-03-31 Thread Tom Dunstan

Tom Lane wrote:
> Unless someone objects, I'll change this and also revert to the
> enumlabel name that seems to have been used originally (it was still
> used in the docs).  It seems more readable somehow (I guess it's the
> lack of either ascenders or descenders in "enumname").

The name/text thing is discussed downthread. I actually started out 
calling the field the name and changed it to the label, but perhaps I 
only did that in the docs. It was probably while I was writing the docs 
that I realized that name could refer to the enum type name or the value 
name, which was confusing, but "value name" was kinda cumbersome, hence 
"label". Change it over with my blessing.


Cheers

Tom

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] Current enums patch

2007-03-31 Thread Tom Dunstan

Hi all! Thanks for reviewing the patch!

Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

Is there a specific reason for
pg_enum.enumname to be type name and not type text?


IIRC at one stage Tom wanted to try to make these identifiers, but that 
was quickly abandoned. This might be a hangover from that.


Actually I think I see the reason: it's a bit of a pain in the neck to
use the syscache mechanism with text-type lookup keys.  I'm not 100%
convinced that we really need to have syscaches on pg_enum, but if those
stay then it's probably not worth the trouble to avoid the limitation.


Yeah, that was the reason IIRC. The syscaches are used by the I/O 
functions: The one keyed on the pg_enum OID is for output, and the one 
keyed on the type OID and label, err, name, are for input. As suggested 
by a certain party here [1]. There didn't seem to be any text-like key 
types to use in the syscache except the name type, and I didn't see the 
63 byte limit being a big deal, that's way bigger than any sane enum 
name that I've ever seen.


If we ditched the second syscache, we'd want some other way to convert a 
 type OID and name into the enum value oid efficiently. I originally 
suggested having a cache that got hooked onto an fn_extra field; that 
idea could be resurrected if you don't like the syscache.


Cheers

Tom


1[] http://archives.postgresql.org/pgsql-hackers/2006-08/msg01022.php

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] Current enums patch

2007-03-26 Thread Tom Dunstan

Hi all

Here's the current version of the enums patch. Not much change from last 
time, the only thought-inducing stuff was fixing up some macros that 
changed with the VARLENA changes, and adding a regression test to do 
basic checking of RI behavior, after the discussions that we had 
recently on the ri_trigger stuff with generic types. The actual behavior 
was fixed by Tom's earlier patch, so this is just a sanity check.


Cheers

Tom


enums.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Enums patch v2

2007-02-13 Thread Tom Dunstan

Neil Conway wrote:

On Thu, 2007-02-01 at 22:50 -0500, Bruce Momjian wrote:

Where are we on this?


I can commit to reviewing this. The patch looked fairly solid on a quick
glance through, but I won't have the cycles to review it properly for a
week or two.


I've brought this up to date with the operator family stuff now, and the 
attached patch once more applies cleanly against HEAD. Hopefully that'll 
make life a little easier when reviewing it. Thanks.


I got sick of manually shifting the new OID values every time someone 
had added something new to the catalogs, so I moved all of my OIDs up to 
the 9000 range. Attached is a hacky bash script which can move them back 
to something sane. The idea is to run unused_oids first, see where the 
main block of unused OIDs starts, and then run shift_oids on the 
(unzipped) patch file before applying the patch. We discussed something 
similar a while ago, but no-one ever got around to implementing the script.


I've attached the script and left the OIDs in my patch in the upper 
range rather than just running it myself before submitting the patch as 
I reckon that this might be a useful convention for authors of patches 
which add a non-trivial number of OIDs to the catalogs. If there's 
agreement then anyone can feel free to commit the script to CVS or put 
it on the wiki or whatever.


Cheers

Tom


enums.patch.gz
Description: GNU Zip compressed data
#!/bin/sh

start=$1
end=$2
new_start=$3
filename=$4

if [ -z "$filename" ] ; then
echo Usage: $0 start-oid end-oid new-start-oid filename
exit 1
fi

if [ ! -f $filename ] ; then
echo $0: $filename is not a file
exit 1
fi

if [ $end -le $start ] ; then
echo $0: End of OID range must be greater than or equal to the start
exit 1
fi

start_len=`echo -n $start | wc -c`
end_len=`echo -n $end | wc -c`

if [ $start_len -ne 4 -o $end_len -ne 4 ] ; then
echo $0: Source OID range must have 4 digits
exit 1
fi

let new_end=$new_start+$end-$start
if [ $start -le $new_start -a $end -ge $new_start -o $new_start -le $start -a 
$new_end -ge $start ] ; then
echo $0: OID ranges may not overlap
exit 1
fi


i=$start
j=$new_start
while [ $i -le $end ] ; do
#echo $i $j
sed -i "s/$i/$j/g" $filename
let i=i+1
let j=j+1
done

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Enums patch v2

2006-12-19 Thread Tom Dunstan

Peter Eisentraut wrote:
An objection to enums on the ground that foreign keys can accomplish the 
same thing could be extended to object to any data type with a finite 
domain.


Exactly. The extreme case is the boolean type, which could easily be 
represented by a two-value enum. Or, if you were feeling masochistic, a 
FK to a separate table. Which is easier?


People regularly do stuff like having domains over finite text values, 
or having a FK to a separate (static) table, or using some sort of EAV. 
Enums are type-safe, easily ordered, relatively efficient and don't 
leave zillions of little static tables all over the place, a combination 
of attributes that none of the alternative solutions in this space present.


Cheers

Tom


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-19 Thread Tom Dunstan

Alvaro Herrera wrote:

I don't, because there are always those that are knowledgeable enough to
know how to reduce space lost to padding.  So it would be nice to have
2-byte enums on-disk, and resolve them based on the column's typid.  But
then, I'm not familiar with the patch at all so I'm not sure if it's
possible.


Not with this patch, and AFAIK not possible generally, without writing 
separate I/O functions for each type. I'd love to be able to do that, 
but I don't think it's possible currently. The main stumbling block is 
the output function (and cast-to-text function), because output 
functions do not get provided the oid of the type that they're dealing 
with, for security reasons IIRC. It was never clear to me why I/O 
functions should ever be directly callable by a user (and hence open to 
security issues), but apparently it was enough to purge any that were 
designed like that from the system, so I wasn't going to go down that 
road with the patch.


Cheers

Tom



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Enums patch v2

2006-12-19 Thread Tom Dunstan

Heikki Linnakangas wrote:
I'm sorry I missed the original discussions, but I have to ask: Why do 
we want enums in core? The only potential advantage I can see over using 
a look-up table and FK references is performance.


Well, there are a few things. Sometimes its tidiness, sometimes 
integrity... I've seen more than one system with hundreds of these 
things, and they've either gone down the table-per-enum solution, with 
LOTS of extra tables whose values never change, or the EAV solution, 
with one or two globally referenced tables to which everything in your 
system has as a FK, and an integrity check in a trigger if you're very 
lucky. Yuck on both accounts. Enums hit a sweet spot in the middle and 
provide data integrity and performance for non-changing values.


1. What's the point of having comparison operators for enums? For most 
use cases, there's no natural ordering of enum values.


Well, there are a number of cases where ordering IS important, and 
indeed, enums provide a way to do it easily where many of the 
alternative solutions do not. It's one of the key benefits.


2. The comparison routine compares oids, right? If the oids wrap around 
when the enum values are created, the ordering isn't what the user expects.


As has been pointed out by others quicker on the draw than me, I do sort 
the OIDs at enum creation time, for exactly this reason.


3. 4 bytes per value is wasteful if you're storing simple status codes 
etc. Especially if you're doing this for performance, let's do no harm 
by wasting space. One byte seems enough for the typical use cases. I'd 
even argue that having a high upper limit on the number of enum values 
encourages misuse of the feature.


I'd really love to have these fit into a 1 or 2 byte value on disk, but 
AFAIK there's simply no way to do it currently in postgresql. If we ever 
move to a solution where on-disk representation is noticeably different 
from in-memory representation, then it might be doable. If that does 
happen, we might benefit from other improvements such as being able to 
order columns in a tuple on disk so as to minimize alignment padding, 
not having to store a composite type's oid, etc. Until that happens, 
though, if it ever does, this is probably the tightest on-disk 
representation we're likely to get, unless we're happy to impose some 
pretty severe restrictions, like 8 bits per enum, and only 256 enums in 
total (for a 2 byte total). I was already shot down trying to make 
similar restrictions when I first brought it up. :) The OID solution 
seems to offend the least. :)


We did discuss this somewhat earlier, and I'm happy to take alternative 
suggestions, but AFAIK this is about as good as it's going to get right now.


Cheers

Tom

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PATCHES] Enums patch v2

2006-12-18 Thread Tom Dunstan

Hi all

Here is an updated version of the enums patch. It has been brought up to 
date and applies against current CVS HEAD. The original email is at [1], 
and describes the implementation.


This version contains sgml documentation, and contains the missing 
bounds checks and error codes noted in the last email.


As mentioned before, the only part that I'm not super keen on is the 
hack required to pass the type oid in to the text-to-enum cast function, 
since normally those take type mods but not type oids. I wasn't sure how 
else to get a cast working though, short of allowing another type of 
cast function which accepts type oids as well. That seemed overkill for 
just one case, though, and was getting a bit beyond the realms of what I 
wanted to get done with this patch.


Anyway, it's reasonably feature complete and should be appropriately 
documented now, so feedback gratefully accepted.


Cheers

Tom

[1] http://archives.postgresql.org/pgsql-patches/2006-09/msg0.php


enums.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PATCHES] Enums patch v1

2006-08-31 Thread Tom Dunstan

Hi folks

Here's a first cut of the enums patch for feedback when people have 
time. It follows an anyenum pseudo-type approach as foreseen by 
Nostradamus in one of the original threads. 
(http://archives.postgresql.org/pgsql-hackers/2005-11/msg00457.php).
That made the patch a little more intrusive than I had been hoping for, 
but hopefully the bits that touch core files like parse_coerce.c are 
easy to follow.


The patch is largely feature complete as far as what I wanted to 
achieve, although it's currently missing documentation patches. I'll add 
those at some point soonish. It's currently only been tested on my x86 
FC5 box.


A few notes:

 - In much of the code, anyenum is treated identically to anyelement, 
except when trying to tie down generic parameters: at this point types 
which aren't enums will be rejected if the declared type is anyenum. 
Thus there are quite a few places in the patch where ANYENUMOID has just 
kinda been added in next to ANYELEMENTOID.


 - The error messages in enum.c need some love. Apparently there's a 
document that I need to read in the sgml docco which explains error 
codes to me; currently I've got FIXME TOM comments in there.


 - One difficulty that I had was trying to write a cast-text-to-enum 
function. While input functions pass their type oid as a second 
parameter, all that cast functions can get is the typmod. I did a bit of 
an ugly hack to pass the enum oid in as an int in that parameter if it's 
an enum type; it might be cleaner to allow cast functions that take an 
oid there rather than an int and pass the correct parameter depending on 
that type. I wasn't comfortable making a change like that in this patch 
without discussion, though.


 - It appeared that to be cached in a syscache properly, the enum names 
had to be a Name/NameData thing rather than a text or whatever, so 
there's a limit of 64 characters per enum value. 64 characters should be 
enough for anybody :). Hmm, as I write this I realize that I should 
probably add some length checking to the create type function. And 
something prohibiting commas, since they're the array delimiter. 
Consider that on the TODO as well. :) Do we want to allow a customizable 
delimiter? IMO someone putting commas inside an enum value is doing 
something sick and twisted anyway, but maybe there's an argument to be 
made. They could always hack the delimiter on the array type, although 
that wouldn't survive a dump.


 - You can't create generic anyenum functions in the PL's except for 
plpgsql. You can create functions taking/returning the concrete type, 
but not anyenum. I don't consider that a huge loss; we couldn't really 
think of a use case. I haven't added a regression test involving enums 
to e.g. plperl, although I have tested it with perlpython/tcl. Is that 
worth doing, or is it being paranoid?


 - This patch requires an initdb, but I haven't bumped the catalog 
version number in the patch since I don't know when (if ever) it will be 
applied


Now a few questions from a first-time contributor:

 - Do I need to call CatalogUpdateIndexes() after I delete rows when 
dropping the type? Following the code for creating dropping standard 
user types, it's called on creation but not deletion.


 - I'm a little unclear about when pfree() should be called. My 
understanding is that it will deallocate memory from the current memory 
context, but that context will be chucked once the current statement or 
whatever is dead. Is it just nice to do when we know that we're done 
with a particular chunk and to stop the context from growing, or are 
there other occasions? Or have I misunderstood what it does?


 - Is pg_indent intended to be run by us mortals? I tried following the 
instructions in the README to make sure my indentation was kosher, but 
ran into issues with pgindent complaining about something called detab 
being missing. It's not on my Fedora box, and some quick Googling didn't 
unearth an obvious candidate, although a few dodgy looking scripts 
popped up. Is it a BSD tool?


Anyway, all comments / criticisms welcome. Have a look at the regression 
test to see what should be working.


Cheers

Tom


enums-v1.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 6: explain analyze is your friend