Re: [HACKERS] Custom Data Type Question

2006-11-18 Thread Simon Riggs
On Wed, 2006-11-15 at 16:38 -0500, Andrew Dunstan wrote:

 My little enumkit tool allows you to create enumerations today very 
 easily, but its values are completely hardcoded. However, the above 
 trick still works. The downside is that each enumeration type requires a 
 tiny bit of compilation.

Andrew,

Your enum sounds good, apart from the hardcoded/compilation thing. That
is a data management nightmare AFAICS and so restricts the usefulness of
the solution.

It would be much better to read things dynamically into an array, so
using an init function in *preload_libraries would work well.

I'd also love any suggestions as to how we might be able to use a
similar local-data-cacheing mechanism to work when we specify SQL
standard FOREIGN KEYs. It would be really cool to say USING LOCAL CACHE
or some way of avoiding the overhead of all those stored after triggers
and SPI SELECT statements when we've got checks against tables with only
a few rows where the values hardly ever change. The enum concept departs
radically from the declarative Referential Integrity concepts that many
of us are already used to. I'd like to be able to speed things up
without radical re-design of the database... so a few nicely sprinked
ALTER TABLE statements would be a much better way of implementing this
IMHO.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Custom Data Type Question

2006-11-18 Thread Andrew Dunstan



Simon Riggs wrote:

On Wed, 2006-11-15 at 16:38 -0500, Andrew Dunstan wrote:

  
My little enumkit tool allows you to create enumerations today very 
easily, but its values are completely hardcoded. However, the above 
trick still works. The downside is that each enumeration type requires a 
tiny bit of compilation.



Andrew,

Your enum sounds good, apart from the hardcoded/compilation thing. That
is a data management nightmare AFAICS and so restricts the usefulness of
the solution.


  


Simon, Tom Dunstan has submitted a patch for first class enum types that 
do not have the compilation requirement - it's in the queue for 8.3. You 
might want to review that.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Custom Data Type Question

2006-11-16 Thread Greg Mitchell



A simpler way to do this might be to only cache the list per query
context. In your IO functions, you could whack a pointer to your cache
onto fcinfo-flinfo-fn_extra, and the same flinfo gets passed in for
e.g. all output function calls for that column for that query, IIRC.
This was what I had in mind originally when I did the enum patch, but I
ended up just using syscaches, which I think would be unavailable to you
writing a UDT.


If my understanding is correct, if fn_extra is null, I would palloc() my
data cache and store the pointer in fn_extra? What about freeing this
pointer? Or is cleanup automatic?

Also, are there any ADTs like a hash-map or tree-map in the server
libraries (my background is C++ and am use to having std::map) or do I
need to role my  own?

I am using enumkit for some true enums I have in the DB and like it very
much. Though I tend to customize the C-code to optimize it for my use.

Thanks,
Greg

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Custom Data Type Question

2006-11-16 Thread Greg Mitchell
As far as memory management goes, do I just use hash_create() and assign 
that pointer to fn_extra and at the end of the query it will be freed? 
Or will it not be freed until this end of the transaction? I'm really 
having trouble understanding the memory management issues with Postgres.


Greg

Andrew Dunstan wrote:

Greg Mitchell wrote:




Also, are there any ADTs like a hash-map or tree-map in the server
libraries (my background is C++ and am use to having std::map) or do I
need to role my  own?




Look at the dynahash code. I just used it for the first time in a plperl 
patch, and it's reasonably straightforward.


cheers

andrew



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Custom Data Type Question

2006-11-16 Thread Andrew Dunstan

Greg Mitchell wrote:




Also, are there any ADTs like a hash-map or tree-map in the server
libraries (my background is C++ and am use to having std::map) or do I
need to role my  own?




Look at the dynahash code. I just used it for the first time in a plperl 
patch, and it's reasonably straightforward.


cheers

andrew


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Custom Data Type Question

2006-11-16 Thread Tom Lane
Greg Mitchell [EMAIL PROTECTED] writes:
 As far as memory management goes, do I just use hash_create() and assign 
 that pointer to fn_extra and at the end of the query it will be freed? 
 Or will it not be freed until this end of the transaction? I'm really 
 having trouble understanding the memory management issues with Postgres.

You have to be careful that the hashtable is created in the correct
memory context --- in this case you want it to be in a query-lifespan
context, not the short-term (per-tuple-lifespan) context that your
function will be called in.  The usual procedure for cases like this is
to use the context identified by fn_mcxt.  src/backend/utils/mmgr/README
might make useful reading for you.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Custom Data Type Question

2006-11-15 Thread Greg Mitchell
I'm trying to create a custom data type similar to an enumeration type. 
However, I'd like the mapping of the int-string to be dynamic instead 
of hard coded. I'd like to have a table that contains this mapping that 
can be appended to. Creating this type is not very difficult. However, 
for performance reasons, I'd like to cache the mapping so that the table 
is only queried once every connection unless it changes. I'm thinking a 
combination of a flag that can be triggered on insert and a transaction 
id could be used to decide if the table needs to be reloaded. 
Unfortunately, I'm not exactly sure how to get started on this, any ideas?


Thanks,
Greg

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Custom Data Type Question

2006-11-15 Thread Andrew Dunstan

Greg Mitchell wrote:
I'm trying to create a custom data type similar to an enumeration 
type. However, I'd like the mapping of the int-string to be dynamic 
instead of hard coded. I'd like to have a table that contains this 
mapping that can be appended to. Creating this type is not very 
difficult. However, for performance reasons, I'd like to cache the 
mapping so that the table is only queried once every connection unless 
it changes. I'm thinking a combination of a flag that can be triggered 
on insert and a transaction id could be used to decide if the table 
needs to be reloaded. Unfortunately, I'm not exactly sure how to get 
started on this, any ideas?





Are you aware that there is a patch for first class enumeration types 
waiting to be reviewed for 8.3? The mapping is kept entirely internal, 
and you should never see what it is kept as underneath. It does not 
provide for dynamically extending the enumeration set, for various 
reasons, but there is an easy workaround, namely to create a new type 
with the extra member(s) and then do:


  alter table foo alter column  bar type newtype using bar::newtype;


My little enumkit tool allows you to create enumerations today very 
easily, but its values are completely hardcoded. However, the above 
trick still works. The downside is that each enumeration type requires a 
tiny bit of compilation.


cheers

andrew



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Custom Data Type Question

2006-11-15 Thread Tom Dunstan

Hi Greg

Greg Mitchell wrote:
I'm trying to create a custom data type similar to an enumeration type. 
However, I'd like the mapping of the int-string to be dynamic instead 
of hard coded. I'd like to have a table that contains this mapping that 
can be appended to.  Creating this type is not very difficult. However,
for performance reasons, I'd like to cache the mapping so that the table 
is only queried once every connection unless it changes.


A simpler way to do this might be to only cache the list per query 
context. In your IO functions, you could whack a pointer to your cache 
onto fcinfo-flinfo-fn_extra, and the same flinfo gets passed in for 
e.g. all output function calls for that column for that query, IIRC. 
This was what I had in mind originally when I did the enum patch, but I 
ended up just using syscaches, which I think would be unavailable to you 
writing a UDT.


The upside of the above is that for a given query, the contents of your 
 table shouldn't change, so there's no mucking about with trying to 
keep things in other backends up to date. The downside is that you have 
to do the lookup per query, but if you're dealing with lots of data then 
it'll get dwarfed by the actual query, and if not, who cares?


The other question that leaps to mind is whether you want to have more 
than one of these types. If you do, you may have to have multiple 
versions of the IO functions, otherwise e.g. your output function might 
be passed the value 0, but was that the 0 representing the 'red' string 
from the rgb enum, or the 'northern' string from the hemisphere enum? 
You don't know, and postgresql won't tell you directly.


There are a few ways around this. In your case, it might be ok to 
compile different versions of the IO functions for each enum which point 
to different tables, or the same table with a discriminator. Or you 
could see the various different proposals when my patch was first 
discussed. See the thread starting at 
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00979.php or if 
you want a peek at the patch, see 
http://archives.postgresql.org/pgsql-patches/2006-09/msg0.php. A 
rather simpler starting point might be Andrew's enumkit 
http://www.oreillynet.com/pub/a/databases/2006/01/06/enumerated-fields-in-postgresql.html?page=lastx-showcontent=text, 
or possibly Martijn's tagged types at 
http://svana.org/kleptog/pgsql/taggedtypes.html.


Cheers

Tom


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings