Martijn van Oosterhout wrote:
On Wed, Sep 20, 2006 at 10:56:08AM -0700, Mark Dilger wrote:
If the system chooses cast chains based on a breadth-first search, then the
existing int2 -> int8 cast would be chosen over an int2 -> int4 -> int8
chain, or an int2 -> int3 -> int4 -> int8 chain, or in fact any chain at
all, because the int2 -> int8 cast is the shortest.
But we're not talking about a search here; we don't always know where
the endpoint is. Imagine you have the following three functions:
abs(int8)
abs(float4)
abs(numeric)
And you have an int2. Which is the best cast to use? What's the answer
if you have a float8? What if it's an unknown type text string?
Now, consider that functions can have up to 32 arguments and that this
resolution might have to be applied to each argument and you find that
searching is going to get very expensive very quickly.
The current system of requiring only a single step is at least
predictable. If you have the choice between:
- first argument matches, second needs three "safe" conversions, and
- first argument needs one "unsafe" conversion, second matches exactly
Which is cheaper?
To make this manageable you have to keep the number of types you can
cast to small, or you'll get lost in the possibilities. Adding just a
single step domain to base type conversion seems pretty safe, but
anything more is going to be hard.
Have a nice day,
The searching never needs to be done at runtime. It should be computable at
cast creation time. A new cast creates a potential bridge between any two types
in the system. Using a shortest path algorithm, the best chain (if any exists)
from one type to another can be computed and pre-compiled, right?
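To make the idea concrete, here is a minimal sketch of the shortest-path step, written in Python rather than as backend code. It assumes every cast costs the same, so a plain breadth-first search is enough. The function name and the cast table are hypothetical illustrations, not anything in the PostgreSQL source.

```python
from collections import deque


def shortest_cast_chain(casts, source, target):
    """Return the shortest list of types leading from source to target.

    casts maps a type name to the types it can be cast to directly.
    Returns None when no chain exists. All casts are assumed to cost
    the same, which is an assumption of this sketch.
    """
    if source == target:
        return [source]
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in casts.get(current, ()):
            if nxt in previous:
                continue
            previous[nxt] = current
            if nxt == target:
                # Walk the back-pointers to rebuild the chain.
                chain = [nxt]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return chain[::-1]
            queue.append(nxt)
    return None


# Hypothetical cast table mirroring the int2/int4/int8 example.
CASTS = {
    "int2": ["int4", "int8"],
    "int4": ["int8"],
}

print(shortest_cast_chain(CASTS, "int2", "int8"))
```

Because the search is breadth-first, the direct int2 -> int8 cast wins over the longer int2 -> int4 -> int8 chain. Running this once per affected type pair at cast creation time is what would keep the work out of the query path.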
So, assume the following already exists:
Types A,B,C, fully connected with casts A->B, B->A, A->C, C->A, B->C, C->B, with
some marked IMPLICIT, some marked EXPLICIT, and some marked SAFE.
Types X,Y,Z, also fully connected with casts, as above.
Then assume someone comes along and creates a new type M with conversions A->M,
M->A, X->M, and M->X. At the time that type and those casts are added to the
system, the system could calculate any additional casts to/from B, C, Y, and Z.
A simple implementation (but maybe not optimal) would be for the system to
autogenerate code like:
CREATE FUNCTION cast_M_Y (arg M) RETURNS Y AS $$
SELECT arg::X::Y;
$$ LANGUAGE SQL;
CREATE CAST (M AS Y) WITH FUNCTION cast_M_Y(M) [ AS ASSIGNMENT | AS IMPLICIT ];
And then load that function and cast. The only real trick seems to be
determining the rules for which cast chain gets used within that autogenerated
function, and whether the generated cast is IMPLICIT, EXPLICIT, or ASSIGNMENT.
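A rough sketch of that autogeneration step, in Python for illustration. The rule used here for the generated cast's context is only one possible choice and is my assumption: the chain is as permissive as its least permissive step. The helper names are hypothetical.

```python
# Ordering from most to least permissive. A cast created with no
# AS clause is usable only explicitly.
RANK = {"IMPLICIT": 0, "ASSIGNMENT": 1, "EXPLICIT": 2}


def weakest_context(step_contexts):
    """Assumed rule: a chain is only as permissive as its strictest step."""
    return max(step_contexts, key=RANK.__getitem__)


def render_chained_cast(chain, context):
    """Build the SQL text for a cast that follows the given chain of types."""
    source, target = chain[0], chain[-1]
    func = f"cast_{source}_{target}"
    # arg::X::Y applies each intermediate cast in order.
    expr = "arg" + "".join(f"::{t}" for t in chain[1:])
    clause = "" if context == "EXPLICIT" else f" AS {context}"
    return (
        f"CREATE FUNCTION {func} (arg {source}) RETURNS {target} AS $$\n"
        f"    SELECT {expr};\n"
        f"$$ LANGUAGE SQL;\n"
        f"CREATE CAST ({source} AS {target}) "
        f"WITH FUNCTION {func}({source}){clause};"
    )


# M -> X is assumed IMPLICIT and X -> Y ASSIGNMENT, purely for illustration.
context = weakest_context(["IMPLICIT", "ASSIGNMENT"])
print(render_chained_cast(["M", "X", "Y"], context))
```

Other rules are possible, for example always generating derived casts as explicit-only, which is the more conservative option.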
Looking over what I have just written, another idea pops up. To avoid having
the system decide which casts are reasonable, you could extend the syntax and
allow an easy shorthand for the user. Something like:
CREATE CAST (M AS A)
WITH FUNCTION cast_M_A(M)
AS ASSIGNMENT
PROPAGATES TO B AS ASSIGNMENT,
PROPAGATES TO C AS ASSIGNMENT;
CREATE CAST (A AS M)
WITH FUNCTION cast_A_M(A)
AS ASSIGNMENT
PROPAGATES FROM B,
PROPAGATES FROM C;
And then the casts from M->B, M->C, B->M, and C->M would all be added to the
system.
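For illustration, the expansion implied by that shorthand could look like the following Python sketch. The function and its parameter names are hypothetical. It only lists which chains would be registered and says nothing about how the parser would handle the new clause.

```python
def expand_propagation(source, target, propagates_to=(), propagates_from=()):
    """List the cast chains implied by one CREATE CAST with propagation.

    The first chain is the cast itself. Each "to" type is reached by
    continuing on from target, and each "from" type reaches target by
    first casting to source.
    """
    chains = [[source, target]]
    chains += [[source, target, far] for far in propagates_to]
    chains += [[far, source, target] for far in propagates_from]
    return chains


# The two statements from the example above.
for chain in expand_propagation("M", "A", propagates_to=["B", "C"]):
    print(" -> ".join(chain))
for chain in expand_propagation("A", "M", propagates_from=["B", "C"]):
    print(" -> ".join(chain))
```

This yields M->A, M->A->B, M->A->C, A->M, B->A->M and C->A->M. The end points of the four longer chains are exactly the M->B, M->C, B->M and C->M casts mentioned.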
Thoughts?
mark