Martijn van Oosterhout wrote:
On Wed, Sep 20, 2006 at 10:56:08AM -0700, Mark Dilger wrote:
If the system chooses cast chains based on a breadth-first search, then the existing int2 -> int8 cast would be chosen over an int2 -> int4 -> int8 chain, or an int2 -> int3 -> int4 -> int8 chain, or in fact any chain at all, because the int2 -> int8 cast is the shortest.

But we're not talking about a search here, we don't always know where
the endpoint is. Imagine you have the following three functions:

abs(int8)
abs(float4)
abs(numeric)

And you have an int2. Which is the best cast to use? What's the answer
if you have a float8? What if it's an unknown type text string?

Now, consider that functions can have up to 32 arguments and that this
resolution might have to be applied to each argument and you find that
searching is going to get very expensive very quickly.

The current system of requiring only a single step is at least
predictable. If you have the choice between:

- first argument matches, second needs three "safe" conversions, and
- first argument need one "unsafe" conversion, second matches exactly

Which is cheaper?

To make this manageable you have to keep the number of types you can
cast to small, or you'll get lost in the possibilites. Adding just a
single step domain to base type conversion seems pretty safe, but
anything more is going to be hard.

Have a nice day,

The searching never needs to be done at runtime. It should be computable at cast creation time. A new cast creates a potential bridge between any two types in the system. Using a shortest path algorithm, the best chain (if any exists) from one type to another can be computed and pre-compiled, right?

So, assume the following already exists:

Types A,B,C, fully connected with casts A->B, B->A, A->C, C->A, B->C, C->B, with some marked IMPLICIT, some marked EXPLICIT, and some marked SAFE.

Types X,Y,Z, also fully connected with casts, as above.

Then assume someone comes along and creates a new type M with conversions A->M, M->A, X->M, and M->X. At the time that type and those casts are added to the system, the system could calculate any additional casts to/from B, C, Y, and Z. A simple implementation (but maybe not optimal) would be for the system to autogenerate code like:

CREATE FUNCTION cast_M_Y (arg M) RETURNS Y AS $$
        SELECT arg::X::Y;
$$ LANGUAGE SQL;
CREATE CAST (M AS Y) WITH FUNCTION cast_M_Y(M) [ AS ASSIGNMENT | AS IMPLICIT ]

And then load that function and cast. The only real trick seems to be determining the rules for which cast chain gets used within that autogenerated function, and whether the generated cast is IMPLICIT, EXPLICIT, or ASSIGNMENT.



Looking over what I have just written, another idea pops up. To avoid having the system decide which casts are reasonable, you could extend the syntax and allow an easy shorthand for the user. Something like:

CREATE CAST (M AS A)
        WITH FUNCTION cast_M_A
        AS ASSIGNMENT
        PROPOGATES TO B AS ASSIGNMENT,
        PROPOGATES TO C AS ASSIGNMENT;

CREATE CAST (A AS M)
        WITH FUNCTION cast_A_M
        AS ASSIGNMENT
        PROPOGATES FROM B,
        PROPOGATES FROM C;

And then the casts from M->B, M->C, B->M, and C->M would all be added to the 
system.

Thoughts?

mark


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to