Hi Frank,
One other quick piece of advice: The devil is going to be in the details
here. It's probably a good idea to list the (families of) queries you
want to handle. Then people can help you understand whether your plan
will really result in a performance boost with incremental modifications
to Derby and without SQL/MED support on the remote databases.
Hope this helps,
-Rick
Rick Hillegas wrote:
Hi Frank,
I'm not aware of any discussion of this topic on derby-dev. I have
used nabble (http://www.nabble.com/Apache-Database-f105.html) to
search the email archive for "sql med" and nothing came up. Your
proposal sounds like a good basis for a first implementation. Here are
a couple more thoughts:
1) This could grow arbitrarily complicated, depending on how much
heavy lifting you want the optimizer to do for you. With a little
luck, you could probably trick the optimizer into treating the foreign
tables like Table Functions. That would skirt some of the interesting
optimizer issues.
2) A later implementation could get the optimizer involved and you
could build some interesting cost model for remote tables.
3) Alternatively, you might be able to get an optimizer expert like
Army involved early on. He might be able to help you divide and
conquer this one.
You might want to start a wiki page on this topic and take it from there.
Regards,
-Rick
Frank Griffin wrote:
To recap...
A while back, I asked about accessing foreign data from Derby. Rick
very helpfully pointed me to Table Functions, and I've created a
prototype using Table Functions which accesses my foreign engine. This
all works very nicely, within the design capabilities of Table
Functions.
I had pointed out that without some way to push predicates, projection
and selection would be crippled, and lookup-type activities on large
result sets would be pretty inefficient. Rick responded that for that
type of access to work well, what I probably wanted was SQL/MED.
So I've gotten the SQL/MED spec and poked through it a bit. It looks
like exactly what I'd like.
Has there been any discussion of implementing SQL/MED in Derby ? If so,
was there any resolution ? From what I can see googling around, I can't
find any indication that any SQL engine actually provides a reference
implementation of SQL/MED, in spite of the fact that it's been in the
spec since 1999 and was enhanced in 2003.
I would be interested in working on such an implementation in Derby,
assuming that the idea hasn't already been discussed and discarded.
From what I can tell, this would require DDL syntax pretty much like
what Table Functions required. CREATE FOREIGN TABLE would be very
similar to CREATE FUNCTION (TABLE) and the others are pretty simple (as
they don't require column definitions, and pretty much just specify
software and connection linkages).
It would also require walking the SQL parse tree and creating the Value
Expressions which describe the SELECT elements, the WHERE clause
components, and so forth.
Derby would be required to initialize the MED wrapper and pass a Request
Handle related to the actual query Derby would like the foreign wrapper
to execute. The wrapper uses this Handle to query Derby about the
individual components of the Request, and accepts those that it can
handle. Derby then makes its own arrangements to handle any screening
that the wrapper can't handle, and generates an execution plan which
involves calling the wrapper to do the parts of which it has declared
itself capable.
Beyond that, the support is very much like what Table Functions already
supports. Derby activates the individual query components handled by
the wrapper, and navigates among the rows using a model pretty much like
the JDBC ResultSet model used by Table Functions.
Any interest in this, or am I beating a previously-deceased horse ?