Sounds excellent, Andrea! Good work.
On 29/05/11 17:06, Andrea Aime wrote:
Hi,
during the weekend I've been cooking a patch to implement
sql encoding of some filter functions for PostGIS.
I found out it's a really tricky business, harder than
I thought it would have been.
(warning, long mail)
The patch I've attached adds support for the strings functions
I could get to support, as well as a few math functions.
The patch has not been committed, it definitely requires discussion.
The most basic way to add support for a function is to add
its class to the capabilities, something like:
caps.addType(FilterFunction_ceil.class);
Once that is done the sql encoder will try to encode the
function call as funcName(p1, p2, p3, ...).
So we're done right?
Nope, wrong, there is a number of things that can go wrong.
Here is a short list of potential incompatibilities:
- the function name is not the same (if that is your only
problem you won the lottery)
- type system issues
- same arguments, different meaning
- different arguments, or the function is to be implemented
calling operators or concatenating calls to different native
function
- the function is there but does not use a standard function
call syntax
- the function is simply no freaking there unless you add
it declaring a new database level function
Let's see how we can deal with each problem.
Function name is not the same
------------------------------------------------------------------------------
strLength is not there, you have "char_length" instead.
To handle this simple case I've added a method in
FilterToSQL:
protected String getFunctionName(Function function)
Normally it returns function.getName(), but subclasses
can override it to have it encode functions with a different
name.
Your type system is not my type system
---------------------------------------------------------------------------
In GeoTools it's perfectly legit to do something like:
strLength(12.3) or strLenght(doubleAttribute)
because we have converters all over the place.
But in the database that does not really work, you
will get a type mismatch error and the call fails.
What you have to do is something like:
char_length('12.3')
char_length(12.3::text)
char_length(doubleAttribute::text)
To handle this case I've started using the function argument
type information we're adding on trunk and stick it as a context for the
encoding. This is sufficient to turn literals into the desired type
before the encoding happens.
What about attributes? I've added a casting method into
FilterToSQL that subclasses can override:
protected String cast(String encodedProperty, Class target) throws IOException
Normally it returns just encodedProperty as is, but for
example the postgis encoder overrides it to the following:
public String cast(String property, Class target) {
if(String.class.equals(target)) {
return property + "::text";
} else if(Short.class.equals(target) || Byte.class.equals(target)){
return property + "::smallint";
...
} else if(java.util.Date.class.isAssignableFrom(target)) {
return property + "::timesamp";
} else {
// dunno how to cast, leave as is
return property;
}
}
This will make strLenght(doubleAttribute) be encoded as
char_length("doubleAttribute"::text)
Long story short, for the simple cases
you can get things working by adding a function alias
and declaring the type information in the FunctionName of your function
(still assumes the type of the java function and the db function as the
same, which is often the case)
Same arguments, different meaning
---------------------------------------------------
strSubstring(str, start, end) can be encoded using substr(str, start, count)
in PostgreSQL... however there are two significant differences:
* start is 1 based instead of being 0 based
* substr third argument is a count, that is, end - start
So the encoding is something like substr(str, start + 1, end - start).
Different enough to require overriding visit(Function, extraData) and
do a custom encoding.
strIndexOf is similar, in this case we expect a zero based result
but the db will give us a 1 based one.
No equivalent function
--------------------------------
strConcat is the first example of this case, the db does not have a function,
it has an operator, s1 || s2.
strEndsWith/strStartWith are not there either, but they can be implemented
using LIKE str || '%' and LIKE '%' || str
The different in argument meanings is not reconcilable
-----------------------------------------------------------------------------
strMatches is such case. Yes, the database has regular expressions, but
they are not using 1-1 the java syntax for the patterns, so the encoded
one may fail to be parsed or behave differently.
Databases often support POSIX regular expression syntax, but a pure
java version that is stricly posix does not exist afaik.
For this case we could have a custom store parameter asking the
code to blindly encode the regexp as is: the admin that enables
it takes responsibility.
The function is simply not there
---------------------------------------------
strLastIndexOf is a case I could simply not encode, there is no such
a thing. To implement one we have to create a native function in
the database.
This gets tricky, are we allowed to create functions there? Really? :-)
Maybe for this case we could give people a sql file to run that
creates the functions we need, and while creating the sql
capabilities we check if the function we look for is there...
Conclusions
--------------------
The patch shows how we could proceed to encode filter functions
in sql, and shows also how to add tests to check the encoded function is
behaving as expected.
Function encoding requires us to have a type system for functions,
thus it can be only implemented on trunk (pity).
Encoding functions is tricky because of the many differences between
our Java based expectations and what is actually available at the db
level, so I would suggest to follow some careful road and add
function encoding only if a specific store param is enabled.
That said, the patch proves is possible and should help people get
started on encoding more functions for their db of choice.
Feedback appreciated
Cheers
Andrea
--
*Niels Charlier*
Software Engineer
CSIRO Earth Science and Resource Engineering
Phone: +61 8 6436 8914
Australian Resources Research Centre
26 Dick Perry Avenue, Kensington WA 6151
------------------------------------------------------------------------------
Simplify data backup and recovery for your virtual environment with vRanger.
Installation's a snap, and flexible recovery options mean your data is safe,
secure and there when you need it. Data protection magic?
Nope - It's vRanger. Get your free trial download today.
http://p.sf.net/sfu/quest-sfdev2dev
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel