Wow, this all looks great Andrea. I don't really have much to add, the patch
looks pretty complete to me and all the design choices you made make sense
to me. Some random comments inline.

On Sun, May 29, 2011 at 3:06 AM, Andrea Aime
<[email protected]>wrote:

> Hi,
> during the weekend I've been cooking a patch to implement
> sql encoding of some filter functions for PostGIS.
> I found out it's a really tricky business, harder than
> I thought it would have been.
> (warning, long mail)
>
> The patch I've attached adds support for the strings functions
> I could get to support, as well as a few math functions.
> The patch has not been committed, it definitely requires discussion.
>
> The most basic way to add support for a function is to add
> its class to the capabilities, something like:
> caps.addType(FilterFunction_ceil.class);
>
> Once that is done the sql encoder will try to encode the
> function call as funcName(p1, p2, p3, ...).
> So we're done right?
> Nope, wrong, there is a number of things that can go wrong.
> Here is a short list of potential incompatibilities:
> - the function name is not the same (if that is your only
>  problem you won the lottery)
> - type system issues
> - same arguments, different meaning
> - different arguments, or the function is to be implemented
>  calling operators or concatenating calls to different native
>  function
> - the function is there but does not use a standard function
>  call syntax
> - the function is simply no freaking there unless you add
>  it declaring a new database level function
>
> Let's see how we can deal with each problem.
>
> Function name is not the same
>
> ------------------------------------------------------------------------------
>
> strLength is not there, you have "char_length" instead.
> To handle this simple case I've added a method in
> FilterToSQL:
>
> protected String getFunctionName(Function function)
>
> Normally it returns function.getName(), but subclasses
> can override it to have it encode functions with a different
> name.
>
> Your type system is not my type system
> ---------------------------------------------------------------------------
>
> In GeoTools it's perfectly legit to do something like:
> strLength(12.3) or strLenght(doubleAttribute)
> because we have converters all over the place.
>
> But in the database that does not really work, you
> will get a type mismatch error and the call fails.
> What you have to do is something like:
> char_length('12.3')
> char_length(12.3::text)
> char_length(doubleAttribute::text)
>
> To handle this case I've started using the function argument
> type information we're adding on trunk and stick it as a context for the
> encoding. This is sufficient to turn literals into the desired type
> before the encoding happens.
> What about attributes? I've added a casting method into
> FilterToSQL that subclasses can override:
>
> protected String cast(String encodedProperty, Class target) throws
> IOException
>
> Normally it returns just encodedProperty as is, but for
> example the postgis encoder overrides it to the following:
>
> public String cast(String property, Class target) {
>        if(String.class.equals(target)) {
>            return property + "::text";
>        } else if(Short.class.equals(target) || Byte.class.equals(target)){
>            return property + "::smallint";
>       ...
>        } else if(java.util.Date.class.isAssignableFrom(target)) {
>            return property + "::timesamp";
>        } else {
>            // dunno how to cast, leave as is
>            return property;
>        }
>
>    }
>
> This will make strLenght(doubleAttribute) be encoded as
> char_length("doubleAttribute"::text)
>
> Long story short, for the simple cases
> you can get things working by adding a function alias
> and declaring the type information in the FunctionName of your function
> (still assumes the type of the java function and the db function as the
>  same, which is often the case)
>
> Same arguments, different meaning
> ---------------------------------------------------
>
> strSubstring(str, start, end) can be encoded using substr(str, start,
> count)
> in PostgreSQL... however there are two significant differences:
> * start is 1 based instead of being 0 based
> * substr third argument is a count, that is, end - start
>
> So the encoding is something like substr(str, start + 1, end - start).
> Different enough to require overriding visit(Function, extraData) and
> do a custom encoding.
>
> strIndexOf is similar, in this case we expect a zero based result
> but the db will give us a 1 based one.
>
> No equivalent function
> --------------------------------
>
> strConcat is the first example of this case, the db does not have a
> function,
> it has an operator, s1 || s2.
>
> strEndsWith/strStartWith are not there either, but they can be implemented
> using LIKE str || '%' and LIKE '%' || str
>
> The different in argument meanings is not reconcilable
>
> -----------------------------------------------------------------------------
>
> strMatches is such case. Yes, the database has regular expressions, but
> they are not using 1-1 the java syntax for the patterns, so the encoded
> one may fail to be parsed or behave differently.
>
> Databases often support POSIX regular expression syntax, but a pure
> java version that is stricly posix does not exist afaik.
> For this case we could have a custom store parameter asking the
> code to blindly encode the regexp as is: the admin that enables
> it takes responsibility.
>
Does perhaps a regex for the regex make sense? :) Like we do for virtual
table parameters. Just a thought.

>
> The function is simply not there
> ---------------------------------------------
>
> strLastIndexOf is a case I could simply not encode, there is no such
> a thing. To implement one we have to create a native function in
> the database.
> This gets tricky, are we allowed to create functions there? Really? :-)
> Maybe for this case we could give people a sql file to run that
> creates the functions we need, and while creating the sql
> capabilities we check if the function we look for is there...
>

Yeah, I think that makes sense. Those who want to optimize can run the
script.

>
> Conclusions
> --------------------
>
> The patch shows how we could proceed to encode filter functions
> in sql, and shows also how to add tests to check the encoded function is
> behaving as expected.
>
> Function encoding requires us to have a type system for functions,
> thus it can be only implemented on trunk (pity).
>
> Encoding functions is tricky because of the many differences between
> our Java based expectations and what is actually available at the db
> level, so I would suggest to follow some careful road and add
> function encoding only if a specific store param is enabled.
>
Yeah, that makes sense, especially for the first cut to have the
admin explicitly enable the feature.

>
> That said, the patch proves is possible and should help people get
> started on encoding more functions for their db of choice.
>
> Feedback appreciated
>
> Cheers
> Andrea
>
>
> --
> -------------------------------------------------------
> Ing. Andrea Aime
> GeoSolutions S.A.S.
> Tech lead
>
> Via Poggio alle Viti 1187
> 55054  Massarosa (LU)
> Italy
>
> phone: +39 0584 962313
> fax:      +39 0584 962313
>
> http://www.geo-solutions.it
> http://geo-solutions.blogspot.com/
> http://www.youtube.com/user/GeoSolutionsIT
> http://www.linkedin.com/in/andreaaime
> http://twitter.com/geowolf
>
> -------------------------------------------------------
>
>
> ------------------------------------------------------------------------------
> vRanger cuts backup time in half-while increasing security.
> With the market-leading solution for virtual backup and recovery,
> you get blazing-fast, flexible, and affordable data protection.
> Download your free trial now.
> http://p.sf.net/sfu/quest-d2dcopy1
> _______________________________________________
> Geotools-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/geotools-devel
>
>


-- 
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.
------------------------------------------------------------------------------
Simplify data backup and recovery for your virtual environment with vRanger. 
Installation's a snap, and flexible recovery options mean your data is safe,
secure and there when you need it. Data protection magic?
Nope - It's vRanger. Get your free trial download today. 
http://p.sf.net/sfu/quest-sfdev2dev
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel

Reply via email to