Puuh, a long mail, I will try to keep it short. 1) Each db engine has its own optimizer, DB2 as an example has a cost based optimizer (not rule based) using indices based on distribution statistics. Creating an and index for a char(1) field having values "Y" and "N" will have no sense, DB2 will never use this index. If the statistics is up to date, prepared statements outperform non ps statements clearly in most situations. But I think PostGIS and Oracle are mainstream, it is ok to focus on these systems.
2) A superb solution is to parametrize the use of a VT, either to use the select statement as a prepared statement or replace the parameters markers with literals. I did this for a DB2 implementation some times ago , but for all our supported db engines this would be a heavy job. I only mention this as an idea. 3) Using named parameters raises some problems. (For ps or non ps ). At the end of the day, you will need to parse the sql statement, because there could be a parameter marker within a string constant. During the Oracle Georaster stuff, I need to create queries which where composed of a lot of statements delimited by ";" (more like a procedure). A simple search and replace will work in 99.9 %, but not in 100%. To bring it to an end, if we offer regexp patterns for the user and warn them about sql injection attacks , I am +0 here. (I cant give a +1 here, my stomach has not the best feeling, I fear most of the users wont even know what sql injection is, but I may be wrong here). Cheers Quoting Andrea Aime <[email protected]>: > [email protected] ha scritto: >> Some thoughts from my side. >> >> 1) Using a regexp to avoid sql injection is a nice idea. But I think >> most developers are not familiar with regexps and thus, the regexp >> mostly used will be ".*" opening the door for all kind of sql >> injection attacks. Why not use prepared (callable) statements ?. > > Thinking about this overnight I realized why not: because there > is no guaratee we're using a prepared statement based dialect. > > PS based dialects are unfortunately significantly slower when > accessing the dataset as a whole compared to just a tiny bounding > box of it. > This is because the query optimizer never sees the actual parameter > values, so it assumes index access is the best approach, whilst > when you're accessing 50%+ of your data to render it a sequential > scan is faster instead. > > The PostGIS store in particular has both paths, prepared and not, > benchmark show the prepared path is not an ounce faster when > the area requested is small, and 4 times slower when you ask > for the whole dataset. > > Of course when exposing the server to the public it's better to > take the hit and use prepared statements for safety, but if > the server is internal, or used to generate tiles for a tile > cache, taking the 4x slowndown is criminal. > > So, long story short, if we tried to go the prepared statements > road we'd have to make again two coding paths, and roll out > something that would work for non prepared statement paths > as well. > > A parameter designed for p.s. usage would require the target > sql type instead of a regular expression so that we can > use PreparedStament.setObject(index, value, nativeType). > But what to do in the case of a non ps dialect? > You'd have to roll your own custom data validation (otherwise > you're less protected than using a regexp) for each of the > possible native sql types, and then rely on the dialect > to properly encode this value as a string, that other as > a date, and so on. > > And it gets much worse when you actually try to implement it > as a PS. This is because the current support for views > simply takes the sql view definition and uses it in > the from clause, without parameters. > But if you want to introduce p.s. parameters in that > subquery we do, you'd have to: > - locate all the parameters by name, considering that each > can be repeated multiple times, and replace them wiht ? > - add them to the set of actual query parameters that > we have in all the ps paths (not only select, but also > bounds and count) > - integrate them with the set of params that we're building > for the filter coming from outside > > And all the above would still have to care for freeform > params protected by a regexp (because as I said, the sponsor > does not wants params, he wants to be able to replace whole > sql blocks). > > I'm not saying it's not doable, but it would certainly > require a lot of work. > > From that perspective just providing a set of well known > regexps to facilitate life for the people not used > to them and just use simple string substitution looks > a lot more appealing to me. > > Cheers > Andrea > > -- > Andrea Aime > OpenGeo - http://opengeo.org > Expert service straight from the developers. ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. ------------------------------------------------------------------------------ ThinkGeek and WIRED's GeekDad team up for the Ultimate GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the lucky parental unit. See the prize list and enter to win: http://p.sf.net/sfu/thinkgeek-promo _______________________________________________ Geotools-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/geotools-devel
