Re: Validating form arguments [final version]

Stephen R. van den Berg Wed, 16 Jul 2014 05:04:43 -0700

Michael A. Patton wrote:
>"Stephen R. van den Berg" <[email protected]> writes:
>OK, I'll fix that...  But, I guess I was really arguing for bringing
>back the page of documentation that David Hunter posted and then


Sound like a good idea.

>improving it to say not only _which_ characters are special in each
>encoding, but _how_ they are encoded.  That would make it clearer which
>one(s) apply in a given case.

This probably is more difficult.  It's a matter of keeping the docs
in sync with the code, and keeping the docs readable to a wide audience.
But I'm sure examples can be given.

>> Several remarks:
>> a. Using sqltable/ascii nested in a trim is awkward and inefficient.

>But, WHY?  As you'll see below, I still need the trim, and it would seem
>to me that a single query using the output directly ought to be more
>efficient than needing the overhead for the implied loop of an emit

The sqltable rxml statement internally translates into a loop
(just like emit).
Basically what happens behind the scenes of an sqltable is something like:
- create an internal emit loop
- get all the rows
- add table representation
- strip table representation (ascii)

When you'd use an emit directly, you skip steps 3 and 4 above.

>(even if I already know there'll only be at most one match since the
>column "id" is a unique non-null key).  Also, I _like_ the cascaded
>tests, because they generate better (i.e. more specific) error reports.

You can still use the cascaded tests, just make sure that you replace
the sqltable with an emit.

>> c. Not using bindings in the query is needlessly inefficient and increases
>>    the chance for new security leaks for no good reason.

>I don't understand bindings.  Someone else mentioned them as well.  But
>before this the only mention I'd seen of them (in all of the Roxen docs
>and the MySQL language docs) was the comment that they could be used to
>get past an Oracle restriction (which I ignored because I'm using the
>MySQL that came with Roxen and not Oracle.)  What do they really do?  

Ok, maybe the Roxen documentation is seriously lacking here.
It basically works as follows:

  <emit source="sql" query="
    SELECT a, b*:bding3 AS bmult, c
     FROM d
     WHERE e=:bding1 AND f=CONCAT('abc',:bding2) AND g=12+:bding3+14
   "
   binding="bding1=form.indexid,bding2=form.tail,
        bding3=var.offset">
   &_.a; &_.bmult; &_.c;
  </emit>

It binds the rxml variables form.indexid form.tail and var.offset to
the placeholders :bding1 :bding2 and :bding3 respectively.
It works for <sqlquery> <sqltable> and <emit>.
In practice it results in either of the following two real operations:
a. If the database(driver) supports it, it will generate a real template
   of the query and will pass the placeholders to the database separately
   using a special protocol that avoid unnecessary recoding/escaping.
   The database itself will ensure that the placeholders simply cannot
   result in security breaches due to sql injection, because the database
   will only accept data for those placeholders.  I.e. the plus sides
   here are:
   + Minimal encoding overhead (if any; especially important if you
     transmit binary data as an argument).
   + Every placeholder assignment is transmitted to the database only once.
   + If the database supports caching the querytemplate, the query will
     be transmitted only once.
   + Impossible to get the quoting rules wrong (it's implied).
b. If the database(driver) does not support real templates, Pike will take
   care of the quoting per argument and will automatically pick the right
   encoding.  I.e. it's impossible to forget a :mysql; etc.

>> d. Why not add a LIMIT 1 to the SQL query to help the database
>> evaluator?

>Why is this going to help?  The key I'm using is already declared
>unique, so "there can be only one" (to quote "Highlander" :-) and MySQL
>should already know that.

MySQL needs to parse your query.  It will take a "while" for it to find
out that the key is unique.  If you add the LIMIT 1 at the top level,
the query parser will know *beforehand* that "there will be only one", which
allows the SQL parser and optimiser (if we can call it that for MySQL)
to cut some corners and be done early.  I admit that the efficiency
improvement for a simple query like this probably is marginal at best
(but it is there), but if you make it a general rule of thumb, it will
help the SQL engine a lot on more than trivial queries.

>> Why not use something more straightforward/efficient/secure like this:

>I certainly don't see it as more straightforward, it seems to be looping
>for no good reason over a query of one result.  I'm not sure I believe
>the efficiency argument.  And the security difference is really
>non-existant.

I presume that given the earlier comments above, you now understand
the efficiency and security difference?

>  And besides, your example doesn't work...

>always trim it...  So, for that to work, you'd still need a trim:
>       <set variable="var.title"><trim>&_.prefix; &_.title;</trim></set>
>or something equivalent...

Ok, by all means, then put that in the emit body.
Then again, if the prefix is empty most of the time, it's marginally
more efficient to do the concatenating and trimming in the SQL server.
-- 
Stephen.

Re: Validating form arguments [final version]

Reply via email to