On 1/23/06, Graham Dumpleton <[EMAIL PROTECTED]> wrote:
> Using "print" like that can't be done in an "eval", would need to
> use "exec".

Sorry, I didn't really mean to use print in my example.

Of course, you can always wrap sys.stdout if you want
to capture the output for post-escaping.
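
A rough sketch of that idea, assuming Python 3 (the function name
run_and_escape is made up for illustration and is not part of mod_python):
the code is run with exec while sys.stdout is temporarily redirected to a
buffer, and the captured text is escaped afterwards.

```python
import contextlib
import html
import io


def run_and_escape(source):
    """Run source with exec, capture what it prints, and HTML-escape it."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the buffer inside the with block.
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    # Escape only after all output has been collected ("post-escaping").
    return html.escape(buffer.getvalue())


print(run_and_escape('print("<b>5 > 3</b>")'))
```

This only illustrates the capture-then-escape order; a real handler would
also have to decide which namespace the code runs in.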

> If this approach [to charset handling] in mod_python.publisher as
> seen as reasonable and works, then for 'eval' one could take the
> same approach.

Yes, I think what publisher does looks quite reasonable to me.
The 'eval' case should do it the same way.

> That said, can you explain more about the differences between HTML
> and XML escaping

It comes down to what to do with quote marks.  Both HTML and XML define
named entity references for <, >, &, and ", but &apos; is predefined only
in XML; HTML 4 does not define it.  Numeric character references are
valid in both.

        HTML 4    XML       Numeric
    ---------------------------------
    <   &lt;      &lt;      &#60;
    >   &gt;      &gt;      &#62;
    &   &amp;     &amp;     &#38;
    "   &quot;    &quot;    &#34;
    '   (none)    &apos;    &#39;

So really you could just use the numeric references for the quote marks
everywhere and not worry about an HTML special case.
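
As a rough sketch of escaping the quote marks with numeric character
references (Python 3 here; escape_markup is a made-up name, not anything in
mod_python), xml.sax.saxutils.escape can be given a dictionary of extra
replacements on top of the &, <, > it always handles:

```python
from xml.sax.saxutils import escape


def escape_markup(text):
    """Escape text for HTML or XML, using numeric references for quotes."""
    # escape() replaces &, <, > first, then applies the extra mapping,
    # so the "&" inside "&#34;" is not escaped a second time.
    return escape(text, {'"': "&#34;", "'": "&#39;"})


print(escape_markup('<a title="it\'s">&</a>'))
# &lt;a title=&#34;it&#39;s&#34;&gt;&amp;&lt;/a&gt;
```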

> Are there particular Python routines that implement
> each variant, or is it some option to 'cgi.escape'?

cgi.escape takes an optional second argument, a Boolean that defaults
to false.  If true, it also escapes " as &quot;.  It never escapes
the apostrophe.
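
For comparison, a small sketch assuming Python 3, where html.escape stands
in for cgi.escape (cgi.escape was removed in Python 3.8).  Note that this is
a substitute, and it behaves a little differently: its quote argument
defaults to True, and it does escape the apostrophe, as &#x27;.

```python
import html

text = 'say "hi" & don\'t'

# quote=False mirrors the old cgi.escape default: only &, <, > are escaped.
basic = html.escape(text, quote=False)

# quote=True (the html.escape default) also escapes both quote marks.
full = html.escape(text, quote=True)

print(basic)  # say "hi" &amp; don't
print(full)   # say &quot;hi&quot; &amp; don&#x27;t
```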

> which is the preferred routine for url encoding?

That's much less clear because there's no well-defined idea of
what exactly URL-escaping is; it depends upon the kind of URL and
which part of it you are escaping.  I would tend to think it would
be urllib.quote_plus(), which is aimed at query-string values.
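
To show why the choice depends on the part of the URL, here is a small
sketch assuming Python 3, where these functions live in urllib.parse rather
than urllib; the sample value is made up:

```python
from urllib.parse import quote, quote_plus

value = "a b/c&d"

# quote() suits path segments: space becomes %20 and "/" is left alone.
as_path = quote(value)

# quote_plus() suits query-string values: space becomes "+", "/" is escaped.
as_query = quote_plus(value)

print(as_path)   # a%20b/c%26d
print(as_query)  # a+b%2Fc%26d
```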

Note that sometimes you may want to apply more than one escaping:
URL escaping followed by HTML escaping, perhaps in something
like,
  <a href="<!--#python esc="uh" eval="random_page()"-->">surprise</a>
although that is admittedly an ugly use case.
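
Outside of the template syntax, the two-step escaping can be sketched like
this (Python 3 assumed; href_attribute_value and the sample parameters are
hypothetical, only meant to show the order of operations):

```python
import html
from urllib.parse import quote_plus


def href_attribute_value(base, params):
    """Build a URL, then escape it for use inside an HTML attribute."""
    # Step 1: URL-escape each name and value, joining pairs with "&".
    query = "&".join(
        quote_plus(name) + "=" + quote_plus(value)
        for name, value in params
    )
    url = base + "?" + query
    # Step 2: HTML-escape the finished URL, so "&" becomes "&amp;".
    return html.escape(url, quote=True)


link = href_attribute_value("/page", [("name", "Tom & Jerry"), ("lang", "en")])
print('<a href="%s">surprise</a>' % link)
# <a href="/page?name=Tom+%26+Jerry&amp;lang=en">surprise</a>
```

The order matters: doing the HTML escaping first would leave the URL
escaper to percent-encode the "&amp;" text itself.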
--
Deron Meranda
