Neil Schemenauer <[EMAIL PROTECTED]> wrote:

>> I propose patch. I think, this patch solves all above-mentioned
>> problems.
> 
> It may solve the above mentioned problems but it doesn't solve this
> very annoying problem:
> 
>     >>> from quixote.html import htmltext
>     >>> htmltext('string %s') % u'\u1234'
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeEncodeError: 'ascii' codec can't encode character u'\u1234' in position 0: ordinal not in range(128)

Are you sure? Did you try installing the patch? After installing the
patch on my machine (Windows 2000, Python 2.4, Quixote 2.1), I can no
longer reproduce the error:

>>> from quixote.html._py_htmltext import htmltext
>>> htmltext('string %s') % u'\u1234'
<htmltext u'string \u1234'>

(Only the pure-Python version, _py_htmltext.py, has been patched,
not _c_htmltext.)

> The true problem lies in Python's str.__mod__ method.

I don't think so. Of course, str.__mod__ has its problems,
but this error can be fixed entirely within the Quixote code.

WITHOUT THE PATCH, evaluating the expression
(htmltext('string %s') % u'\u1234') proceeds as follows:

1. The htmltext.__mod__ method is called.

2. Inside htmltext.__mod__, _wraparg(u'\u1234') is called (line 74):

         return htmltext(self.s % _wraparg(args))

3. _wraparg(u'\u1234') returns a _QuoteWrapper instance (line 166):

         return _QuoteWrapper(arg)

4. str.__mod__ is called with the _QuoteWrapper instance as its second
   argument.

5. _QuoteWrapper has no __unicode__ method, hence the
   _QuoteWrapper.__str__ method is called, and (yes, it's very annoying!)
   its result is converted from unicode to a bytestring. That conversion
   uses the 'ascii' codec, which is what raises the UnicodeEncodeError.
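
To make steps 4 and 5 concrete, here is a minimal sketch of the deferred
escaping. QuoteWrapperSketch is a hypothetical stand-in for _QuoteWrapper,
not the real Quixote class, and it only uses ASCII input so that it runs
on any interpreter. It shows that the %s conversion is what triggers
__str__ on the wrapper; on Python 2.4 that is the point where a unicode
result gets forced into a bytestring.

```python
# Hypothetical stand-in for Quixote's _QuoteWrapper (illustration only).
calls = []  # records which special methods the formatting operator invokes


class QuoteWrapperSketch(object):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        # Escaping is deferred until the %s conversion asks for a string.
        calls.append('__str__')
        return (self.value.replace('&', '&amp;')
                          .replace('<', '&lt;')
                          .replace('>', '&gt;'))


# The formatting operator receives the wrapper, not a string,
# so it has to call __str__ on it to fill in the %s slot.
result = 'string %s' % QuoteWrapperSketch('<b>')
```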

AFTER THE PATCH HAS BEEN INSTALLED, NO _QuoteWrapper INSTANCE IS CREATED;
INSTEAD, _escape_string(u'\u1234') IS CALLED DIRECTLY:

def _wraparg(arg):
    if isinstance(arg, htmltext):
        return stringify(arg)
    elif isinstance(arg, basestring):  # added by the patch: str or unicode
        return _escape_string(arg)     # escape directly, no _QuoteWrapper
    elif isinstance(arg, (int, long, float)):
        return arg
    else:
        return _QuoteWrapper(arg)

and the result (an escaped unicode string) is passed to str.__mod__.
When str.__mod__ sees a unicode argument (rather than a _QuoteWrapper
instance), it does not convert the value to a bytestring; the result of
the formatting is a unicode string.
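
The dispatch in the patched _wraparg can be sketched in a simplified,
self-contained form. This is only an illustration under assumptions:
escape_sketch, WrapperSketch and wraparg_sketch are hypothetical names,
the htmltext branch is left out, and str stands in for basestring so the
sketch also runs on interpreters that have no basestring.

```python
def escape_sketch(s):
    # Hypothetical stand-in for _escape_string: escape HTML metacharacters.
    return s.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')


class WrapperSketch(object):
    # Hypothetical stand-in for _QuoteWrapper: escapes lazily in __str__.
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return escape_sketch(str(self.value))


def wraparg_sketch(arg):
    if isinstance(arg, str):             # basestring in the real patch
        return escape_sketch(arg)        # strings are escaped immediately
    elif isinstance(arg, (int, float)):  # numbers pass through unchanged
        return arg
    else:
        return WrapperSketch(arg)        # everything else is still wrapped


# The formatting operator now receives an already escaped string,
# so no wrapper and no __str__ round trip is involved for string arguments.
formatted = 'string %s' % wraparg_sketch(u'\u1234 <b>')
```

The design choice is that only objects of unknown type still go through
the wrapper; strings never reach the code path that forces a bytestring.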

> The true problem lies in Python's str.__mod__ method.  I've fixed it
> for Python 2.5.  With the CVS version of Python, the current version
> of Quixote produces the output you expect.

That's great, but with the proposed patch you can use Python 2.4 as well
as Python 2.5.



Best regards,
 Alexander                            mailto:[EMAIL PROTECTED]

_______________________________________________
Quixote-users mailing list
[email protected]
http://mail.mems-exchange.org/mailman/listinfo/quixote-users
