Re: [Django Code] #5663: markdown 1.6b unicodedecodeerror

Django Code Thu, 17 Jul 2008 07:13:49 -0700

#5663: markdown 1.6b unicodedecodeerror
--------------------------------------------------------+--------------------------
          Reporter:  Koen Biermans <[EMAIL PROTECTED]>  |         Owner:  mboersma
            Status:  reopened                           |     Milestone:
         Component:  Contrib apps                       |       Version:  SVN
        Resolution:                                     |      Keywords:  markdown
             Stage:  Accepted                           |     Has_patch:  1
        Needs_docs:  0                                  |   Needs_tests:  0
Needs_better_patch:  0                                  |
--------------------------------------------------------+--------------------------
Comment (by wayla):


 Replying to [comment:13 Daniel Pope <[EMAIL PROTECTED]>]:
 >
 > I can confirm by empirical testing that 1.6b does require unicode
 > strings as input. Your memory has failed you in this case.
 >


 Actually, I wrote the code that changed unicode support between 1.6b and
 1.7, and I think I remember what I did. Yes, 1.6b did have ''some'' support
 for unicode, but it did not ''require'' it as you state. It also worked
 with (most) bytestrings. The fact that it supported unicode at all was
 something of a fluke: the Python {{{re}}} module happens to run just as
 well on unicode strings as it does on bytestrings. Therefore, 1.6b added a
 {{{__unicode__}}} method to the Markdown class which simply wrapped
 {{{__str__}}} ({{{return str(self)}}}). In contrast, 1.7 raises a fatal
 error before running if it does not get a unicode string (or a bytestring
 containing only ASCII characters, which are a subset of unicode anyway)
 and will ''only'' return unicode.
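
 In modern Python 3 terms (where {{{str}}} plays the role of the old
 {{{unicode}}} type), the strict input rule described for 1.7 could be
 sketched roughly as follows. This is only an illustration:
 {{{ensure_text}}} is a hypothetical helper, not Markdown's actual code.

```python
def ensure_text(source):
    """Return `source` as text, or fail before any processing starts.

    Hypothetical helper illustrating the rule described above; it is
    not part of the Markdown library.
    """
    if isinstance(source, str):
        # Already text: pass it through untouched.
        return source
    if isinstance(source, bytes):
        try:
            # ASCII-only bytestrings are safe because ASCII is a
            # subset of Unicode.
            return source.decode("ascii")
        except UnicodeDecodeError as exc:
            raise ValueError(
                "non-ASCII bytestring; decode it to text first"
            ) from exc
    raise TypeError("expected text or an ASCII-only bytestring")


if __name__ == "__main__":
    print(ensure_text(b"plain ascii"))
    print(ensure_text("\u20ac\xa3\xbd"))
```

 Failing early like this means bad input is rejected at the boundary
 instead of surfacing later as a decode error on user-submitted text.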

 Oh, and the reason your first example works:

 {{{
 >>> markdown.markdown(i)
 u'<p>\u20ac\xa3\xbd\n</p>'
 }}}

 is because the {{{markdown}}} wrapper function did not call either
 {{{Markdown.__str__}}} or {{{Markdown.__unicode__}}}, but did its own
 thing reimplementing most of {{{Markdown.__str__}}}. In other words, by
 using the wrapper (shortcut) in 1.6b, you get different behavior than if
 you call the class directly. That's buggy and I don't recommend it.

 Btw, using your example in 1.6b, you should have done this:

 {{{
 >>> markdown.markdown(i.encode('utf8'), encoding='utf8')
 }}}

 But even that was buggy. Shortly after Malcolm submitted a patch (which I
 applied), we threw away most of that convert-to-unicode stuff (we kept
 just enough only for use from the command line - including Malcolm's
 patch) and forced the requirement that Markdown only accept unicode.

 That's the difference here. 1.7 is the only version where we can be
 absolutely sure that it is safe to pass unicode text to markdown. It may
 or may not work in ''any'' earlier version. As a Markdown core dev, I will
 not guarantee that it will work all the time in anything but 1.7. Sure it
 may work fine for you in testing, but then some user will submit some text
 that fails and breaks your site. Debian/Ubuntu need to do the right thing
 here and provide 1.7 which passes a version test for 1.7.
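
 A caller that wants to act on that version boundary might gate on it
 roughly like this. This is only a sketch: {{{is_unicode_safe}}} is a
 hypothetical helper, and the version tuple is passed in as an argument
 because how a given Markdown release exposes its version number is not
 covered here.

```python
def is_unicode_safe(version_info):
    """Return True if the given (major, minor, ...) version is 1.7 or later.

    Hypothetical helper; only the first two components are compared, so
    suffixes such as a beta marker are ignored.
    """
    major, minor = version_info[0], version_info[1]
    return (major, minor) >= (1, 7)


if __name__ == "__main__":
    for candidate in [(1, 6, "b"), (1, 7), (2, 0, 1)]:
        print(candidate, is_unicode_safe(candidate))
```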

 In any event, it's up to the Django core devs to decide what to do in
 Django.

-- 
Ticket URL: <http://code.djangoproject.com/ticket/5663#comment:14>
Django Code <http://code.djangoproject.com/>
The web framework for perfectionists with deadlines