On 2013-03-06 14:18, Amaury Forgeot d'Arc wrote:
Hi,

2013/3/6 Matěj Cepl <mc...@redhat.com>


    On 2013-02-26, 16:25 GMT, Terry Reedy wrote:
     > On 2/21/2013 4:22 PM, Matej Cepl wrote:
     >> as my method to commemorate Aaron Swartz, I have decided to port his
     >> html2text to work fully with the latest python 3.3. After some time
     >> dealing with various bugs, I have now in my repo
     >> https://github.com/mcepl/html2text (branch python3) working solution
     >> which works all the way to python 3.2 (inclusive;
     >> https://travis-ci.org/mcepl/html2text). However, the last problem
     >> remains. This
     >>
     >> <li>Run this command:
     >> <pre>ls -l *.html</pre></li>
     >> <li>?</li>
     >>
     >> should lead to
     >>
     >>    * Run this command:
     >>
     >>          ls -l *.html
     >>
     >>    * ?
     >>
     >> but it doesn’t. It leads to this (with python 3.3 only)
     >>
     >>      * Run this command:
     >>            ls -l *.html
     >>
     >>      * ?
     >>
     >> Does anybody know about something which changed in modules re or
     >> http://docs.python.org/3.3/whatsnew/changelog.html between 3.2 and
     >> 3.3, which could influence this script?
     >
     > Search the changelog or 3.3 misc/News for items affecting those two
     > modules. There are at least 4.
     > http://docs.python.org/3.3/whatsnew/changelog.html
     >
     > It is faintly possible that the switch from narrow/wide builds to
     > unified builds somehow affected that. Have you tested with 2.7/3.2 on
     > both narrow and wide unicode builds?

    So, in the end, I went the long way and bisected cpython to
    find the commit which broke my tests, and it seems that the
    culprit is http://hg.python.org/cpython/rev/123f2dc08b3e so it is
    clearly something Unicode related.

    Unfortunately, that doesn't tell me what exactly is broken
    (is it a known regression?) or whether there is a known workaround.
    Could anybody suggest a way to find bugs on
    http://bugs.python.org related to a particular commit? (A plain
    search for 123f2dc0 didn’t find anything.)


I strongly suspect an incorrect usage of the "is" operator:
https://github.com/mcepl/html2text/blob/master/html2text.py#L95
Identity of strings is not guaranteed...

Does anything change if you use "==" instead?
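
To illustrate the point about "is" versus "==", here is a minimal sketch.
It is not taken from html2text; the helper name build() and the sample
strings are made up for the example. It only shows that two strings with
the same characters always compare equal with "==", while whether they
are the same object is an implementation detail that may differ between
interpreters and versions.

```python
def build(parts):
    # Hypothetical helper: assemble a string at run time, so the result
    # is computed rather than written as a literal in the source.
    return "".join(parts)


a = build(["pre", "fix"])
b = build(["pre", "fix"])

# Value comparison: True whenever the characters match.
equal_by_value = (a == b)

# Identity comparison: asks whether a and b are the very same object.
# The language does not guarantee either answer for equal strings,
# so code should not depend on it.
same_object = (a is b)

print("a == b:", equal_by_value)
print("a is b:", same_object, "(not guaranteed either way)")
```

If the comparison in the script is meant to ask "do these have the same
text?", "==" is the operator that expresses that.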

That function looks a little odd to me. Maybe I just don't understand
what it's doing! :-)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev