[issue9985] difflib.SequenceMatcher has slightly buggy and undocumented caching behavior
Christoph Burgmer cburg...@ira.uka.de added the comment: Here's a test case and a fix for get_matching_blocks() to return the same content on subsequent calls. -- keywords: +patch nosy: +christoph Added file: http://bugs.python.org/file19084/get_matching_blocks.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9985 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
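The message does not spell out what differed between calls, so the following is only a minimal sketch of the kind of check such a test case might make: it calls get_matching_blocks() twice on the same matcher and compares both the values and the element types. The sample strings and variable names are illustrative assumptions, not taken from the attached patch.

```python
from difflib import SequenceMatcher

# Illustrative inputs (assumed; not taken from the attached test case).
matcher = SequenceMatcher(None, "abxcd", "abcd")

first = matcher.get_matching_blocks()   # computed on the first call
second = matcher.get_matching_blocks()  # served from the internal cache

# The cached result should be indistinguishable from the fresh one,
# both in value and in the type of each element.
same_values = list(first) == list(second)
same_types = [type(b) for b in first] == [type(b) for b in second]
print(same_values, same_types)
```

Checking the element types as well as the values matters here because a plain tuple and a named tuple with equal fields compare equal.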
[issue9985] difflib.SequenceMatcher has slightly buggy and undocumented caching behavior
Christoph Burgmer cburg...@ira.uka.de added the comment:

BTW, here's the commit that broke the behavior in the first place: http://svn.python.org/view/python/trunk/Lib/difflib.py?r1=54230&r2=59907
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
Christoph Burgmer cburg...@ira.uka.de added the comment:

@Terry How is the behavior changed? To me it seems the same as initially reported. The results are consistent but nonetheless wrong. It's not about whether you agree with the result, but rather about following the Unicode standard.
[issue1602] windows console doesn't print utf8 (Py30a2)
Christoph Burgmer cburg...@ira.uka.de added the comment:

Will this bug be tackled for Python 2.7? And is there a way to get hold of the access denied error?

Here are my steps to reproduce. I started the console with

cmd /u /k chcp 65001

Aktive Codepage: 65001.
C:\Dokumente und Einstellungen\root>set PYTHONIOENCODING=UTF-8
C:\Dokumente und Einstellungen\root>d:
D:\>cd Python31
D:\Python31>python
Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("\u573a")
场
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 13] Permission denied

I see a rectangle on screen, but copy and paste obviously works.

-- nosy: +christoph
[issue8192] SQLite3 PRAGMA table_info doesn't respect database on Win32
New submission from Christoph Burgmer cburg...@ira.uka.de:

'PRAGMA database.table_info("SOME_TABLE_NAME")' will report table metadata for the given database. The main database, called 'main', can be extended by attaching further databases via 'ATTACH DATABASE'. The above PRAGMA should respect the chosen database, but fails to do so on Win32 (tested on Wine) while it does on Linux.

How to reproduce:

FILE 'first.db' has table:
CREATE TABLE First ( Test INTEGER NOT NULL );

FILE 'second.db' has table:
CREATE TABLE Second ( Test INTEGER NOT NULL );

The final result of the following code should be empty, but returns table data from second.db instead.

Y:\>python
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> conn = sqlite3.connect('first.db')
>>> c = conn.cursor()
>>> c.execute("ATTACH DATABASE 'second.db' AS 'second'")
<sqlite3.Cursor object at 0x0071FB00>
>>> for row in c:
...     print repr(row)
...
>>> c.execute("PRAGMA 'main'.table_info('Second')")
<sqlite3.Cursor object at 0x0071FB00>
>>> for row in c:
...     print repr(row)
...
(0, u'Test', u'INTEGER', 99, None, 0)

In contrast, sqlite3.exe respects the database name for the same command:

Y:\>sqlite3.exe first.db
SQLite version 3.6.23
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
First
sqlite> ATTACH DATABASE 'second.db' AS 'second';
sqlite> .tables
First
sqlite> PRAGMA main.table_info('Second');
sqlite> PRAGMA second.table_info('Second');
0|Test|INTEGER|1||0
sqlite>

Advice on further debugging possibilities is requested. I do not have a Windows system available though, nor can I currently compile for Win32.

-- components: Library (Lib) messages: 101440 nosy: christoph severity: normal status: open title: SQLite3 PRAGMA table_info doesn't respect database on Win32 versions: Python 2.6
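The reproduction above can be condensed into a self-contained sketch. It is an approximation of the report, not the reporter's script: it uses two in-memory databases instead of first.db and second.db, and the helper name table_info is an assumption introduced for illustration.

```python
import sqlite3

# Two in-memory databases stand in for first.db and second.db (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS second")
conn.execute("CREATE TABLE main.First (Test INTEGER NOT NULL)")
conn.execute("CREATE TABLE second.Second (Test INTEGER NOT NULL)")


def table_info(schema, table):
    """Hypothetical helper: run PRAGMA <schema>.table_info('<table>')."""
    # PRAGMA arguments cannot be bound as parameters, so the names are
    # interpolated; only use this with trusted, hard-coded identifiers.
    sql = "PRAGMA %s.table_info('%s')" % (schema, table)
    return conn.execute(sql).fetchall()


# Expected behaviour per the report: no rows for 'main', one row for 'second'.
main_rows = table_info("main", "Second")
second_rows = table_info("second", "Second")
print(main_rows)
print(second_rows)
```

If main_rows is not empty, the PRAGMA is ignoring the database qualifier, which is the symptom described for the Win32 build.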
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
Christoph Burgmer cburg...@ira.uka.de added the comment:

Implementing full patch solving it the old way (UTR#21). The correct way for the latest Unicode version would be to implement the word breaking algorithm described in (UAX#29) [1] first.

[1] http://www.unicode.org/reports/tr29/#Word_Boundaries

-- Added file: http://bugs.python.org/file14890/unicodeobject.titlecase.2.diff
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
Christoph Burgmer cburg...@ira.uka.de added the comment:

I should add that I didn't include the two header files generated by Tools/unicode/makeunicodedata.py
[issue6656] locale.format_string fails on escaped percentage
New submission from Christoph Burgmer cburg...@ira.uka.de:

locale.format_string doesn't return the same result as a normal string % format directive, but raises a TypeError. See attached test case for Python 2.6.

>>> locale.format_string('%f%%', 1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/locale.py", line 195, in format_string
    return new_f % val
TypeError: not enough arguments for format string
>>> '%f%%' % 1.0
'1.000000%'

-- components: Library (Lib) files: locale_percents_test.diff keywords: patch messages: 91352 nosy: christoph severity: normal status: open title: locale.format_string fails on escaped percentage versions: Python 2.5, Python 2.6 Added file: http://bugs.python.org/file14665/locale_percents_test.diff
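A minimal sketch of the comparison the report makes, written so that it runs whether or not the installed Python still has the bug. The variable names are illustrative, and the default "C" locale is assumed (no setlocale call is made).

```python
import locale

fmt, value = "%f%%", 1.0

# Plain % formatting treats '%%' as an escaped percent sign.
plain = fmt % value

# locale.format_string should agree; on affected versions it raised TypeError.
try:
    localized = locale.format_string(fmt, value)
except TypeError as exc:
    localized = None
    print("format_string failed:", exc)

print(plain, localized)
```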
[issue6656] locale.format_string fails on escaped percentage
Christoph Burgmer cburg...@ira.uka.de added the comment:

This patch removes '%%' entities from the regex results and replaces only the other matches with '%s', which are later replaced by localized versions, so that escaped percentage entities no longer show up in localized parsing.

Removing the '%%' case from the regex completely does not sound feasible: '%%d' would then produce a match '%d', though d should be a normal character.

The replacing of regex matches does not look that beautiful; feel free to rewrite said part.

-- Added file: http://bugs.python.org/file14666/locale_percents.diff
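The idea described above can be sketched as follows. This is not the attached patch: the pattern is a simplified stand-in for the one used inside locale.py, and split_specifiers is a hypothetical helper name chosen for illustration.

```python
import re

# Simplified stand-in for the format-specifier pattern in locale.py (assumption).
PERCENT_RE = re.compile(
    r"%(?:\((?P<key>.*?)\))?(?P<modifiers>[-#0-9 +*.hlL]*?)[eEfFgGdiouxXcrs%]"
)


def split_specifiers(fmt):
    """Replace every real specifier with '%s' but keep '%%' untouched.

    Returns the rewritten template and the list of specifiers that would
    then be formatted one by one with locale-aware rules.
    """
    specifiers = []

    def replace(match):
        if match.group() == "%%":
            return "%%"  # escaped percent: not a specifier, leave as is
        specifiers.append(match.group())
        return "%s"

    return PERCENT_RE.sub(replace, fmt), specifiers


print(split_specifiers("%f%%"))
print(split_specifiers("%%d"))
```

Keeping '%%' in the pattern and skipping it in the callback is what prevents '%%d' from being misread as the specifier '%d'.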
[issue6625] UnicodeEncodeError on pydoc's CLI
Christoph Burgmer cburg...@ira.uka.de added the comment:

Here is a diff for test/test_pydoc.py (against Python 2.6), which however doesn't trigger the bug because of how Python handles output encoding. This test will pass, but pydoc will still fail:

$ pydoc test/pydoc_mod.py > /dev/null
Traceback (most recent call last):
  File "/usr/bin/pydoc", line 5, in <module>
    pydoc.cli()
  File "/usr/lib/python2.5/pydoc.py", line 2226, in cli
    help.help(arg)
  File "/usr/lib/python2.5/pydoc.py", line 1691, in help
    else: doc(request, 'Help on %s:')
  File "/usr/lib/python2.5/pydoc.py", line 1482, in doc
    pager(title % desc + '\n\n' + text.document(object, name))
  File "/usr/lib/python2.5/pydoc.py", line 1300, in pager
    pager(text)
  File "/usr/lib/python2.5/pydoc.py", line 1398, in plainpager
    sys.stdout.write(plain(text))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 936: ordinal not in range(128)

-- Added file: http://bugs.python.org/file14656/pydoc_unicode_testcase_notworking.diff
[issue6625] UnicodeEncodeError on pydoc's CLI
New submission from Christoph Burgmer cburg...@ira.uka.de:

pydoc fails with a UnicodeEncodeError for properly specified Unicode docstrings (u"...") on the command line interface. See attached patch that encodes the output with the system's encoding.

-- components: Extension Modules files: unicode.patch keywords: patch messages: 91182 nosy: christoph severity: normal status: open title: UnicodeEncodeError on pydoc's CLI versions: Python 2.5, Python 2.6 Added file: http://bugs.python.org/file14626/unicode.patch
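Encoding the output with the system's encoding could look roughly like the sketch below. It is not the attached unicode.patch; the function name encode_for_output and the choice of the "replace" error handler are assumptions made for illustration.

```python
import locale
import sys


def encode_for_output(text, stream=None):
    """Hypothetical helper: encode text for a stream without raising.

    Uses the stream's declared encoding when it has one and falls back to
    the locale's preferred encoding otherwise. Characters that cannot be
    represented are replaced instead of raising UnicodeEncodeError.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or locale.getpreferredencoding()
    return text.encode(encoding, "replace")


class AsciiStream(object):
    """Stand-in for a redirected stdout that only accepts ASCII."""
    encoding = "ascii"


print(encode_for_output(u"M\xfcller", AsciiStream()))
```

With an ASCII-only stream, the u-umlaut is replaced by a question mark rather than aborting the whole help page.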
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
Christoph Burgmer cburg...@ira.uka.de added the comment:

Casing algorithms should follow Section 3.13 Default Case Algorithms in the standard itself, not UTR#21. See http://www.unicode.org/Public/5.2.0/ucd/DerivedCoreProperties-5.2.0d11. Unicode 5.2.

A nice mail on the Unicode mailing list gives a bit of explanation on that: http://www.unicode.org/mail-arch/unicode-ml/y2009-
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
New submission from Christoph Burgmer cburg...@ira.uka.de:

Titlecase, i.e. istitle() and title(), is buggy when the string includes combining diacritical marks.

>>> u'H\u0301ngh'.istitle()
False
>>> u'H\u0301ngh'.title()
u'H\u0301Ngh'

The string given is already in titlecase, so the following result is expected:

>>> u'H\u0301ngh'.istitle()
True
>>> u'H\u0301ngh'.title()
u'H\u0301ngh'

UTR#21 Case Mappings defines the following algorithm for titlecase mapping [1]:

For each character C, find the preceding character B.
    ignore any intervening case-ignorable characters when finding B.
If B exists, and is cased
    map C to UCD_lower(C)
Otherwise,
    map C to UCD_title(C)

The class of 'case-ignorable' is defined under [2] and includes Nonspacing Marks (Mn) as listed in [3]. This includes diacritical marks and others. Currently these are handled like spaces and therefore divide words, which they should not.

A patch including the above test case is attached.

[1] http://unicode.org/reports/tr21/tr21-5.html#Case_Conversion_of_Strings
[2] http://unicode.org/reports/tr21/tr21-5.html#Definitions
[3] http://www.fileformat.info/info/unicode/category/Mn/list.htm

-- components: Library (Lib) files: test_unicode.titlecase.diff keywords: patch messages: 90086 nosy: christoph severity: normal status: open title: Titlecase as defined in Unicode Case Mappings not followed versions: Python 2.5, Python 2.6 Added file: http://bugs.python.org/file14443/test_unicode.titlecase.diff
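The UTR#21 algorithm quoted above can be sketched in pure Python as follows. This is an illustration, not the attached patch: the function names are hypothetical, and both "cased" and "case-ignorable" are approximated through general categories from unicodedata, which is coarser than the definitions in the report.

```python
import unicodedata

# Approximations (assumptions): the real definitions are more detailed.
CASE_IGNORABLE_CATEGORIES = ("Mn", "Me", "Cf", "Lm", "Sk")
CASED_CATEGORIES = ("Lu", "Ll", "Lt")


def is_case_ignorable(ch):
    return unicodedata.category(ch) in CASE_IGNORABLE_CATEGORIES


def is_cased(ch):
    return unicodedata.category(ch) in CASED_CATEGORIES


def utr21_title(text):
    """Titlecase text, looking past case-ignorable characters to find B."""
    result = []
    previous_is_cased = False  # state of B, the last non-ignorable character
    for ch in text:
        # If B exists and is cased, lowercase C; otherwise titlecase C.
        result.append(ch.lower() if previous_is_cased else ch.title())
        # Case-ignorable characters (e.g. combining marks) never become B.
        if not is_case_ignorable(ch):
            previous_is_cased = is_cased(ch)
    return "".join(result)


print(ascii(utr21_title(u"H\u0301ngh")))
print(utr21_title(u"hello world"))
```

The combining acute accent is skipped when updating the state, so the "n" that follows it still sees the cased "H" as its predecessor and stays lowercase.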
[issue6412] Titlecase as defined in Unicode Case Mappings not followed
Christoph Burgmer cburg...@ira.uka.de added the comment:

Adding an incomplete patch in need of a function Py_UNICODE_ISCASEIGNORABLE defining the case-ignorable class.

I don't want to touch capitalize() as I don't fully understand its semantics and where it differs from title(). It seems, though, that following UTR#21 it is not the first character that should be uppercased, but the first character with casing.

-- Added file: http://bugs.python.org/file1/unicodeobject.titlecase.diff
[issue1293741] doctest runner cannot handle non-ascii characters
Christoph Burgmer cburg...@ira.uka.de added the comment:

My last patch only changed the encoding used in DocTestRunner.run(). This new patch will apply the same to DocTestCase.runTest().

-- Added file: http://bugs.python.org/file14422/doctest.unicode.patch
[issue3955] maybe doctest doesn't understand unicode_literals?
Christoph Burgmer cburg...@ira.uka.de added the comment:

JFTR: To yield the results of my last comment, you need to apply the patch posted in http://bugs.python.org/issue1293741
[issue3955] maybe doctest doesn't understand unicode_literals?
Christoph Burgmer cburg...@ira.uka.de added the comment:

This problem seems more severe, as the appended test case shows. That gives me:

Expected:
    u'ī'
Got:
    u'\u012b'

Both literals are the same.

Unicode literals in docstrings are not treated like other escaped characters:

>>> repr(r'\n')
"'\\\\n'"
>>> repr('\n')
"'\\n'"

but:

>>> repr(ur'\u012b')
"u'\\u012b'"
>>> repr(u'\u012b')
"u'\\u012b'"

So there is no workaround in the docstring's reference itself.

I file this here, even though the problems are not strictly equal. I do believe, though, that there is or should be a common solution to these issues. Both results need to be interpreted on a more abstract scale.

-- Added file: http://bugs.python.org/file14406/test.py
[issue1293741] doctest runner cannot handle non-ascii characters
Christoph Burgmer cburg...@ira.uka.de added the comment:

See attached patch which works for error reporting and verbose output.

-- keywords: +patch nosy: +christoph Added file: http://bugs.python.org/file14407/doctest.unicode.patch
[issue3955] maybe doctest doesn't understand unicode_literals?
Christoph Burgmer cburg...@ira.uka.de added the comment:

OutputChecker.check_output() seems to be responsible for comparing the 'example.want' and 'got' literals, and this is obviously done literally. So as u'1' is different from '1', this is reflected in the result. This gets more complicated with literals like [u'1', u'2'], I believe. So eval() could be used for testing for equality:

>>> repr(['1', '2']) == repr([u'1', u'2'])
False

but

>>> eval(repr(['1', '2'])) == eval(repr([u'1', u'2']))
True

doctests are already compiled and executed, but evaluating the doctest code's result is probably a security issue, so a method doing the inverse of repr() could be used that only works on variables; something like pickle, but without its own protocol.

-- nosy: +christoph
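One candidate for a restricted "inverse of repr()" is ast.literal_eval from the standard library, which accepts only literals and containers of literals. The sketch below is an assumption about how the idea could be wired into doctest, not something proposed in the thread; the class name LiteralOutputChecker is hypothetical.

```python
import ast
import doctest


class LiteralOutputChecker(doctest.OutputChecker):
    """Hypothetical checker: fall back to comparing parsed literals."""

    def check_output(self, want, got, optionflags):
        # Keep the standard textual comparison as the first attempt.
        if doctest.OutputChecker.check_output(self, want, got, optionflags):
            return True
        try:
            # literal_eval only parses literals, so no doctest code runs here.
            return ast.literal_eval(want.strip()) == ast.literal_eval(got.strip())
        except (ValueError, SyntaxError, TypeError):
            return False


checker = LiteralOutputChecker()
print(checker.check_output("['1', '2']\n", "[u'1', u'2']\n", 0))
print(checker.check_output("1\n", "2\n", 0))
```

Such a checker could be passed to doctest.DocTestRunner via its checker argument; output that is not a plain literal simply falls back to the usual textual comparison result.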
[issue2517] Error when printing an exception containing a Unicode string
Christoph Burgmer [EMAIL PROTECTED] added the comment:

JFTR: print unicode(e.message).encode("utf-8") only works from Python 2.5 on, not in earlier versions.
[issue2517] Error when printing an exception containing a Unicode string
Christoph Burgmer [EMAIL PROTECTED] added the comment:

To be more precise: I see no easy way to convert the encapsulated non-ASCII data from the exception into a string. Taking e from my last post, none of the following will work:

str(e)                # UnicodeDecodeError
e.__str__()           # UnicodeDecodeError
e.__unicode__()       # AttributeError
unicode(e)            # UnicodeDecodeError
unicode(e, 'utf8')    # TypeError

My workaround right now is raising an exception with an already converted string (see the link I provided). But as the tutorials speak of simply using "print e", I guess the behaviour described above is some kind of a bug.
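A possible way around the failing conversions listed above is to read the exception's first argument directly instead of going through str(). The helper below is a sketch under that assumption; exception_text is a hypothetical name, and the code is written to run on Python 2.6+ as well as Python 3.

```python
def exception_text(exc, encoding="utf-8"):
    """Hypothetical helper: return the first exception argument as text."""
    if not exc.args:
        return u""
    arg = exc.args[0]
    if isinstance(arg, bytes):
        # Decode byte strings explicitly rather than via the ASCII default.
        return arg.decode(encoding, "replace")
    if isinstance(arg, type(u"")):
        # Already a text (unicode) string: hand it back unchanged.
        return arg
    return u"%r" % (arg,)


try:
    raise Exception(u"Error when printing \xfc")
except Exception as e:
    message = exception_text(e)

print(repr(message))
```

The caller can then encode the returned text for the terminal explicitly, which avoids the implicit ASCII conversion that triggers the errors shown above.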
[issue2517] Error when printing an exception containing a Unicode string
Christoph Burgmer [EMAIL PROTECTED] added the comment:

Thanks, this does work.

But where in the docs can I find the piece of information you just gave me? I couldn't find any interface definition for Exceptions.

Furthermore, will this be regarded as a bug? From [1] I understand that unicode(e) and unicode(e, 'utf8') are supposed to work. No limitations are made on the type of the object. And I suppose that unicode() is the exact equivalent of str() in that it copes with unicode strings.

It seems needlessly cumbersome that the string representation of an Exception cannot be returned as a Unicode string when its content is non-ASCII, whereas this kind of simple string conversion is expected to work with ASCII text.

Please reopen if my report does have a point.

[1] http://docs.python.org/lib/built-in-funcs.html
[issue2517] Error when printing an exception containing a Unicode string
Christoph Burgmer [EMAIL PROTECTED] added the comment:

Though I welcome the reopening of the bug for Python 3.0, I must say that plans of not fixing a core element rather surprise me.

I never believed Python to be a programming language with good Unicode integration. Several points were missing that would have been nice or even essential to have for good development with Unicode, most of them ignored for the sake of maintaining backward compatibility. This, though, is not the fault of the Unicode class itself and its supporting packages.

Some modules, like the one for CSV, are lacking full Unicode support. But nevertheless basic Python would always give you the possibility to use Unicode in (at least) a consistent way. For me, raising exceptions does count as basic support like this.

So I still hope to see this solved for the 2.x versions, which I read will be maintained even after the release of 3.0.
[issue2517] Error when printing an exception containing a Unicode string
New submission from Christoph Burgmer [EMAIL PROTECTED]:

Python seems to have problems when an exception is thrown that contains non-ASCII text as a message and is converted to a string.

>>> try:
...     raise Exception(u'Error when printing ü')
... except Exception, e:
...     print e
...
Traceback (most recent call last):
  File "<stdin>", line 4, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 20: ordinal not in range(128)

See http://www.stud.uni-karlsruhe.de/~uyhc/de/content/python-and-exceptions-containing-unicode-messages

-- components: Unicode messages: 64770 nosy: christoph severity: normal status: open title: Error when printing an exception containing a Unicode string type: behavior versions: Python 2.4, Python 2.5