[issue9985] difflib.SequenceMatcher has slightly buggy and undocumented caching behavior

2010-10-01 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Here's a test case and a fix for get_matching_blocks() to return the same 
content on subsequent calls.
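
For illustration only (this is not the attached test case), a minimal check along these lines might look as follows. It assumes the inconsistency shows up in the content or the type of the returned blocks; the input strings are arbitrary.

```python
from difflib import SequenceMatcher

# Arbitrary example inputs; any two sequences would do.
matcher = SequenceMatcher(None, "abxcd", "abcd")

# The first call computes the blocks, the second one is served from the cache.
first = matcher.get_matching_blocks()
second = matcher.get_matching_blocks()

# Both calls should yield equal blocks of the same type.
same_content = list(first) == list(second)
same_types = [type(m) for m in first] == [type(m) for m in second]
print(same_content, same_types)
```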

--
keywords: +patch
nosy: +christoph
Added file: http://bugs.python.org/file19084/get_matching_blocks.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9985
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9985] difflib.SequenceMatcher has slightly buggy and undocumented caching behavior

2010-10-01 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

BTW, here's the commit that broke the behavior in the first place: 
http://svn.python.org/view/python/trunk/Lib/difflib.py?r1=54230&r2=59907

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9985
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2010-08-04 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

@Terry

How has the behavior changed? To me it seems the same as initially reported.
The results are consistent but nonetheless wrong. It's not about whether you 
agree with the result, but rather about following the Unicode standard.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue1602] windows console doesn't print utf8 (Py30a2)

2010-06-19 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Will this bug be tackled for Python 2.7?

And is there a way to get hold of the access denied error?

Here are my steps to reproduce:

I started the console with cmd /u /k chcp 65001
___
Aktive Codepage: 65001.

C:\Dokumente und Einstellungen\root>set PYTHONIOENCODING=UTF-8

C:\Dokumente und Einstellungen\root>d:

D:\>cd Python31

D:\Python31>python
Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("\u573a")
场
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 13] Permission denied

___

I see a rectangle on screen, but c&p obviously works.

--
nosy: +christoph

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1602
___



[issue8192] SQLite3 PRAGMA table_info doesn't respect database on Win32

2010-03-21 Thread Christoph Burgmer

New submission from Christoph Burgmer cburg...@ira.uka.de:

'PRAGMA database.table_info(SOME_TABLE_NAME)' will report table metadata for 
the given database. The main database, called 'main', can be extended by 
attaching further databases via 'ATTACH DATABASE'. The above PRAGMA should 
respect the chosen database, but fails to do so on Win32 (tested on Wine), while 
it does so on Linux.

How to reproduce:

FILE 'first.db' has table:

  CREATE TABLE First (
  Test INTEGER NOT NULL
  );

FILE 'second.db' has table:

  CREATE TABLE Second (
  Test INTEGER NOT NULL
  );

The final result of the following code should be empty, but it returns table data 
from second.db instead.

Y:\>python
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> conn = sqlite3.connect('first.db')
>>> c = conn.cursor()
>>> c.execute("ATTACH DATABASE 'second.db' AS 'second'")
<sqlite3.Cursor object at 0x0071FB00>
>>> for row in c:
...     print repr(row)
...
>>> c.execute("PRAGMA 'main'.table_info('Second')")
<sqlite3.Cursor object at 0x0071FB00>
>>> for row in c:
...     print repr(row)
...
(0, u'Test', u'INTEGER', 99, None, 0)


In contrast sqlite3.exe respects the value for the same command:

Y:\>sqlite3.exe first.db
SQLite version 3.6.23
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
First
sqlite> ATTACH DATABASE 'second.db' AS 'second';
sqlite> .tables
First
sqlite> PRAGMA main.table_info('Second');
sqlite> PRAGMA second.table_info('Second');
0|Test|INTEGER|1||0
sqlite>

Advice on further debugging possibilities is requested. I do not have a Windows 
system available though, nor can I currently compile for Win32.
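
For anyone who wants to check this on another platform, the steps above can be condensed into a small self-contained script. This is only a sketch: it uses in-memory databases instead of the files first.db and second.db, and the helper table_info() is a hypothetical name, not part of the sqlite3 module.

```python
import sqlite3


def table_info(conn, schema, table):
    """Return the rows of PRAGMA <schema>.table_info(<table>) as a list."""
    # The statement is built by string formatting for brevity; schema and
    # table are trusted names in this sketch.
    sql = "PRAGMA %s.table_info('%s')" % (schema, table)
    return conn.execute(sql).fetchall()


# 'main' holds table First, the attached database 'second' holds table Second.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE First (Test INTEGER NOT NULL)")
conn.execute("ATTACH DATABASE ':memory:' AS second")
conn.execute("CREATE TABLE second.Second (Test INTEGER NOT NULL)")

# Expected: nothing for 'main', one column description for 'second'.
main_rows = table_info(conn, "main", "Second")
second_rows = table_info(conn, "second", "Second")
print(main_rows)
print(second_rows)
conn.close()
```

If main_rows is not empty, the PRAGMA ignored the database name, which is the behavior reported here.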

--
components: Library (Lib)
messages: 101440
nosy: christoph
severity: normal
status: open
title: SQLite3 PRAGMA table_info doesn't respect database on Win32
versions: Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8192
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2009-09-14 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Here is a full patch that solves it the old way (UTR#21).

The correct way for the latest Unicode version would be to implement
the word breaking algorithm described in (UAX#29) [1] first.

[1] http://www.unicode.org/reports/tr29/#Word_Boundaries

--
Added file: http://bugs.python.org/file14890/unicodeobject.titlecase.2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2009-09-14 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

I should add that I didn't include the two header files generated by
Tools/unicode/makeunicodedata.py

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue6656] locale.format_string fails on escaped percentage

2009-08-06 Thread Christoph Burgmer

New submission from Christoph Burgmer cburg...@ira.uka.de:

locale.format_string doesn't return the same result as a normal
string % format directive when the format contains an escaped percent
sign, but raises a TypeError. See attached test case for Python 2.6.

>>> locale.format_string('%f%%', 1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/locale.py", line 195, in format_string
    return new_f % val
TypeError: not enough arguments for format string
>>> '%f%%' % 1.0
'1.000000%'

--
components: Library (Lib)
files: locale_percents_test.diff
keywords: patch
messages: 91352
nosy: christoph
severity: normal
status: open
title: locale.format_string fails on escaped percentage
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file14665/locale_percents_test.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6656
___



[issue6656] locale.format_string fails on escaped percentage

2009-08-06 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

This patch removes '%%' entities from the regex results and replaces
only the other matches with '%s'. Those are later replaced by their
localized versions, so escaped percent signs no longer show up in
localized parsing.

Removing the '%%' case from the regex completely does not seem feasible:
'%%d' would then yield a match for '%d', although d should be a
normal character.

The replacing of regex matches does not look that beautiful; feel free
to rewrite said part.
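
To make the idea concrete, here is a rough stand-alone sketch of the approach. It is not the attached patch: the pattern is a simplified assumption (no mapping keys, no '*' widths), and localized_format() and format_one are hypothetical names.

```python
import re

# Simplified pattern: matches conversions such as %d or %5.2f, and the
# escaped percent sign %%.
_PERCENT_RE = re.compile(r'%[-#0-9 +.hlL]*?[eEfFgGdiouxXcrs%]')


def _keep_escaped(match):
    # Leave '%%' untouched, turn every real conversion into '%s'.
    return match.group() if match.group() == '%%' else '%s'


def localized_format(fmt, values, format_one=lambda spec, value: spec % value):
    """Format each value on its own, then splice the results back in.

    format_one is where a locale-aware formatter would be plugged in; the
    default simply applies the plain % operator.
    """
    specs = [m.group() for m in _PERCENT_RE.finditer(fmt)
             if m.group() != '%%']
    new_fmt = _PERCENT_RE.sub(_keep_escaped, fmt)
    formatted = tuple(format_one(spec, value)
                      for spec, value in zip(specs, values))
    return new_fmt % formatted


print(localized_format('%f%%', (1.0,)))
print(localized_format('%%d', ()))
```

Because '%%' stays in the rewritten format string, the final % operation turns it back into a single percent sign, and '%%d' is never mistaken for a '%d' conversion.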

--
Added file: http://bugs.python.org/file14666/locale_percents.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6656
___



[issue6625] UnicodeEncodeError on pydoc's CLI

2009-08-05 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Here is a diff for test/test_pydoc.py (against Python 2.6). The test
does not trigger the bug, though, because of how Python handles output
encoding: it will pass, but pydoc will still fail:

$ pydoc test/pydoc_mod.py > /dev/null
Traceback (most recent call last):
  File "/usr/bin/pydoc", line 5, in <module>
    pydoc.cli()
  File "/usr/lib/python2.5/pydoc.py", line 2226, in cli
    help.help(arg)
  File "/usr/lib/python2.5/pydoc.py", line 1691, in help
    else: doc(request, 'Help on %s:')
  File "/usr/lib/python2.5/pydoc.py", line 1482, in doc
    pager(title % desc + '\n\n' + text.document(object, name))
  File "/usr/lib/python2.5/pydoc.py", line 1300, in pager
    pager(text)
  File "/usr/lib/python2.5/pydoc.py", line 1398, in plainpager
    sys.stdout.write(plain(text))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in
position 936: ordinal not in range(128)

--
Added file: 
http://bugs.python.org/file14656/pydoc_unicode_testcase_notworking.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6625
___



[issue6625] UnicodeEncodeError on pydoc's CLI

2009-08-02 Thread Christoph Burgmer

New submission from Christoph Burgmer cburg...@ira.uka.de:

pydoc fails with a UnicodeEncodeError for properly specified Unicode
docstrings (u"...") on the command line interface.

See attached patch that encodes the output with the system's encoding.

--
components: Extension Modules
files: unicode.patch
keywords: patch
messages: 91182
nosy: christoph
severity: normal
status: open
title: UnicodeEncodeError on pydoc's CLI
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file14626/unicode.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6625
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2009-07-16 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Casing algorithms should follow Section 3.13 Default Case Algorithms
in the standard itself, not UTR#21.

See
http://www.unicode.org/Public/5.2.0/ucd/DerivedCoreProperties-5.2.0d11.
Unicode 5.2. A nice mail on the Unicode mailing list explains this a
bit: http://www.unicode.org/mail-arch/unicode-ml/y2009-

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2009-07-03 Thread Christoph Burgmer

New submission from Christoph Burgmer cburg...@ira.uka.de:

Titlecase, i.e. istitle() and title(), is buggy when the string
includes combining diacritical marks.

>>> u'H\u0301ngh'.istitle()
False
>>> u'H\u0301ngh'.title()
u'H\u0301Ngh'


The string given already is in titlecase, so the following result
is expected:
>>> u'H\u0301ngh'.istitle()
True
>>> u'H\u0301ngh'.title()
u'H\u0301ngh'


UTR#21 Case Mappings defines the following algorithm for titlecase
mapping [1]:

For each character C, find the preceding character B. 
  ignore any intervening case-ignorable characters when finding B.
If B exists, and is cased 
  map C to UCD_lower(C)
Otherwise, 
  map C to UCD_title(C)

The class of 'case-ignorable' characters is defined under [2] and
includes Nonspacing Marks (Mn) as listed in [3]. This covers diacritical
marks and others. They are currently handled like spaces and thus
divide words, which they should not.

A patch including the above test case is attached.
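
As an illustration of the algorithm quoted above (not the attached patch), a pure-Python sketch could look like this. It assumes that only Nonspacing Marks (Mn) are case-ignorable, although the full class is larger, and that "cased" means general category Lu, Ll or Lt; all function names are hypothetical.

```python
import unicodedata


def is_case_ignorable(ch):
    # Assumption: only nonspacing marks (Mn) are treated as case-ignorable.
    return unicodedata.category(ch) == "Mn"


def is_cased(ch):
    # Assumption: cased means uppercase, lowercase or titlecase letter.
    return unicodedata.category(ch) in ("Lu", "Ll", "Lt")


def utr21_title(s):
    """Titlecase s, skipping case-ignorable characters when looking back."""
    out = []
    prev_cased = False  # is B, the preceding non-ignorable character, cased?
    for ch in s:
        if is_case_ignorable(ch):
            out.append(ch)  # copied through, does not change B
            continue
        out.append(ch.lower() if prev_cased else ch.title())
        prev_cased = is_cased(ch)
    return u"".join(out)


print(utr21_title(u"H\u0301ngh") == u"H\u0301ngh")
print(utr21_title(u"hello world"))
```

The combining acute accent no longer resets the word state, so the n following it stays lowercase.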

[1]
http://unicode.org/reports/tr21/tr21-5.html#Case_Conversion_of_Strings
[2] http://unicode.org/reports/tr21/tr21-5.html#Definitions
[3] http://www.fileformat.info/info/unicode/category/Mn/list.htm

--
components: Library (Lib)
files: test_unicode.titlecase.diff
keywords: patch
messages: 90086
nosy: christoph
severity: normal
status: open
title: Titlecase as defined in Unicode Case Mappings not followed
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file14443/test_unicode.titlecase.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue6412] Titlecase as defined in Unicode Case Mappings not followed

2009-07-03 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

Adding an incomplete patch that still needs a function
Py_UNICODE_ISCASEIGNORABLE defining the case-ignorable class.

I don't want to touch capitalize() as I don't fully understand its
semantics and where it differs from title(). Following UTR#21, though,
it seems that not the first character should be uppercased, but the
first character with casing.

--
Added file: http://bugs.python.org/file1/unicodeobject.titlecase.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6412
___



[issue1293741] doctest runner cannot handle non-ascii characters

2009-07-01 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

My last patch only changed the encoding used in DocTestRunner.run().
This new patch will apply the same to DocTestCase.runTest().

--
Added file: http://bugs.python.org/file14422/doctest.unicode.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1293741
___



[issue3955] maybe doctest doesn't understand unicode_literals?

2009-07-01 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

JFTR: To yield the results of my last comment, you need to apply the
patch posted in http://bugs.python.org/issue1293741

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3955
___



[issue3955] maybe doctest doesn't understand unicode_literals?

2009-06-30 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

This problem seems more severe as the appended test case shows.

That gives me:

Expected:
    u'ī'
Got:
    u'\u012b'

Both literals are the same.

Unicode literals in doc strings are not treated as other escaped
characters: 

>>> repr(r'\n')
"'\\\\n'"
>>> repr('\n')
"'\\n'"

but:

>>> repr(ur'\u012b')
"u'\\u012b'"
>>> repr(u'\u012b')
"u'\\u012b'"

So there is no workaround in the docstring's reference itself.

I file this here, even though the problems are not strictly equal. I do
believe though that there is or should be a common solution to these
issues. Both results need to be interpreted on a more abstract scale.

--
Added file: http://bugs.python.org/file14406/test.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3955
___



[issue1293741] doctest runner cannot handle non-ascii characters

2009-06-30 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

See attached patch which works for error reporting and verbose output.

--
keywords: +patch
nosy: +christoph
Added file: http://bugs.python.org/file14407/doctest.unicode.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1293741
___



[issue3955] maybe doctest doesn't understand unicode_literals?

2009-06-29 Thread Christoph Burgmer

Christoph Burgmer cburg...@ira.uka.de added the comment:

OutputChecker.check_output() seems to be responsible for comparing
'example.want' and 'got' literals and this is obviously done literally.
So as u'1' is different to '1' this is reflected in the result.
This gets more complicated with literals like [u'1', u'2'] I believe.
So, eval() could be used for testing for equality:

>>> repr(['1', '2']) == repr([u'1', u'2'])
False

but

>>> eval(repr(['1', '2'])) == eval(repr([u'1', u'2']))
True

doctests are already compiled and executed, but evaluating the doctest
code's result is probably a security issue, so a method doing the
inverse of repr() could be used, one that only works on variables;
something like Pickle, but without its own protocol.
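
One candidate for such an inverse of repr() is ast.literal_eval() from the standard library, which only accepts literals and containers of literals. The following sketch is written for Python 3, and outputs_match() is a hypothetical helper, not part of doctest.

```python
import ast


def outputs_match(want, got):
    """Compare two output strings, first literally, then as Python literals."""
    if want == got:
        return True
    try:
        # literal_eval only parses literals (strings, numbers, lists, ...),
        # so no arbitrary code from the doctest output is executed.
        return ast.literal_eval(want) == ast.literal_eval(got)
    except (ValueError, SyntaxError, TypeError):
        # At least one side is not a plain literal: they do not match.
        return False


print(outputs_match("['1', '2']", "[u'1', u'2']"))
print(outputs_match("foo bar", "foo baz"))
```

Output that is not a literal still falls back to the plain string comparison, so the rest of the doctest behavior would stay as it is.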

--
nosy: +christoph

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3955
___



[issue2517] Error when printing an exception containing a Unicode string

2008-04-02 Thread Christoph Burgmer

Christoph Burgmer [EMAIL PROTECTED] added the comment:

JFTR:
>>> print unicode(e.message).encode("utf-8")
works only in Python 2.5, not in earlier versions.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2517
__



[issue2517] Error when printing an exception containing a Unicode string

2008-03-31 Thread Christoph Burgmer

Christoph Burgmer [EMAIL PROTECTED] added the comment:

To be more precise: I see no easy way to convert the encapsulated 
non-ASCII data to a string.
Taking e from my last post, none of the following will work:
str(e) # UnicodeDecodeError
e.__str__() # UnicodeDecodeError
e.__unicode__() # AttributeError
unicode(e) # UnicodeDecodeError
unicode(e, 'utf8') # TypeError

My solution around this right now is raising an exception with an 
already converted string (see the link I provided).

But as the tutorials speak of simply "print e", I guess the behaviour 
described above is some kind of bug.
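
A helper that reads the exception's arguments directly, instead of relying on str(e), might look like the sketch below. It is written for Python 3, where str is already Unicode, and is not the workaround from the linked page; exception_text() is a hypothetical name.

```python
def exception_text(exc, encoding="utf-8"):
    """Build a text message from exc.args without an implicit ASCII step."""
    parts = []
    for arg in exc.args:
        if isinstance(arg, bytes):
            # Byte strings are decoded explicitly with a known encoding.
            parts.append(arg.decode(encoding, "replace"))
        else:
            parts.append(str(arg))
    return u" ".join(parts)


try:
    raise Exception(u"Error when printing \xfc")
except Exception as e:
    message = exception_text(e)

print(message == u"Error when printing \xfc")
```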

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2517
__



[issue2517] Error when printing an exception containing a Unicode string

2008-03-31 Thread Christoph Burgmer

Christoph Burgmer [EMAIL PROTECTED] added the comment:

Thanks, this does work.

But, where can I find the piece of information you just gave to me in 
the docs? I couldn't find any interface definition for Exceptions.

Furthermore, will this be regarded as a bug?
From [1] I understand that unicode(e) and unicode(e, 'utf8') are 
supposed to work. No limitations are placed on the type of the object. 
And I suppose that unicode() is the exact equivalent of str() in that 
it copes with Unicode strings. It seems needlessly cumbersome that the 
string representation of an Exception cannot be a Unicode string when 
its content is non-ASCII, whereas this kind of simple string conversion 
is expected to work for ASCII text.

Please reopen if my report does have a point.

[1] http://docs.python.org/lib/built-in-funcs.html

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2517
__



[issue2517] Error when printing an exception containing a Unicode string

2008-03-31 Thread Christoph Burgmer

Christoph Burgmer [EMAIL PROTECTED] added the comment:

Though I welcome the reopening of the bug for Python 3.0, I must say 
that plans not to fix a core element rather surprise me.

I never believed Python to be a programming language with good Unicode 
integration. Several points were missing that would've been nice or 
even essential to have for good development with Unicode, most ignored 
for the sake of maintaining backward compatibility. This though is not 
the fault of the Unicode class itself and supporting packages.

Some modules like the one for CSV are lacking full Unicode support. 
But nevertheless the basic Python would always give you the 
possibility to use Unicode in (at least) a consistent way. For me 
raising exceptions does count as basic support like this.

So I still hope to see this solved for the 2.x versions which I read 
will be maintained even after the release of 3.0.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2517
__



[issue2517] Error when printing an exception containing a Unicode string

2008-03-30 Thread Christoph Burgmer

New submission from Christoph Burgmer [EMAIL PROTECTED]:

Python seems to have problems when an exception is thrown that 
contains non-ASCII text as a message and is converted to a string.

>>> try:
...     raise Exception(u'Error when printing ü')
... except Exception, e:
...     print e
...
Traceback (most recent call last):
  File "<stdin>", line 4, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in 
position 20:
ordinal not in range(128)

See 
http://www.stud.uni-karlsruhe.de/~uyhc/de/content/python-and-exceptions-containing-unicode-messages

--
components: Unicode
messages: 64770
nosy: christoph
severity: normal
status: open
title: Error when printing an exception containing a Unicode string
type: behavior
versions: Python 2.4, Python 2.5

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2517
__