[issue1771381] bsddb can't use unicode keys

2007-08-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Unassigning since I don't know the details of bsddb. -- assignee: lemburg -> (none) -- Tracker http://bugs.python.org/issue1771381

[issue225476] Codec naming scheme and aliasing support

2007-08-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Closing this request as the encodings package search function should not be used to import external codecs (this poses a security risk). -- status: open -> closed -- Tracker http://bugs.python.org

[issue880951] ez format code for ParseTuple()

2007-08-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Closing. There doesn't seem to be much interest in this. -- status: open -> closed -- Tracker http://bugs.python.org/issue880951

[issue1001895] Adding missing ISO 8859 codecs, especially Thai

2007-08-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Not sure why this is still open. The patches were checked in a long time ago. -- status: open -> closed -- Tracker http://bugs.python.org/issue1001895

[issue547537] cStringIO should provide a binary option

2007-08-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Unassigning: I've never had a need for this in the past years. -- assignee: lemburg -> (none) -- Tracker http://bugs.python.org/issue547537

[issue883466] quopri encoding Unicode

2007-08-30 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Georg: Yes, sure. -- Tracker http://bugs.python.org/issue883466

[issue1528802] Turkish Character

2007-08-30 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Unassigning this. Unless someone provides a patch to add context sensitivity to the Unicode upper/lower conversions, I don't think anything will change. The mapping you see in Python (for Unicode) is taken straight from the Unicode database and there's
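As a rough illustration of the context-insensitive mappings mentioned in this comment, the following sketch uses Python 3 (an assumption; the thread predates it) and only the simple cases. The variable names are made up for the example.

```python
import unicodedata

# Locale-independent case mappings, as taken from the Unicode database.
pairs = {
    "I": "I".lower(),            # LATIN CAPITAL LETTER I
    "i": "i".upper(),            # LATIN SMALL LETTER I
    "\u0131": "\u0131".upper(),  # LATIN SMALL LETTER DOTLESS I
}

for source, mapped in pairs.items():
    print(unicodedata.name(source), "->", unicodedata.name(mapped))

# A Turkish-aware conversion would map "I" to the dotless i (U+0131);
# the built-in mapping does not look at language or locale.
turkish_expected = "\u0131"
python_result = "I".lower()
print(python_result == turkish_expected)
```

Getting the Turkish result would need a language-aware layer on top of these mappings, which is the kind of patch the comment asks for.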

[issue1071] unicode.translate() doesn't error out on invalid translation table

2007-09-01 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Nice idea, but why don't you use a dictionary iterator (PyDict_Next()) for the fixup? -- Tracker http://bugs.python.org/issue1071

[issue1071] unicode.translate() doesn't error out on invalid translation table

2007-09-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Ah, I hadn't noticed that you're actually manipulating the input dictionary. You should create a copy and fix that instead of changing the dict that the user passed in to the function. You can then use PyDict_Next() for fast iteration over the original

[issue1505257] winerror module

2007-09-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: The winerror module should really be coded in C. Otherwise you don't benefit from the lookup object approach. The files I uploaded only serve as a basis for such a C module. Would be great if you could find someone to write such a module - preferably using

[issue1082] platform system may be Windows or Microsoft since Vista

2007-09-17 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: A couple of notes: * platform.uname() needs to be fixed, not the individual query functions. * The third entry of uname() should return Vista instead of Microsoft on MS Vista. * A patch should go on trunk and into 2.5.2, since this is a real bug

[issue1082] platform system may be Windows or Microsoft since Vista

2007-09-18 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Pat, we already have system_alias() for exactly the purpose you suggested. Software relying on platform.system() reporting Vista will have to use Python 2.5.2 as minimum Python version system requirement - pretty much the same as with all other bug fixes
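For reference, a minimal sketch of calling platform.system_alias(), which takes the system, release and version strings and returns a possibly adjusted triple. The sample input values are made up, and which names get remapped depends on the Python version, so no particular aliasing is claimed here.

```python
import platform

# Values as they might come back from platform.uname(); these are sample inputs.
system, release, version = "Linux", "5.0", "#1"

# system_alias() returns a (system, release, version) tuple, mapping some
# confusing marketing names to more common ones and leaving the rest alone.
aliased = platform.system_alias(system, release, version)
print(aliased)

# Applied to the running interpreter's own data:
current = platform.uname()
print(platform.system_alias(current.system, current.release, current.version))
```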

[issue10459] missing character names in unicodedata (CJK...)

2010-11-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: Marc-Andre: Many of the characters you refer actually do have names assigned, even if the names don't appear in the Unicode character database. Instead

[issue10466] locale.py resetlocale throws exception on Windows (getdefaultlocale returns value not usable in setlocale)

2010-11-22 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: I think that's a bug in the resetlocale() API. The correct way to reset the locale setting to defaults is to use setlocale(category, ""). The other issue here is that getlocale() appears to return non-ISO language codes on Windows

[issue10435] Document unicode C-API in reST

2010-11-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Wed, Nov 17, 2010 at 5:20 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. -/* Encodes a Unicode object and returns

[issue10466] locale.py resetlocale throws exception on Windows (getdefaultlocale returns value not usable in setlocale)

2010-11-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: R. David Murray wrote: R. David Murray rdmur...@bitdance.com added the comment: I had a report from a user on IRC during the bug weekend that they could not reproduce the failure on windows. So it may be dependent on the windows

[issue10466] locale.py resetlocale throws exception on Windows (getdefaultlocale returns value not usable in setlocale)

2010-11-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Marc-Andre Lemburg m...@egenix.com added the comment: R. David Murray wrote: R. David Murray rdmur...@bitdance.com added the comment: I had a report from a user on IRC during the bug weekend that they could

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: New submission from Alexander Belopolsky belopol...@users.sourceforge.net: >>> 'xyz'.center(20, '\U00100140') Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: The fill character

[issue10524] Patch to add Pardus to supported dists in platform

2010-11-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Looks good. BTW: What is Pardus? -- Python tracker rep...@bugs.python.org http://bugs.python.org/issue10524

[issue10524] Patch to add Pardus to supported dists in platform

2010-11-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Eric Smith wrote: Eric Smith e...@trueblade.com added the comment: The patch name has 2.7 in it, although Versions says 3.2. As this is a feature request, it can't be added to 2.7. I consider missing distros in the list of supported

[issue10524] Patch to add Pardus to supported dists in platform

2010-11-25 Thread Marc-Andre Lemburg
Changes by Marc-Andre Lemburg m...@egenix.com: -- stage: -> commit review; type: feature request -> behavior; versions: +Python 2.7 -- Python tracker rep...@bugs.python.org http://bugs.python.org/issue10524

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-11-27 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Raymond Hettinger wrote: Raymond Hettinger rhettin...@users.sourceforge.net added the comment: Mark, can you opine on this? Yes, I'll have a look later today.

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-11-27 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: I like the idea and thanks for putting work into this. Some comments: * when using macro variables, always put the variables in parens in the expansion; this avoids precedence issues, weird syntax errors, etc. - even if it may

[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Attached patch addresses the issue by using -1 instead of None for missing codes. Comparison of generated encoding files to those

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-11-27 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Sat, Nov 27, 2010 at 5:03 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. * same for the Py_UNICODE_NEXT() macro, i.e

[issue10557] Malformed error message from float()

2010-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Dickinson wrote: Mark Dickinson dicki...@gmail.com added the comment: About Alexander's solution: might it make more sense to have PyUnicode_EncodeDecimal raise for inputs like this? I see it as PyUnicode_EncodeDecimal's job

[issue10557] Malformed error message from float()

2010-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: >>> float('½') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: could not convert string to float: � >>> float('42½') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError Note

[issue10567] Unicode space character \u200b unrecognised a space

2010-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: In 2.6, there was a manually maintained list, probably dating back to before Unicode 4.0. That's not quite correct: Python 1.6.x - 2.5.x used tables

[issue10567] Unicode space character \u200b unrecognised a space

2010-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: It is still strange that the .isspace() property value changed, since the code point has not changed in the recent Unicode versions: 4.1.0: 200B;ZERO WIDTH SPACE;Cf;0;BN;N; 5.1.0: 200B;ZERO WIDTH SPACE;Cf;0;BN;N; 5.2.0: 200B

[issue10567] Unicode space character \u200b unrecognised a space

2010-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Going back further shows the change: 3.0.1: 200B;ZERO WIDTH SPACE;Zs;0;BN;N; 3.2.0: 200B;ZERO WIDTH SPACE;Zs;0;BN;N; 4.0.1: 200B;ZERO WIDTH SPACE;Cf;0;BN;N; 4.1.0: 200B;ZERO WIDTH SPACE;Cf;0;BN;N; 5.1.0: 200B
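The category change shown above can be checked against whatever Unicode database ships with the running interpreter. This is a small sketch assuming Python 3; the variable names are illustrative only.

```python
import unicodedata

zwsp = "\u200b"  # ZERO WIDTH SPACE

name = unicodedata.name(zwsp)
category = unicodedata.category(zwsp)    # general category (Zs up to 3.2.0, Cf from 4.0.1)
bidi = unicodedata.bidirectional(zwsp)   # bidirectional class
is_space = zwsp.isspace()

# unidata_version reports which Unicode database the interpreter was built with.
print(unicodedata.unidata_version, name, category, bidi, is_space)
```

With a database at Unicode 4.0.1 or later, the character is a format character rather than a space separator, which is why .isspace() no longer reports it as whitespace.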

[issue10557] Malformed error message from float()

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: After a bit of svn archeology, it does appear that Arabic-Indic digits' support was deliberate at least in the sense that the feature

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg m...@egenix.com: The script only patches numeric data into the table (field 8), but does not update the digit field (field 7). As a result, ideographs used for Chinese digits are not recognized as digits and not evaluated by int(), long() and float

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: The code point is also not listed as decimal digit (relevant for the int() decimal parsing): >>> unicodedata.decimal(unicode('三', 'utf-8')) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: not a decimal
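The difference between the numeric, digit and decimal properties for U+4E09 can be sketched as follows. This assumes Python 3 (the session quoted above is Python 2), and lookup() is a hypothetical helper written for this example, not part of unicodedata.

```python
import unicodedata

san = "\u4e09"  # CJK ideograph for "three" (U+4E09)

def lookup(func, ch):
    """Return func(ch), or None when the database has no value for ch."""
    try:
        return func(ch)
    except ValueError:
        return None

numeric_value = lookup(unicodedata.numeric, san)  # Unihan numeric value
digit_value = lookup(unicodedata.digit, san)      # digit property
decimal_value = lookup(unicodedata.decimal, san)  # decimal digit property

print(numeric_value, digit_value, decimal_value)
```

Only the numeric value is populated for this code point, so it counts as numeric but is not accepted where a decimal digit is required.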

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Here's a quick overview of the fields that are set for U+4E09: http://www.fileformat.info/info/unicode/char/4e09/index.htm -- Python tracker rep...@bugs.python.org http://bugs.python.org

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: This is the definition of kPrimaryNumeric: http://ftp.lanet.lv/ftp/mirror/unicode/5.0.0/ucd/Unihan.html#kPrimaryNumeric -- Python tracker rep...@bugs.python.org http://bugs.python.org

[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: gencodec.py is only rarely used, namely when adding new codecs based on Unicode mapping files. It is not run regularly on the files from ftp.unicode.org and only updated on demand. AFAIK, it was last used on Python2 and never on Python3

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am adding #10552 as a dependency because I think we should fix unicode data generation in 3.x before adding new features

[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Mon, Nov 29, 2010 at 1:21 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. BTW: You appear to have a comma appended

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Mon, Nov 29, 2010 at 1:29 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. I consider this a bug (which is why I added

[issue10575] makeunicodedata.py does not support Unihan digit data

2010-11-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: This is not a bug, see http://www.unicode.org/reports/tr44/#Numeric_Value Characters have a Numeric_Type property of either null, Decimal, Digit

[issue10562] Change 'j' for imaginary unit into an 'i'

2010-12-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Dickinson wrote: Mark Dickinson dicki...@gmail.com added the comment: There should be an environment variable to make the symbol settable. That could work; it's a bit late to do this in 3.2, though. How about the following

[issue10557] Malformed error message from float()

2010-12-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am submitting a patch (issue10557b.diff) for commit review. As Marc suggested, decimal conversion is now performed on Py_UNICODE

[issue10562] Change 'j' for imaginary unit into an 'i'

2010-12-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Dickinson wrote: Mark Dickinson dicki...@gmail.com added the comment: In all seriousness, the idea of accepting both 'i' and 'j' in complex() isn't horrible. I'm personally -0.small on it, mostly because it seems likely to lead

[issue10610] Correct the float(), int() and complex() documentation

2010-12-02 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg m...@egenix.com: The Python3 documentation for these numeric constructors is wrong. Python has supported Unicode numerals specified as code points from the Unicode category Nd (decimal digit) since Python 1.6.0 when Unicode was first introduced in Python
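A short sketch of the behaviour the report describes, assuming Python 3: int() and float() accept digits from Unicode category Nd, not just ASCII 0-9. The sample strings are chosen for illustration.

```python
import unicodedata

arabic_indic = "\u0664\u0662"        # ARABIC-INDIC DIGIT FOUR, DIGIT TWO
devanagari = "\u0967\u0968.\u096b"   # DEVANAGARI DIGIT ONE, DIGIT TWO, '.', DIGIT FIVE

# Every digit used here belongs to category Nd (decimal digit).
categories = [unicodedata.category(ch) for ch in arabic_indic]

as_int = int(arabic_indic)
as_float = float(devanagari)
print(categories, as_int, as_float)
```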

[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am probably a bit late to this discussion, but why these things should be called codecs and why should they share the registry

[issue10610] Correct the float(), int() and complex() documentation

2010-12-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Should we also review the documentation for fractions and decimals? For example, fractions are documented as accepting strings

[issue10610] Correct the float(), int() and complex() documentation

2010-12-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Raymond Hettinger wrote: Raymond Hettinger rhettin...@users.sourceforge.net added the comment: Try not to sprawl this all over the docs. Find the most common root and document it there. No need to garbage-up Fractions, Decimal etc

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Sat, Nov 27, 2010 at 6:38 PM, Raymond Hettinger rep...@bugs.python.org wrote: .. I suggest Py_UNICODE_ADVANCE() to avoid false

[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: As per http://mail.python.org/pipermail/python-dev/2010-December/106374.html I think this checkin should be reverted, as it's breaking the language

[issue6697] Check that _PyUnicode_AsString() result is not NULL

2010-12-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am attaching a revised version of the patch which also includes some tests. Interestingly, the issue in syslog module

[issue6697] Check that _PyUnicode_AsString() result is not NULL

2010-12-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Tue, Dec 7, 2010 at 12:44 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. * Rather than just patching in error handling

[issue6697] Check that _PyUnicode_AsString() result is not NULL

2010-12-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Tue, Dec 7, 2010 at 1:11 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: I am not sure what you mean by a parser API

[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am attaching a patch for commit review. I added an underscore prefix to all new macros. This way I am not introducing new

[issue4819] Misc/cheatsheet needs updating

2011-01-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Updating the cheat sheet would be a great Summer of Code-style project. We are considering using the cheat sheet as the basis for a flyer in the PSF marketing material project. Please add it back and add a note to it that it currently

[issue4819] Misc/cheatsheet needs updating

2011-01-21 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Raymond Hettinger wrote: Raymond Hettinger rhettin...@users.sourceforge.net added the comment: Perhaps the cheatsheet can be transferred to a wiki page and we can put out a comp.lang.python call for updates. Good idea. I just want

[issue6203] 3.x locale does not default to C, contrary to the documentation and to 2.x behavior

2011-01-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Python can be embedded into other applications and unconditionally changing the locale (esp. the LC_CTYPE) is not good practice, since it's not thread-safe and affects the entire process. An application may have set LC_CTYPE (or the locale

[issue11022] locale.setlocale() doesn't change I/O codec, os.environ

2011-01-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: upon program startup, init LibC environment: setlocale(LC_ALL, ""); Python 3 does something like that: Py_InitializeEx() calls setlocale(LC_CTYPE

[issue6203] 3.x locale does not default to C, contrary to the documentation and to 2.x behavior

2011-01-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: A clean alternative would be adding LC_* variable parsing code to Python to avoid the setlocale() call altogether. That would be highly non-portable

[issue11167] Overflow in unicode_hash

2011-02-10 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Could you try the same in Python 2.7? The overflow is intended (after all, it's a hash function), but we should probably add a cast to Py_hash_t to the hash building line in order to make the compiler aware of this. -- nosy

[issue11173] Undocumented public APIs in Python 3.2

2011-02-10 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg m...@egenix.com: Mark Shannon on python-dev: The following API functions were removed from 3.1.3: PyAST_Compile PyCObject_AsVoidPtr PyCObject_FromVoidPtr PyCObject_FromVoidPtrAndDesc PyCObject_GetDesc PyCObject_Import PyCObject_SetVoidPtr

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: I am not sure PyUnicode_Decode() should treat NULL as an empty string. Definitely not. That would hide programming errors. -- nosy: +lemburg title: Some trivial python 2.x pickles fails to load

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: I am not sure PyUnicode_Decode() should treat NULL as an empty string. Definitely not. That would hide programming errors. Well, this could break some third

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: I am not sure PyUnicode_Decode() should treat NULL as an empty string. Definitely

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: PyUnicode_Decode() et al. are conversion functions and these require valid content to work on. Passing in a NULL pointer does not fit that specification and so

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Please go with Alexander's solution of fixing the higher level code rather than silently trying to introduce a new feature in PyUnicode_Decode() that hides programming errors. Thanks

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Jesús Cea Avión wrote: Jesús Cea Avión j...@jcea.es added the comment: What if we commit Antoine's patch for 3.2.x, and the correct patch for py3k trunk? I am actually +1 to Marc-Andre. I feel in my guts that the provided patch

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: In issue11303.diff, I add similar optimization for encode('latin1') and for 'utf8' variant of utf-8. I don't think dash-less

[issue5902] Stricter codec names

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: What is the status of this? Status=open and Resolution=rejected contradict each other. Sorry, forgot to close the ticket

[issue5902] Stricter codec names

2011-02-24 Thread Marc-Andre Lemburg
Changes by Marc-Andre Lemburg m...@egenix.com: -- status: open -> closed -- Python tracker rep...@bugs.python.org http://bugs.python.org/issue5902

[issue5902] Stricter codec names

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Accepting all common forms for encoding names means that you can usually give Python an encoding name from, e.g. a HTML page, or any

[issue5902] Stricter codec names

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Ezio and I discussed on IRC the implementation of alias lookup and neither of us was able to point out to the function that strips

[issue11309] #include wctype.h in Objects/unicodetype_db.h and Objects/unicodectype.c

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Дилян Палаузов wrote: New submission from Дилян Палаузов dilyan.palau...@aegee.org: As of python 2.7.1 configured with --enable-ipv6 --enable-unicode --with-system-expat --with-system-ffi --with-signal-module --with-threads

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Thu, Feb 24, 2011 at 10:30 AM, Ezio Melotti rep...@bugs.python.org wrote: .. See also discussion on #5902. Mark has closed

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: As promised, here's the list of places where the wrong Latin-1 encoding spelling is used: Lib//test/test_cmd_line.py: -- for encoding in ('ascii', 'latin1', 'utf8'): Lib//test/test_codecs.py: -- ef = codecs.EncodedFile(f

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: I think that the normalization function in unicodeobject.c (only used for internal functions) can skip any character different than a-z, A-Z and 0-9

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. On this ticket, we're discussing just one

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: Ooops, I attached the wrong patch. Here is the new fixed patch. That won't work, Victor, since it makes invalid encoding names valid, e.g. 'utf(=)-8

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. I think rather than removing any hyphens

[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Thu, Feb 24, 2011 at 3:54 PM, Antoine Pitrou rep...@bugs.python.org wrote: .. I've committed the part of the patch which

[issue11313] Speed up default encode()/decode()

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: New submission from Alexander Belopolsky belopol...@users.sourceforge.net: In Python 3.x default encoding is always utf-8, but encode()/decode() still try to look it up. Attached patch eliminates a call

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: r88586: Normalized the encoding names for Latin-1 and UTF-8 to 'latin-1' and 'utf-8' in the stdlib. -- Python tracker rep...@bugs.python.org http://bugs.python.org/issue11303

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: I think we should reset this whole discussion and just go with Alexander's original patch issue11303.diff. I don't know who changed the encodings package's normalize_encoding() function (wasn't me), but it's a really slow implementation

[issue11322] encoding package's normalize_encoding() function is too slow

2011-02-25 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg m...@egenix.com: I don't know who changed the encodings package's normalize_encoding() function (wasn't me), but it's a really slow implementation. The original version used the .translate() method which is a lot faster and can be adapted to work
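To make the role of normalize_encoding() concrete, here is a small sketch (Python 3 assumed) showing how several spellings of the same encoding name are normalized and then resolved to one codec. It does not measure speed, and the list of spellings is made up for the example.

```python
import codecs
from encodings import normalize_encoding

spellings = ["latin-1", "latin1", "Latin_1", "utf-8", "utf8", "UTF-8"]

for spelling in spellings:
    normalized = normalize_encoding(spelling)  # punctuation runs collapsed to "_"
    canonical = codecs.lookup(spelling).name   # name reported by the resolved codec
    print(f"{spelling!r:10} -> {normalized!r:10} -> {canonical!r}")
```

Spellings that are not matched by a shortcut in the C code go through this normalization and the alias table on first lookup, which is where the cost discussed in these threads comes from.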

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: I don't know who changed the encodings package's normalize_encoding() function (wasn't me), but it's a really slow implementation. The original version used the .translate() method which is a lot faster. I

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: r88586: Normalized the encoding names for Latin-1 and UTF-8 to 'latin-1' and 'utf-8' in the stdlib. Why did you do that? We are trying to find

[issue11322] encoding package's normalize_encoding() function is too slow

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: We should first implement the same algorithm of the 3 normalization functions and add tests for them (at least for the function in normalization

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: I guess you could regard the wrong encoding name use as a bug - it slows down several stdlib modules for no apparent reason. If you agree, Raymond, I'll backport the patch. -- title: b'x'.decode('latin1') is much slower than

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Marc-Andre Lemburg m...@egenix.com added the comment: I guess you could regard the wrong encoding name use as a bug - it slows down several stdlib modules for no apparent reason. If you agree, Raymond, I'll

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-26 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Raymond Hettinger wrote: Raymond Hettinger rhettin...@users.sourceforge.net added the comment: If you agree, Raymond, I'll backport the patch. Yes. That will address Antoine's legitimate concern about making other backports harder

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2011-02-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Ezio Melotti wrote: Ezio Melotti ezio.melo...@gmail.com added the comment: The patch turned out to be less trivial than I initially thought. The current algorithm checks for invalid continuation bytes in 4 places: 1) before

[issue1276] LookupError: unknown encoding: X-MAC-JAPANESE

2007-10-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: My name appears in that Makefile because I wrote it and used it to create the charmap codecs. The reason why the Mac Japanese codec was not created for 2.x was the size of the mapping table. Ideal would be to have the C version of the CJK codecs support

[issue1276] LookupError: unknown encoding: X-MAC-JAPANESE

2007-10-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Adding Python 2.6 as version target. -- versions: +Python 2.6 Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1276

[issue1399] XML codec

2007-11-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Nice codec ! The only nit I have is the name: xml isn't intuitive enough. I had to read the code to figure out what the codec actually does. xml used as an encoding usually refers to having Unicode text converted to ASCII with XML entity escapes for all non
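
To illustrate the other meaning of "xml as an encoding" described above (Unicode text converted to ASCII with XML escapes), here is a small sketch using Python's built-in 'xmlcharrefreplace' error handler. This is not the codec proposed in the issue, and the helper name to_ascii_xml is hypothetical.

```python
def to_ascii_xml(text: str) -> bytes:
    """Encode text as ASCII, writing each non-ASCII character as an
    XML numeric character reference such as &#252;."""
    return text.encode("ascii", "xmlcharrefreplace")


if __name__ == "__main__":
    print(to_ascii_xml("Grüße"))
```

Note that the handler only escapes characters the target encoding cannot represent; it does not escape markup characters such as < or &.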

[issue1399] XML codec

2007-11-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Leaving the module name as xml would remove that name from the namespace of possible encodings. xml as encoding name is problematic, as many people regard writing data in XML as encoding the data in XML. I'd simply not use it at all, not even for a codec

[issue1399] XML codec

2007-11-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Thanks, Walter ! Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1399

[issue1234] semaphore errors on AIX 5.2

2007-11-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: The problem is also present in Python 2.4 and 2.3. Confirmed on AIX 5.3. -- nosy: +lemburg versions: +Python 2.3, Python 2.4 Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1234

[issue1433] marshal roundtripping for unicode

2007-11-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I think you have a wrong understanding of round-tripping. In Unicode it is really irrelevant if you're using a UCS2 surrogate pair or a UCS4 representation to describe a code point. The length of the Unicode representation may change, but the meaning won't
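
The point about surrogate pairs can be sketched in current Python 3 as follows. This is an illustration only, not the marshal code under discussion; the helper name utf16_code_units is hypothetical. It shows one code point outside the BMP represented either as two UTF-16 code units or as a single UTF-32 unit, with both decoding back to the same text.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    ch = "\U00010400"  # a single code point outside the BMP

    # Two code units (a surrogate pair) in UTF-16 ...
    print([hex(u) for u in utf16_code_units(ch)])

    # ... one 32-bit unit in UTF-32, yet both decode to the same string.
    via_utf16 = ch.encode("utf-16-be").decode("utf-16-be")
    via_utf32 = ch.encode("utf-32-be").decode("utf-32-be")
    print(via_utf16 == via_utf32 == ch)
```

The number of storage units differs between the two forms, but the code point they describe does not.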

[issue1620174] Improve platform.py usability on Windows

2007-11-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Rejecting the patch, since it hasn't been updated. -- resolution: -> rejected status: open -> closed Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1620174

[issue1514] missing constants in socket module

2007-11-30 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Interesting. It appears as if r57142 caused this change. Before: # if !(defined(__BEOS__) || defined(__CYGWIN__) || (defined(PYOS_OS2) && defined(PYCC_VACPP))) After: # if defined(__CYGWIN__) || (defined(PYOS_OS2) && defined(PYCC_VACPP)) That change

[issue1514] missing constants in socket module

2007-11-30 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Doesn't this bug report also refer to Python 2.5 and 2.6 ? Tracker [EMAIL PROTECTED] http://bugs.python.org/issue1514
