[Python-Dev] Restoring the aliases for the non-Unicode codecs

2013-11-13 Thread Nick Coghlan
Back in Python 3.2, the non-Unicode codecs were restored to the
standard library, but without the associated aliases (mostly due to
some thoroughly confusing error messages when they were mistakenly
used with the Unicode encoding convenience methods).

The long gory history in http://bugs.python.org/issue7475 took a
different turn earlier this year when I noticed
(http://bugs.python.org/issue7475#msg187698) that the codecs module
already *had* type neutral helper functions in the form of
codecs.encode and codecs.decode, and has had them since Python 2.4.
These were covered in the test suite, but not in the documentation.
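
As a rough illustration of the type neutral helpers mentioned above, the
following sketch calls codecs.encode and codecs.decode with the full codec
names (hex_codec, base64_codec, rot_13). It assumes Python 3; the sample
values and variable names are only for illustration.

```python
import codecs

# bytes -> bytes: the binary transforms take and return bytes.
hex_bytes = codecs.encode(b"abc", "hex_codec")
b64_bytes = codecs.encode(b"hello", "base64_codec")

# str -> str: rot_13 is a text transform, so both sides are str.
rotated = codecs.encode("hello", "rot_13")

# Each transform round-trips through codecs.decode().
hex_back = codecs.decode(hex_bytes, "hex_codec")
b64_back = codecs.decode(b64_bytes, "base64_codec")
rot_back = codecs.decode(rotated, "rot_13")

print(hex_bytes)  # b'616263'
print(rotated)    # uryyb
```

Neither call goes through str.encode() or bytes.decode(), so the input and
output types are whatever the codec itself defines.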

That realisation substantially changed my perspective on the issue,
since it was no longer a matter of adding a new API, but of better
documenting and facilitating the use of one we already had (and one that
has been supported all the way back to Python 2.4, making it easy to use
in single-source Python 2/3 projects as well). Since then, three key
supporting issues have been addressed for Python 3.4:

* codecs.encode() and codecs.decode() have been documented in 2.7, 3.3
and default (http://bugs.python.org/issue17827)
* codec errors have been updated to incorporate the name of the codec
whenever feasible, and output type errors in Unicode encoding
convenience methods refer users to the type neutral object-object
convenience functions (http://bugs.python.org/issue17828)
* the especially confusing errors from the base64 codecs have been
eliminated by updating the base64 module to use memoryview to process
binary input rather than explicit typechecks on builtin types
(http://bugs.python.org/issue17839)

So, with those underlying issues resolved, I would now like to restore
the aliases for the non-Unicode codecs that were removed in
http://bugs.python.org/issue10807 (aliases) and
http://bugs.python.org/issue17841 (docs).
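
A minimal sketch of what the restored aliases would allow, assuming a
Python version where they are available (3.4 or later): the short name and
the full codec name should select the same codec. The ALIAS_PAIRS list and
the sample data are illustrative, not an exhaustive list of the aliases.

```python
import codecs

data = b"non-Unicode codecs"

# (short alias, full codec name) pairs for some binary transforms.
ALIAS_PAIRS = [
    ("base64", "base64_codec"),
    ("hex", "hex_codec"),
    ("zlib", "zlib_codec"),
]

alias_matches = {}
for alias, full_name in ALIAS_PAIRS:
    via_alias = codecs.encode(data, alias)
    via_full = codecs.encode(data, full_name)
    # Both spellings should produce identical output and round-trip.
    alias_matches[alias] = (
        via_alias == via_full and codecs.decode(via_alias, alias) == data
    )

# The text transform has a short alias as well.
rot_matches = codecs.encode("abc", "rot13") == codecs.encode("abc", "rot_13")

print(alias_matches, rot_matches)
```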

I also looked into the possibility of providing appropriate 2to3
fixers, but the data driven nature of the problem makes that
impractical. However, the presence of codecs.decode and codecs.encode
in Python 2.4+ makes a new Py3k warning in 2.7.7 a viable option:
http://bugs.python.org/issue19543

Regards,
Nick.

P.S. Until the next docs rebuild, you can see a summary of the
non-Unicode codec handling improvements in the What's New diff here:
http://hg.python.org/cpython/rev/854a2cea31b9


-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Restoring the aliases for the non-Unicode codecs

2013-11-13 Thread M.-A. Lemburg
On 13.11.2013 15:29, Nick Coghlan wrote:
> Back in Python 3.2, the non-Unicode codecs were restored to the
> standard library, but without the associated aliases (mostly due to
> some thoroughly confusing error messages when they were mistakenly
> used with the Unicode encoding convenience methods).
>
> The long gory history in http://bugs.python.org/issue7475 took a
> different turn earlier this year when I noticed
> (http://bugs.python.org/issue7475#msg187698) that the codecs module
> already *had* type neutral helper functions in the form of
> codecs.encode and codecs.decode, and has had them since Python 2.4.
> These were covered in the test suite, but not in the documentation.
>
> That realisation substantially changed my perspective on the issue,
> since it was no longer a matter of adding a new API, but of better
> documenting and facilitating the use of one we already had (and one that
> has been supported all the way back to Python 2.4, making it easy to use
> in single-source Python 2/3 projects as well). Since then, three key
> supporting issues have been addressed for Python 3.4:
>
> * codecs.encode() and codecs.decode() have been documented in 2.7, 3.3
> and default (http://bugs.python.org/issue17827)
> * codec errors have been updated to incorporate the name of the codec
> whenever feasible, and output type errors in Unicode encoding
> convenience methods refer users to the type neutral object-object
> convenience functions (http://bugs.python.org/issue17828)
> * the especially confusing errors from the base64 codecs have been
> eliminated by updating the base64 module to use memoryview to process
> binary input rather than explicit typechecks on builtin types
> (http://bugs.python.org/issue17839)
>
> So, with those underlying issues resolved, I would now like to restore
> the aliases for the non-Unicode codecs that were removed in
> http://bugs.python.org/issue10807 (aliases) and
> http://bugs.python.org/issue17841 (docs).

+1 and thanks for your work on this.

> I also looked into the possibility of providing appropriate 2to3
> fixers, but the data driven nature of the problem makes that
> impractical. However, the presence of codecs.decode and codecs.encode
> in Python 2.4+ makes a new Py3k warning in 2.7.7 a viable option:
> http://bugs.python.org/issue19543
>
> Regards,
> Nick.
>
> P.S. Until the next docs rebuild, you can see a summary of the
> non-Unicode codec handling improvements in the What's New diff here:
> http://hg.python.org/cpython/rev/854a2cea31b9

-- 
Marc-Andre Lemburg
eGenix.com



[Python-Dev] Restoring the aliases for the non-Unicode codecs

2013-11-13 Thread Stephen J. Turnbull
Nick Coghlan writes:

> The long gory history in http://bugs.python.org/issue7475 took a
> different turn earlier this year when I noticed
> (http://bugs.python.org/issue7475#msg187698) that the codecs module
> already *had* type neutral helper functions in the form of
> codecs.encode and codecs.decode, and has had them since Python 2.4.
> These were covered in the test suite, but not in the documentation.

As long as the agreement on _methods_ (documented in
http://bugs.python.org/issue7475#msg96240) isn't changed, I'm +1 on
this.



Re: [Python-Dev] Restoring the aliases for the non-Unicode codecs

2013-11-13 Thread Nick Coghlan
On 14 November 2013 01:43, Stephen J. Turnbull step...@xemacs.org wrote:
> Nick Coghlan writes:
>
> > The long gory history in http://bugs.python.org/issue7475 took a
> > different turn earlier this year when I noticed
> > (http://bugs.python.org/issue7475#msg187698) that the codecs module
> > already *had* type neutral helper functions in the form of
> > codecs.encode and codecs.decode, and has had them since Python 2.4.
> > These were covered in the test suite, but not in the documentation.
>
> As long as the agreement on _methods_ (documented in
> http://bugs.python.org/issue7475#msg96240) isn't changed, I'm +1 on
> this.

Yeah, those have actually been updated to point to the type neutral functions:

>>> "bad output".encode("rot_13")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'rot_13' encoder returned 'str' instead of 'bytes'; use
codecs.encode() to encode to arbitrary types

>>> b"bad output".decode("quopri_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'quopri_codec' decoder returned 'bytes' instead of 'str';
use codecs.decode() to decode to arbitrary types
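
For comparison, here is a small sketch of the calls those messages point
to. The exact exception type raised by the convenience methods has varied
between Python versions, so the sketch catches both TypeError and
LookupError; the quoted-printable input is just an illustrative value.

```python
import codecs

# The convenience method refuses a str -> str codec.
try:
    "bad output".encode("rot_13")
    refused = False
except (TypeError, LookupError):
    refused = True

# str -> str transform through the type neutral function.
rotated = codecs.encode("bad output", "rot_13")

# bytes -> bytes transform, likewise; "=20" is an encoded space.
raw = codecs.decode(b"bad=20output", "quopri_codec")

print(refused, rotated, raw)
```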

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia