Marc-Andre Lemburg added the comment:

On 16.11.2013 10:16, Nick Coghlan wrote:
> 
> Nick Coghlan added the comment:
> 
> The full input/output type specifications can't be implemented sensibly 
> without also defining at least a ByteSequence ABC. While I think it's a good 
> idea in the long run, there's no feasible way to design such a system in the 
> time remaining before the Python 3.4 feature freeze.
> 
> However, we could do something much simpler as a blacklist API:
> 
>     def is_unicode_codec(name):
>         """Returns true if this is the name of a known Unicode text encoding"""
> 
>     def set_as_non_unicode(name):
>         """Indicates that the named codec is not a Unicode codec"""
> 
> And then the codecs module would just maintain a set internally of all the 
> names explicitly flagged as non-unicode.

That doesn't look flexible enough to cover the various different
input/output types.

> Such an API remains useful even if the input/output type support is added in 
> Python 3.5 (since "codecs.is_unicode_codec(name)" is a bit simpler thing to 
> explain than the exact type restrictions).
> 
> Alternatively, implementing just the "encodes_to" and "decodes_to" attributes 
> would be enough for str.encode, bytes.decode and bytearray.decode to reject 
> known bad encodings early, leaving the input type checks to the codecs for 
> now (since it is correctly defining "encode_from" and "decode_from" for many 
> stdlib codecs that would need the ByteSequence ABC).

The original idea we discussed some time ago was to add a mapping
or list attribute to CodecInfo which lists all supported type
combinations.

The codecs module could then make this information available through
a simple type check API (which also caches the lookups for performance
reasons), e.g.

codecs.types_supported(encoding, input_type, output_type) -> boolean.

    Returns True/False depending on whether the codec for
    encoding supports the given input and output types.

Usage:

if not codecs.types_supported(encoding, str, bytes):
    # not a Unicode -> 8-bit codec
    ...
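
As a rough sketch of how such a check could behave (not an existing
codecs API): CodecInfo has no type information today, so the
_SUPPORTED_TYPES table below is a hypothetical stand-in for the proposed
attribute, and its entries are illustrative, covering the encode
direction only. The lru_cache decorator plays the role of the lookup
cache mentioned above.

```python
import codecs
from functools import lru_cache

# Hypothetical stand-in for the proposed CodecInfo attribute: maps a
# codec's canonical name to its supported (input, output) type pairs.
# Entries are illustrative and cover the encode direction only.
_SUPPORTED_TYPES = {
    codecs.lookup(name).name: combos
    for name, combos in [
        ("utf-8", {(str, bytes)}),
        ("latin-1", {(str, bytes)}),
        ("rot_13", {(str, str)}),
        ("hex_codec", {(bytes, bytes)}),
    ]
}


@lru_cache(maxsize=None)
def types_supported(encoding, input_type, output_type):
    """Return True if the codec is known to support the type pair.

    Raises LookupError for unknown encodings, as codecs.lookup() does.
    Codecs missing from the table are reported as unsupported.
    """
    # codecs.lookup() resolves aliases such as "utf8" to one canonical name.
    canonical = codecs.lookup(encoding).name
    return (input_type, output_type) in _SUPPORTED_TYPES.get(canonical, set())


if not types_supported("rot_13", str, bytes):
    print("rot_13 is not a Unicode -> 8-bit codec")
```

Since the name goes through codecs.lookup() first, aliases such as
"utf8" and "UTF-8" give the same answer.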

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19619>
_______________________________________