New issue 3044: py3.6: str.encode should not work with not-text encoders
https://bitbucket.org/pypy/pypy/issues/3044/py36-strencode-should-not-work-with-not

Zsolt Cserna:

With CPython, using a non-text encoder such as “hex”, the following happens:

```
>>> "foo".encode("hex")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs
```

In PyPy this call reaches the codec itself, where `TypeError` is raised (which 
is correct, as this encoding works on bytes, not on unicode):

```
>>>> "foo".encode("hex")
Traceback (most recent call last):
  File "/home/zsolt/src/pypy/lib-python/3/encodings/hex_codec.py", line 15, in hex_encode
    return (binascii.b2a_hex(input), len(input))
TypeError: a bytes-like object is required, not str
```
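
For comparison, here is a small sketch of how a bytes-to-bytes codec such as 
“hex” is meant to be used, through `codecs.encode()` / `codecs.decode()` on a 
bytes object. The variable names are only illustrative:

```python
import codecs

# "hex" is a bytes-to-bytes codec, so it is given bytes, not str,
# and it goes through codecs.encode() rather than str.encode().
hex_bytes = codecs.encode(b"foo", "hex")

# Decoding with the same codec returns the original bytes.
round_trip = codecs.decode(hex_bytes, "hex")

print(hex_bytes, round_trip)
```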

The root cause of the problem is that `str.encode` does not call the 
`lookup_text_codec()` function when looking up a codec. As a result, the 
CodecInfo's `_is_text_encoding` is never checked, because `str.encode` uses 
`codecs.encode` under the hood.

The solution would be to add this check somewhere between `str.encode` and 
the call to the encoder.
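
A rough pure-Python sketch of what such a check could look like follows. This 
is not the actual PyPy code: `lookup_text_codec_sketch` and `str_encode_sketch` 
are hypothetical names made up for illustration, and the sketch assumes the 
private `_is_text_encoding` attribute is present on the `CodecInfo` returned by 
`codecs.lookup()` (it defaults to treating the codec as text if it is missing).

```python
import codecs


def lookup_text_codec_sketch(encoding, alternate_command):
    """Hypothetical helper: look up a codec and reject non-text encodings."""
    info = codecs.lookup(encoding)
    # _is_text_encoding is a private CodecInfo attribute; assume "text"
    # when a codec does not define it.
    if not getattr(info, "_is_text_encoding", True):
        raise LookupError(
            "%r is not a text encoding; use %s to handle arbitrary codecs"
            % (encoding, alternate_command)
        )
    return info


def str_encode_sketch(text, encoding="utf-8", errors="strict"):
    """Hypothetical stand-in for str.encode with the check added."""
    info = lookup_text_codec_sketch(encoding, "codecs.encode()")
    # The encoder is only reached once the codec is known to be a text codec.
    data, _consumed = info.encode(text, errors)
    return data
```

With this in place, `str_encode_sketch("foo", "hex")` fails with `LookupError` 
before the codec is ever invoked, while text encodings such as "utf-8" behave 
as before.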


