[issue20420] BufferedIncrementalEncoder violates IncrementalEncoder interface

2021-12-09 Thread Irit Katriel


Change by Irit Katriel :


--
components: +Library (Lib)
versions: +Python 3.10, Python 3.11, Python 3.9 -Python 2.7, Python 3.3, Python 3.4

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20420] BufferedIncrementalEncoder violates IncrementalEncoder interface

2015-01-17 Thread Martin Panter

Martin Panter added the comment:

For what it’s worth, both io.TextIOWrapper and _pyio.TextIOWrapper appear to 
only ever call IncrementalEncoder.setstate(0). And the newline _decoder_ is not 
relevant because it doesn’t use any _encoder_.

--
nosy: +vadmium

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20420
___



[issue20420] BufferedIncrementalEncoder violates IncrementalEncoder interface

2014-07-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

IncrementalNewlineDecoder requires that the decoder state be an integer (the C 
implementation requires at most a 63-bit unsigned integer). TextIOWrapper 
requires that the decoder state be at most a 64-bit unsigned integer (only 
63-bit if universal newlines is enabled).
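
The bit that universal newlines takes away can be observed from Python. This is a minimal sketch, assuming the UTF-8 incremental decoder as the wrapped decoder (any decoder that reports a (buffered_bytes, integer_flags) pair would do). It shows the plain decoder state and then the state reported by io.IncrementalNewlineDecoder, whose lowest bit records a pending "\r".

```python
import codecs
import io

# A plain incremental decoder reports (buffered_bytes, integer_flags).
utf8 = codecs.getincrementaldecoder("utf-8")()
utf8.decode(b"\xe2\x82")          # incomplete sequence, kept in the buffer
plain_state = utf8.getstate()

# Wrapping a decoder for universal newlines uses the low bit of the
# integer for a pending "\r", leaving one bit less for the wrapped decoder.
wrapped = io.IncrementalNewlineDecoder(
    codecs.getincrementaldecoder("utf-8")(), translate=True
)
text = wrapped.decode(b"a\r")     # the trailing "\r" is held back
wrapped_state = wrapped.getstate()

print(plain_state)
print(repr(text))
print(wrapped_state)
```

The second element of each tuple is the integer that the size limits above apply to.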




[issue20420] BufferedIncrementalEncoder violates IncrementalEncoder interface

2014-01-31 Thread Walter Dörwald

Walter Dörwald added the comment:

I dug up an ancient email about that subject:

 However, I've discovered that BufferedIncrementalEncoder.getstate()
 doesn't match the specification (i.e. it returns the buffer, not an
 int). However this class is unused (and probably useless, because it
 doesn't make sense to delay encoding the input). The simplest solution
 would be to simply drop the class.

 Sounds like a plan; go right ahead!

 Oops, there *is* one codec that uses it: The idna encoder. It buffers
 the input until a '.' is encountered (or encode() is called with
 final==True) and then encodes this part.

 Either the idna encoder encodes the unencoded input as an int, or we drop
 the specification that encoder.getstate() must return an int, or we
 change it to mirror the decoder specification (i.e. return a
 (buffered_input, additional_state_info) tuple).

 (A more radical solution would be to completely drop the incremental
 codecs for idna).

 Maybe we should wait and see how the implementation of writing turns out?
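
The first option in the quoted email (encoding the unencoded input as an int) could look roughly like the sketch below. The helper names buffer_to_int and int_to_buffer are hypothetical and are not part of the codecs module. The sketch assumes the buffered text is encodable as UTF-8, and it prefixes a 0x01 byte so that leading zero bytes survive the round trip.

```python
def buffer_to_int(buffer: str) -> int:
    """Pack buffered, unencoded text into an integer state (0 means empty)."""
    if not buffer:
        return 0
    raw = buffer.encode("utf-8")
    # The 0x01 sentinel keeps leading zero bytes from being lost.
    return int.from_bytes(b"\x01" + raw, "big")


def int_to_buffer(state: int) -> str:
    """Inverse of buffer_to_int()."""
    if state == 0:
        return ""
    raw = state.to_bytes((state.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")   # drop the sentinel byte


state = buffer_to_int("x")
print(state, repr(int_to_buffer(state)))
```

The resulting integer grows with the length of the buffered text, so it is not bounded to a fixed number of bits.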

And indeed the incremental encoder for idna behaves strangely:

>>> import io
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.tell()
0
>>> b.getvalue()
b''
>>> s.write('.')
1
>>> s.tell()
2
>>> b.getvalue()
b'x.'
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.seek(s.tell())
0
>>> s.write('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/walter/.local/lib/python3.3/codecs.py", line 218, in encode
    (result, consumed) = self._buffer_encode(data, self.errors, final)
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 246, in _buffer_encode
    result.extend(ToASCII(label))
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 73, in ToASCII
    raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long

The cleanest solution might be to switch to a (buffered_input, 
additional_state_info) state.

However, I don't know what changes this would require in the seek/tell 
implementations.
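
A minimal sketch of what such a tuple state could look like, assuming a subclass of the existing idna incremental encoder. The class name TupleStateIdnaEncoder is hypothetical, and the second tuple element is just a placeholder 0, mirroring what the buffered decoders report. This is not a proposed patch, and TextIOWrapper would not work with it unchanged.

```python
import encodings.idna


class TupleStateIdnaEncoder(encodings.idna.IncrementalEncoder):
    """Hypothetical variant with a (buffered_input, additional_state_info) state."""

    def getstate(self):
        # No extra state is needed here, so the second element is always 0.
        return (self.buffer, 0)

    def setstate(self, state):
        buffered_input, _additional_state_info = state
        self.buffer = buffered_input


source = TupleStateIdnaEncoder()
source.encode("x")                 # "x" is buffered, nothing is emitted yet
saved = source.getstate()

restored = TupleStateIdnaEncoder()
restored.setstate(saved)           # carry the buffered label over
print(saved, restored.encode("."))
```

Restoring the saved state keeps the buffered label, so the later "." completes it instead of producing an empty label.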




[issue20420] BufferedIncrementalEncoder violates IncrementalEncoder interface

2014-01-28 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

The documentation of IncrementalEncoder.getstate() says:


Return the current state of the encoder which must be an integer. The 
implementation should make sure that 0 is the most common state. (States that 
are more complicated than integers can be converted into an integer by 
marshaling/pickling the state and encoding the bytes of the resulting string 
into an integer).


But the implementation of BufferedIncrementalEncoder.getstate() is:

    def getstate(self):
        return self.buffer or 0

self.buffer is the unencoded input that is kept between calls to encode(), e.g. a 
string.
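
The mismatch can be reproduced with the idna codec, whose incremental encoder derives from BufferedIncrementalEncoder. This is a small sketch; the variable names are only illustrative.

```python
import codecs

# The idna codec's incremental encoder derives from BufferedIncrementalEncoder.
encoder = codecs.getincrementalencoder("idna")()

first = encoder.encode("x")        # no "." yet, so the label is only buffered
state_while_buffering = encoder.getstate()

second = encoder.encode(".")       # the label is complete and gets flushed
state_after_flush = encoder.getstate()

print(first, repr(state_while_buffering))
print(second, repr(state_after_flush))
```

While a label is buffered, getstate() returns that buffered text rather than an integer; it only returns 0 once the buffer is empty.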

--
messages: 209563
nosy: doerwalter, lemburg, loewis, serhiy.storchaka
priority: normal
severity: normal
status: open
title: BufferedIncrementalEncoder violates IncrementalEncoder interface
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4
