On Fri, 27 May 2022 at 12:01, Stephen J. Turnbull
wrote:
>
> Chris Angelico writes:
>
> > If I'm reading this correctly, the result from f.tell() has enough
> > information to reconstruct a position within a hypothetical array
> > of code points contained within the file (that is to say - if yo
Chris Angelico writes:
> If I'm reading this correctly, the result from f.tell() has enough
> information to reconstruct a position within a hypothetical array
> of code points contained within the file (that is to say - if you
> read the entire file into a string, f.tell() returns something t
On Thu, 26 May 2022 at 22:07, Eryk Sun wrote:
>
> On 5/26/22, Steven D'Aprano wrote:
> >
> > If you seek() to position 4, say, the results will be unpredictable but
> > probably not anything good.
> >
> > In other words, the tell() and seek() cookies represent file positions
> > in **bytes**, eve
On 5/26/22, Steven D'Aprano wrote:
>
> If you seek() to position 4, say, the results will be unpredictable but
> probably not anything good.
>
> In other words, the tell() and seek() cookies represent file positions
> in **bytes**, even though we are reading or writing a text file.
To clarify the
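
For readers following along, here is a minimal sketch of the cookie
behaviour discussed above. It is not taken from any message in the thread.
The helper name tell_seek_roundtrip and the sample text are hypothetical,
and it assumes a UTF-8 file. It records f.tell() at each character
boundary, then seeks back to each cookie and reads the rest of the file.

```python
import os
import tempfile


def tell_seek_roundtrip(text="a\u20acb", encoding="utf-8"):
    """Return (char_index, cookie, remaining_text) for every character boundary.

    Hypothetical helper for illustration only; the cookie values are
    whatever f.tell() reports and should be treated as opaque.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="" disables newline translation so the text round-trips as-is.
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(text)

        results = []
        with open(path, "r", encoding=encoding, newline="") as f:
            # Cookie before the first character, then one after each character.
            cookies = [f.tell()]
            while f.read(1):
                cookies.append(f.tell())

            # Seeking to a cookie obtained from tell() is the supported use.
            for index, cookie in enumerate(cookies):
                f.seek(cookie)
                results.append((index, cookie, f.read()))
        return results
    finally:
        os.remove(path)


if __name__ == "__main__":
    for index, cookie, rest in tell_seek_roundtrip():
        print(index, cookie, repr(rest))
```

On CPython with UTF-8 the cookies usually coincide with byte offsets, so
they advance by more than one when a multi-byte character such as U+20AC
is read. The only portable guarantee is that a value returned by tell()
can be passed back to seek().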
On Tue, May 24, 2022 at 04:31:13AM -, mguin...@gmail.com wrote:
> seek() and tell() work with opaque values, called cookies.
> This is close to low level details, but it is not pythonic.
Even after reading the issue you linked to, I am not sure I understand
either the issue, or your suggest
On Thu, May 26, 2022 at 08:28:24PM +1000, Steven D'Aprano wrote:
> Narrow builds were UCS-2; wide builds were UTF-32.
To be more precise, narrow builds were sort of a hybrid between an
incomplete version of UTF-16 and a superset of UCS-2.
Like UTF-16, if your code point was above U+, it wou
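
As an aside for anyone unfamiliar with the UTF-16 mechanism referred to
here: a code point above U+FFFF is represented as two 16-bit code units,
a surrogate pair. The sketch below is only an illustration on a current
Python 3, not a reconstruction of how narrow builds worked internally.
The function names are hypothetical.

```python
import struct


def to_surrogate_pair(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the 20-bit offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the 20-bit offset
    return high, low


def utf16_code_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


if __name__ == "__main__":
    ch = "\U0001F600"
    print(len(ch))                                   # one code point
    print([hex(u) for u in utf16_code_units(ch)])    # two code units
    print([hex(u) for u in to_surrogate_pair(ord(ch))])
```

The manual arithmetic and the utf-16-le codec should agree on the pair
for any code point in the supplementary planes.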
On Wed, May 25, 2022 at 06:16:50PM +0900, Stephen J. Turnbull wrote:
> mguin...@gmail.com writes:
>
> > There should be a safer abstraction to these two basic functions.
>
> There is: TextIOBase.read, then treat it as an array of code units
> (NOT CHARACTERS!!)
No need to shout :-)
Reading the
On 5/26/22, Christopher Barker wrote:
> IIRC, there were two builds- 16 and 32 bit Unicode. But it wasn’t UTF16, it
> was UCS-2.
In the old implementation prior to 3.3, narrow and wide builds were
supported regardless of the size of wchar_t. For a narrow build, if
wchar_t was 32-bit, then PyUnico