On Thu, 12 Jan 2023 at 04:31, Stephen Tucker <[email protected]> wrote:
> 1. Create BOM.txt
> 2. Input three bytes at once from BOM.txt and print them
> 3. Input three bytes one at a time from BOM.txt and print them
All of these correctly show that a file, in binary mode, reads and writes bytes.
> 4. Input three bytes at once from BOM.txt and print them
> >>> import codecs
> >>> myfil = codecs.open ("BOM.txt", mode="rb", encoding="UTF-8")
This is now a codecs file, NOT a vanilla file object. See its docs here:
https://docs.python.org/2.7/library/codecs.html#codecs.open
The docs describe the output as "codec-dependent", but with UTF-8 I
would expect read() to return Unicode text strings rather than bytes.
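
A quick way to check that assumption is to compare what the two kinds
of file object hand back. This is only a minimal sketch: the temporary
path is a hypothetical stand-in for your BOM.txt, and I'm assuming the
file holds nothing but the three BOM bytes.

```python
import codecs
import os
import tempfile

# Hypothetical scratch file standing in for BOM.txt (assumed contents:
# only the three-byte UTF-8 byte order mark).
path = os.path.join(tempfile.mkdtemp(), "BOM.txt")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbf")

# Vanilla file object in binary mode: read() returns raw bytes.
with open(path, "rb") as plain:
    from_plain = plain.read()

# codecs file object: read() returns decoded text, not bytes.
with codecs.open(path, mode="rb", encoding="UTF-8") as wrapped:
    from_codecs = wrapped.read()

print(type(from_plain), repr(from_plain))
print(type(from_codecs), repr(from_codecs))
```

The two results are different types even though they came from the same
file, which is why the two objects can't be expected to count in the
same units.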
> 5. Attempt to input three bytes one at a time from BOM.txt and print them
> -------------------------------------------------------------------------
>
> >>> myfil = codecs.open ("BOM.txt", mode="rb", encoding="UTF-8")
> >>> myBOM_4 = myfil.read (1)
> >>> myBOM_4
> u'\ufeff'
> A. The attempt at Part 5 actually inputs all three bytes when we ask it to
> input just the first one!
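On the contrary; you asked it for one *character* and it read one character.
U+FEFF takes three bytes (EF BB BF) when encoded as UTF-8, so reading
that one character consumes all three bytes of the file.
Where were you seeing documentation that disagreed with this?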
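
To make the character-versus-byte distinction concrete, here is a rough
sketch of your Part 5 (again with a hypothetical temporary file in place
of BOM.txt, assumed to contain only the BOM):

```python
import codecs
import os
import tempfile

# Hypothetical scratch file standing in for BOM.txt (assumed contents:
# only the three-byte UTF-8 byte order mark).
path = os.path.join(tempfile.mkdtemp(), "BOM.txt")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbf")

with codecs.open(path, mode="rb", encoding="UTF-8") as wrapped:
    first = wrapped.read(1)  # ask for one character
    rest = wrapped.read(1)   # ask for another; the file is exhausted

# Re-encoding the single character shows how many bytes it occupied.
encoded = first.encode("UTF-8")

print(repr(first), len(first))
print(repr(encoded), len(encoded))
print(repr(rest))
```

The length of `first` is counted in characters and the length of
`encoded` in bytes, so one read(1) on the codecs file accounts for the
whole three-byte file.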
ChrisA