[Python-ideas] Re: Adding `open_text()` builtin function. (relating to PEP 597)

Steven D'Aprano Sat, 23 Jan 2021 03:02:57 -0800

On Sat, Jan 23, 2021 at 12:40:55AM -0500, Random832 wrote:
> On Fri, Jan 22, 2021, at 20:34, Inada Naoki wrote:
> > * Default encoding is "utf-8".
> 
> It might be worthwhile to be a little more sophisticated than this.
> 
> Notepad itself uses character set detection [it might not be 
> reasonable to do this on the whole file as notepad does, but maybe the 
> first 512 bytes, or the result of read1(512)?] when opening a file of 
> unknown encoding, and msvcrt's "ccs=UTF-8" option to fopen will at 
> least detect the presence of UTF-8 and UTF-16 BOMs [and treat the 
> file as UTF-16 in the latter case].



I like Random's idea. If we add a new "open text file" builtin function, 
we should seriously consider having it attempt to auto-detect the 
encoding. It need not be as sophisticated as `chardet`.
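
To make the idea concrete, here is a minimal sketch of what BOM-only 
detection might look like, roughly in the spirit of the "ccs=UTF-8" 
behaviour quoted above. It is an illustration under stated assumptions, 
not a proposed implementation: the helper name `sniff_encoding`, the 
`default` parameter, and the read-only signature are all hypothetical, 
and nothing here comes from PEP 597 itself.

```python
import codecs

# Hypothetical table of BOMs to look for. UTF-32 is checked before UTF-16
# because the UTF-32-LE BOM begins with the same bytes as the UTF-16-LE BOM.
# (A UTF-16-LE file whose first character is NUL would be misdetected.)
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_encoding(head, default="utf-8"):
    """Return a codec name based on a leading BOM, else the default."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def open_text(path, default="utf-8", **kwargs):
    """Open an existing file for reading as text (sketch only).

    Only BOMs are inspected; no statistical detection is attempted.
    """
    # The longest BOM is four bytes, so a tiny read is enough here.
    with open(path, "rb") as f:
        head = f.read(4)
    return open(path, "r", encoding=sniff_encoding(head, default), **kwargs)
```

The "utf-8-sig", "utf-16" and "utf-32" codecs all consume the BOM while 
decoding, so the caller never sees it. Anything smarter than this, such 
as guessing from the first 512 bytes, would need a real detection 
heuristic on top.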

That auto-detection behaviour could be enough to differentiate it from 
the regular open(), thus solving the "but in ten years' time it will be 
redundant and will need to be deprecated" objection.

Having said that, I can't say I'm very keen on the name "open_text", but 
I can't think of any other bikeshed colour I prefer.


-- 
Steve
