Hi,
Let me add that 'guess' should probably be forbidden as an encoding
parameter (instead, a separate function argument should be used as in my
proposal).
Here is a schematic example to show why:
    def append_text(filename, encoding):
        src = textfile(filename, "r", encoding)
        my_text = src.read()
        src.close()
        dst = textfile("textlist.txt", "r+", encoding)
        dst.seek_end(0)
        dst.write(my_text + "\n")
        dst.close()
With Paul's current proposal, three cases can arise:
- "encoding" is a real encoding name like iso-8859-1 or utf-8. There
should be no problems, since we assume this encoding has been configured
once and for all in the application.
- "encoding" is either "site" or "locale". This should result in the
same value run after run, since we assume the site or locale encoding
value has been configured once and for all.
- "encoding" is "guess". In this case anything can happen. One possible
outcome is that for the first file, it will result in utf-8 being
detected (or Shift-JIS, or whatever), and for the second file it will be
iso-8859-1. This will lead to a crash in the likely case that some
characters in the source file can't be represented using the character
encoding auto-detected for the destination file.
Yet the append_text() function does look correct, doesn't it?
We shouldn't hide a contextual encoding-detection algorithm under an
encoding name. It leads to semantic uncertainty.
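To make the third case concrete, here is a small runnable sketch. It uses Python's built-in open() rather than the proposed textfile(), and the two encodings are passed explicitly to stand in for whatever "guess" might detect for each file. The file names, the sample text and the demo() helper are assumptions chosen for illustration only.

```python
import os
import tempfile


def append_text(src_path, dst_path, src_encoding, dst_encoding):
    """Append the text of src_path to dst_path, each opened with its own encoding."""
    with open(src_path, "r", encoding=src_encoding) as src:
        my_text = src.read()
    with open(dst_path, "a", encoding=dst_encoding) as dst:
        dst.write(my_text + "\n")


def demo(dst_encoding):
    """Hypothetical scenario: the source is utf-8, the destination uses dst_encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "source.txt")
        dst_path = os.path.join(tmp, "textlist.txt")
        # Source file holds Japanese text, stored as utf-8.
        with open(src_path, "w", encoding="utf-8") as f:
            f.write("\u65e5\u672c\u8a9e")
        # Destination file already holds text in its own encoding.
        with open(dst_path, "w", encoding=dst_encoding) as f:
            f.write("caf\u00e9\n")
        try:
            append_text(src_path, dst_path, "utf-8", dst_encoding)
        except UnicodeEncodeError as exc:
            return type(exc).__name__
        return "ok"


print(demo("utf-8"))        # same encoding on both sides
print(demo("iso-8859-1"))   # mismatched "guessed" encodings
```

With the same encoding on both sides the append succeeds. With utf-8 detected for the source and iso-8859-1 for the destination, the write fails with a UnicodeEncodeError, even though append_text() itself has not changed.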
Regards
Antoine.
_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000