On 2014-02-13 00:59, Mark Lawrence wrote:
> > >>> s = "\u3141"  # HANGUL LETTER MIEUM
> > >>> f = open('test.txt', 'w')
> > >>> f.write("\u3141")
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > UnicodeEncodeError: 'ascii' codec can't encode character '\u3141'
> > in position 0: ordinal not in range(128)
> >
> > Just because the open() call hides the specification of how Python
> > should do that encoding doesn't prevent the required encoding from
> > happening. :-)
>
> Which clearly reinforces the fact that what you originally said is
> incorrect, I don't have to do anything, Python very kindly does
> things for me under the covers.

...and when they break, you get to keep both pieces. :)

If you don't know that encoding is being done, it's a lot harder to
trust the assumption that you can directly write strings to files
when exceptions like the above happen.

My original point (though perhaps not conveyed as well as I'd
intended) was that only bytes get written to the disk, and that some
encoding must take place. It can be done implicitly using some
defaults which may break (as demoed), whereas one would be better off
doing it explicitly such as Chris shows:

  >>> f = open('test.txt', 'w', encoding='utf-8')
  >>> f.write("\u3141")
  1

UTF-8'rs gonna 8. (or whatever memes the cool kids are riffing these
days)

-tkc
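
P.S. A minimal sketch of the bytes-on-disk point, assuming Python 3.
The temp-file path and the variable names are made up for
illustration and are not from anyone's earlier message. It writes the
same character with an explicit encoding, then reopens the file in
binary mode to see what was actually stored:

```python
import os
import tempfile

text = "\u3141"  # HANGUL LETTER MIEUM

# Hypothetical scratch file, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# Explicit encoding: no dependence on the locale's default codec.
with open(path, "w", encoding="utf-8") as f:
    chars_written = f.write(text)  # write() counts characters, not bytes

# Binary mode shows the bytes that really landed on disk.
with open(path, "rb") as f:
    raw = f.read()

# Encoding by hand yields the same bytes that open() produced for us.
assert raw == text.encode("utf-8")
print(chars_written, len(raw), raw)

# The ASCII codec cannot represent this character, which is what the
# traceback quoted above reports when ASCII is the implicit default.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc
    print(type(exc).__name__)
```

One character goes in and three bytes come out, so the encoding step
is there whether or not you spell it out in the open() call.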