On Sep 16, 5:28 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Sep 17, 7:54 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
>
> > On Sep 16, 2:22 pm, Steve Holden <[EMAIL PROTECTED]> wrote:
>
> > > [EMAIL PROTECTED] wrote:
> > > > On Sep 16, 1:10 pm, Dennis Lee Bieber <[EMAIL PROTECTED]> wrote:
> > > >> On Sun, 16 Sep 2007 01:46:34 -0700, GeorgeRXZ <[EMAIL PROTECTED]>
> > > >> declaimed the following in comp.lang.python:
>
> > > >>> Then Open the Notepad and type the following sentence, and save the
> > > >>> file and close the notepad. Now reopen the file and you will find out
> > > >>> that, Notepad is not able to save the following text line.
> > > >>> Well you are speed
> > > >>> This occurs not only with above sentence but any sentence that has
> > > >>> 4 3 3 5 (sequence of characters: Well=4 you=3 are=3 speed=5)
> > > >> I tried. I also opened the saved file in SciTE...
> > > >> And the text WAS there...
>
> > > >> It is Notepad that can not properly render what it,
> > > >> itself, saved.
>
> > > > C:\Documents and Settings\mensanator\My Documents>type huh.txt
> > > > Well you are speed
>
> > > > Yes, file was saved correctly.
> > > > But reopening it shows 9 unprintable characters.
> > > > If I copy those to a new file (huh1.txt):
>
> > > > C:\Documents and Settings\mensanator\My Documents>type huh1.txt
> > > > ?????????
>
> > > > But wait...the new file is 20 characters, not 9.
>
> > > > 09/16/2007 01:44 PM 18 huh.txt
> > > > 09/16/2007 01:54 PM 20 huh1.txt
>
> > > > C:\Documents and Settings\mensanator\My Documents>dump huh.txt
> > > > huh.txt:
> > > > 00000000 5765 6c6c 2079 6f75 2061 7265 2073 7065  Well you are spe
> > > > 00000010 6564                                     ed
>
> > > > Here's what it's actually doing:
>
> > > > C:\Documents and Settings\mensanator\My Documents>dump huh1.txt
> > > > huh1.txt:
> > > > 00000000 fffe 5765 6c6c 2079 6f75 2061 7265 2073  .~Well you are s
> > > > 00000010 7065 6564                                peed
>
> > > One word: Unicode.
>
> > > The "open" and "save" dialogs allow you to specify an encoding.
>
> > And the encoding specified was ANSI.
>
> > > If you
> > > specify Unicode then you will get what you see above.
>
> > And if you specify ANSI _before_ you click the file name,
> > the specification switches to Unicode and has to then
> > be manually switched back to ANSI.
>
> > > If you specify ANSI
> > > you will get the text you entered.
>
> > It's still a bug in the "open" dialog.
>
> It's more like a bug/feature in its encoding detector.
It is NOT a feature. If I save something as ANSI, there is
no excuse for it not to re-open in ANSI.

> I can get it to
> switch to Unicode only if there's an even number of characters AND the
> line is NOT terminated by CRLF -- add/remove one alpha character, or
> hit the enter key at the end of the line, and it won't detect it as
> Unicode when you open it again.
>
> You only get the BOM (0xfffe) if you are silly enough to save it while
> it's open in Unicode mode.

That was a test. I wasn't so stupid as to save to the
original file, but to make a copy.

>
> > > By the way, this has precisely what to do with Python?
>
> > I've been known to use Notepad to create Python
> > source code.
>
> Your source code would have to be trivially short to trigger the
> strange behaviour.

Makes you wonder what other edge cases aren't handled
properly.

Makes you wonder why Microsoft doesn't employ professional
programmers.
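
Since the question was what this has to do with Python: the byte counts in the dumps above can be reproduced in a few lines. This is only a sketch of the arithmetic, not of whatever detection Notepad really performs. The helper names (misread_as_utf16, save_as_unicode) are made up for illustration, and it assumes the "Unicode" save option means little-endian UTF-16 with a BOM, which is what the ff fe in the huh1.txt dump suggests.

```python
import codecs

TEXT = "Well you are speed"  # the 4-3-3-5 sentence from the thread


def misread_as_utf16(ansi_bytes):
    """Decode ANSI bytes as if they were UTF-16-LE, pairing bytes into code units."""
    return ansi_bytes.decode("utf-16-le")


def save_as_unicode(chars):
    """Assumed 'Unicode' save: little-endian BOM (ff fe) plus UTF-16-LE data."""
    return codecs.BOM_UTF16_LE + chars.encode("utf-16-le")


raw = TEXT.encode("ascii")          # what huh.txt holds: 18 bytes
garbled = misread_as_utf16(raw)     # the same bytes read two at a time: 9 characters
resaved = save_as_unicode(garbled)  # what huh1.txt holds: 20 bytes

print(len(raw), len(garbled), len(resaved))   # 18 9 20
print(resaved[:2].hex(), resaved[2:] == raw)  # fffe True

# An odd byte count cannot be split into 2-byte units, so this misreading
# is not even possible once one more character is added.
try:
    misread_as_utf16(raw + b"y")
except UnicodeDecodeError:
    print("odd length: not valid UTF-16")
```

So the 18 bytes on disk never changed. Read as 2-byte units they become 9 characters, and saving those 9 characters in Unicode mode adds the 2-byte BOM, hence 20. The odd-length case at the end is consistent with the "add/remove one alpha character" observation, though it says nothing about how the detector treats a trailing CRLF.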