> Please tell the details about how you downloaded and saved the file to
> disk. It is impossible to know what went wrong without these details.

What went wrong is not the point. However the characters got messed up (Web site, browser, user error, cosmic rays, CIA, Al Qaeda), there is no reason not to use the escape sequence, for portability and better code legibility.

> I did it twice with two different methods:
> . Clicked the "download" link and saved the file to disk.
> . Clicked the "view" link; then, after seeing that the Unicode
>   characters are displayed incorrectly, clicked View->Encoding from
>   the menu bar, selected "Unicode UTF-8", which fixed the display;
>   then File->Save As, selected "Text files" and made sure the
>   encoding is set to UTF-8; clicked OK.
>
> Both methods gave me a valid UTF-8 encoded file that displayed
> correctly in Emacs 22.

I used the "view" link, clicking mouse-1 on it, because I wanted to look at the code before saving it. I did not scan the entire file to notice that two of the characters were displayed incorrectly, so I did not change my browser encoding - after all, this is code, which displays as plain text.

And how would one know that those two characters were in fact displayed incorrectly? How would you know what they were supposed to be? Did you read all of the code comments, and analyze the code, to come to the conclusion that the browser encoding for those two characters was incorrect? Or did you in fact know just what to look for, because you had read my bug report? That's cheating ;-).

Or did you notice the -*- coding: utf-8 -*- in the header, and realize that your current browser encoding didn't correspond to that? You said, however, that you noticed that the (two) Unicode characters were displayed incorrectly - a much harder thing to spot.

Some other methods a user might use to try to retrieve the code:

- Right-click the "download" link, and use "Save As" (as I assume you meant by "clicked the 'download' link"). Here, you can Save as type All Files. This works.
- Right-click the "view" link, use "Save Target As", and Save as type All Files, changing the suffix to "el". For some reason, this does nothing for me - no file is saved.

- Click mouse-1 on the "download" link, and use "Save As". This does default to the Unicode encoding, but, at least in my IE6 browser, there is no filter option for All Files at that point, and you must choose Save as type Text File (*.txt) (the other options involve saving as HTML pages). When I open the resulting file in Emacs 22, C-h C shows raw-text-unix, not Unicode, and the buffer is filled with null bytes (^@) - every other byte. C-x RET r utf-8 does not change what I see. The -*- coding never takes effect because each of its characters is preceded by a null character.

There are multiple ways a user might try to retrieve this code from that site, and there will be other sites that also offer the code, perhaps in other ways.

As I mentioned, I first ran into this problem on the Emacs Wiki (with the same em-dash character, in a library that is derived from buff-menu.el). Simply uploading or downloading the code on the Wiki changes the characters (in the same way, BTW). Here, the downloading user has no choice. If the normal page-edit means of uploading is used, then the characters are messed up in the file on the wiki, so regardless of how you download it, you get garbage. AFAIK, this has nothing to do with the browser.

You might not care about the Emacs Wiki, but you might care that such a problem exists there, because other sites might present similar problems.

The real point is that there is no good reason *not* to use the escape sequence in this case, and there are good reasons to use it: easier file exchange using email and Internet, and better code legibility.
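
To make the two failure modes above concrete, here is a minimal Python sketch. It rests on assumptions that the report itself does not state: that the "Text File" save produced UTF-16 (IE6 only says "Unicode"), that the garbling comes from UTF-8 bytes being read as Windows-1252, and that the escape sequence in question looks like `?\u2014`. The function names are hypothetical, chosen only for this illustration.

```python
# Assumption: the em-dash is U+2014; written here as an ASCII-only escape.
EM_DASH = "\u2014"
HYPHEN = "-"

# The file-local coding cookie mentioned in the report.
COOKIE = "-*- coding: utf-8 -*-"


def misread_utf8_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode with the wrong codec (assumed cp1252)."""
    return text.encode("utf-8").decode("cp1252")


def cookie_survives(raw: bytes) -> bool:
    """Return True if the coding cookie is still present as plain ASCII bytes."""
    return COOKIE.encode("ascii") in raw


if __name__ == "__main__":
    # A literal em-dash turns into three unrelated characters when misread.
    print(ascii(misread_utf8_as_cp1252(EM_DASH)))

    # An ASCII-only escape sequence passes through the same misreading intact.
    escaped = r"?\u2014"
    print(misread_utf8_as_cp1252(escaped) == escaped)

    # Saved as UTF-16, every ASCII character is paired with a null byte,
    # so a byte-level search for the cookie finds nothing.
    print(cookie_survives(COOKIE.encode("utf-8")))
    print(cookie_survives(COOKIE.encode("utf-16-le")))
    print(cookie_survives(COOKIE.encode("utf-16-be")))
```

Under these assumptions, the ASCII-only spelling is the only form that comes through a wrong-codec round trip unchanged, and the interleaved nulls explain why the coding cookie is never recognized.
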
The only reason given so far not to use the escape sequence was code legibility, and I pointed out that the code is in fact less legible without the escape sequence, because the em-dash and hyphen characters are indistinguishable in a fixed font. They both appear as ?-, making it impossible to tell which is which (without a comment).

This seems a no-brainer, to me. Further resistance to using the escape sequence in this case seems to me to reflect only unwillingness to see the obvious.

If, on the other hand, your concern was the Web site and how to ensure that users download Unicode code correctly, then I share that concern. You might want to include explicit instructions for how to download, and explicit mention that "view" of code that includes Unicode characters might require that you change your browser encoding to Unicode. Or something like that.

_______________________________________________
emacs-pretest-bug mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
