Hello,

The earlier investigation indicated that darcs decodes the contents of files that it reads with readLocaleFile, such as the file read when specifying the --logfile option, using "the console's encoding". To me, this is a debatable choice.

I seem to recall a fairly recent discussion, related to darcs or perhaps GHC, that brought up questions like this, so perhaps there is some agreement that I am simply unaware of. If so, I would be grateful for a reference to such a conclusion.

Otherwise, I would like to hear some answers to the question: Is decoding the contents of files using "the console's encoding" really suitable? Or should some other mechanism be used? Perhaps controlled by settings/options/parameters?
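To make the question concrete, here is a minimal sketch in Python. It is purely illustrative and is not darcs code; the names LOG_BYTES, decode_log and is_utf8 are made up for this example. It shows that the same bytes on disk yield different text, or fail to decode at all, depending on which encoding the reader assumes, and how a script might check whether an encoding name refers to UTF-8.

```python
import codecs

# Bytes as they would sit on disk if a log file had been saved as UTF-8.
# (Hypothetical sample content, chosen because it contains non-ASCII letters.)
LOG_BYTES = "blåbærgrød".encode("utf-8")


def decode_log(raw: bytes, encoding: str) -> str:
    """Decode raw file bytes with the given encoding.

    Raises UnicodeDecodeError if the bytes are not valid in that encoding.
    """
    return raw.decode(encoding)


def is_utf8(encoding: str) -> bool:
    """Return True if the encoding name is an alias of UTF-8 (e.g. 'UTF8')."""
    try:
        return codecs.lookup(encoding).name == "utf-8"
    except LookupError:
        # Unknown encoding name: treat it as "not UTF-8".
        return False
```

With these definitions, decode_log(LOG_BYTES, "utf-8") returns the original text, decode_log(LOG_BYTES, "latin-1") silently returns different (garbled) text, and decode_log(LOG_BYTES, "ascii") raises UnicodeDecodeError. Which of these a reader gets under "the console's encoding" depends entirely on the environment the program happens to run in.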
----- Original message -----
> From: Reinier Lamers <[email protected]>
> To: [email protected]
> Date: Sat, 03 Apr 2010 14:19
> Subject: Re: [darcs-users] Debugging issue1739-escape-multibyte-chars-correctly.sh on tn23
>
> ...
> It looks like there's actually a bug in the script so that it depends on the
> locale (i.e., it works with LC_ALL=da_DK.UTF-8 but fails with LC_ALL='').
> Perhaps the script should try to detect what the locale encoding is and bomb
> out if it's not UTF-8 or if it can't be detected.
>
> Reinier

The answer depends: If decoding the contents of a file using "the console's encoding" is considered proper, then, yes, a script that critically depends on the exact manner in which the contents of a file are decoded needs to ensure that "the console's encoding" is set properly or, alternatively, fail. But if decoding the contents of a file using "the console's encoding" is considered improper, the field is more open.

As a practical matter, just to get the buildbot lights green, it is, of course, easy to simply ensure that the tn23 buildbot slave sets LC_ALL=da_DK.UTF-8.

Best regards
Thorkil
