Hello,

The earlier investigation indicated that darcs decodes the contents of files 
that it reads with readLocaleFile, such as the file read when specifying the 
--logfile option, using "the console's encoding". To me, this is a debatable 
choice. 
 
I seem to recall a fairly recent discussion, related to darcs or perhaps GHC, 
that brought up questions like this, so perhaps there is some agreement that I 
am simply unaware of. If so, I would be grateful for a reference to such a 
conclusion. Otherwise, I would like to hear some answers to the question: Is 
decoding the contents of files using "the console's encoding" really suitable? 
Or should some other mechanism be used? Perhaps controlled by 
settings/options/parameters?

----- Original message -----
> From: Reinier Lamers <[email protected]>
> To: [email protected]
> Date: Sat, 03 Apr 2010 14:19
> Subject: Re: [darcs-users] Debugging issue1739-escape-multibyte-chars-correctly.sh on tn23
> ...
> It looks like there's actually a bug in the script so that it depends on the
> locale (i.e., it works with LC_ALL=da_DK.UTF-8 but fails with LC_ALL='').
> Perhaps the script should try to detect what the locale encoding is and bomb
> out if it's not UTF-8 or if it can't be detected.
> 
> Reinier

That depends. If decoding the contents of a file using "the console's 
encoding" is considered proper, then, yes, a script that critically depends on 
exactly how the contents of a file are decoded needs to ensure that "the 
console's encoding" is set properly or, alternatively, fail. But if decoding 
the contents of a file using "the console's encoding" is considered improper, 
the question is more open.
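
For illustration, here is a minimal sketch of the "detect and fail" variant 
that Reinier suggests. It is not taken from the actual test script: the 
function names is_utf8_charmap and require_utf8_locale are hypothetical, and I 
am assuming that the POSIX "locale charmap" command is available on the 
buildbot slaves.

```shell
#!/bin/sh
# Hypothetical guard for a locale-sensitive test script (not from darcs).

# Succeed only if the given charmap name denotes UTF-8.
# Accepts "UTF-8" or "UTF8" in any letter case.
is_utf8_charmap() {
    case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
        UTF-8|UTF8) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the system for the current locale's codeset and complain on stderr
# if it is not UTF-8 or cannot be determined.
require_utf8_locale() {
    charmap=$(locale charmap 2>/dev/null) || charmap=''
    if ! is_utf8_charmap "$charmap"; then
        echo "locale encoding is '${charmap:-undetected}', not UTF-8" >&2
        return 1
    fi
}
```

A script would call something like "require_utf8_locale || exit 1" near the 
top; which exit code the test harness should see in that case is a separate 
decision that I have not looked into.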

As a practical matter, just to get the buildbot lights green, it is, of course, 
easy to simply ensure that the tn23 buildbot slave sets LC_ALL=da_DK.UTF-8. 

Best regards
Thorkil
