As a follow-up, I found this post (http://stackoverflow.com/a/388500); 
the comments from bobince are especially relevant (yes, that bobince 
<http://stackoverflow.com/a/1732454>, who knows a thing or two about 
Unicode).

So you can change the code page of the shell to one that supports UTF-8. 
If you add

    chcp 65001

to the start of the test script from my earlier message, the script does in 
fact create the file "é.in" and Tup processes it.  However, Tup still 
fails with these errors:

tup error: File 'D:\home\gavin\ws\test\é.out' was written to, but is not in .tup/db. You probably should specify it as an output
 -- Delete: D:\home\gavin\ws\test\é.out
tup error: Expected to write to file '\351.out' from cmd 23 but didn't
 *** Command ID=23 ran successfully, but tup failed to save the dependencies.


(The \351 is Emacs's way of printing the unmappable codepoint 4194281, 
i.e. the raw byte 0xE9.)
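
To make the arithmetic behind that \351 concrete, here is a small Python 
sketch. It only illustrates the byte values involved; that Tup's internal 
name for the file holds the single byte 0xE9 is my reading of the error 
output, not something I have confirmed in Tup's source. The variable names 
are mine.

```python
# 'é' is U+00E9. Compare its single-byte (Latin-1) and UTF-8 encodings.
name = "\u00e9.out"

latin1_bytes = name.encode("latin-1")  # one byte for 'é': 0xE9
utf8_bytes = name.encode("utf-8")      # two bytes for 'é': 0xC3 0xA9

# Emacs displays a raw (undecodable) byte as a backslash plus its octal value.
octal_escape = "\\{:o}".format(latin1_bytes[0])

# Emacs represents raw bytes 0x80..0xFF as characters 0x3FFF80..0x3FFFFF.
emacs_raw_char = 0x3FFF00 + latin1_bytes[0]

# A lone 0xE9 followed by '.' is not valid UTF-8, so a UTF-8 reader
# (here: Emacs) cannot map it to a character.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False

print(octal_escape, emacs_raw_char, utf8_bytes, lone_byte_is_valid_utf8)
```

If that reading is right, the name in the second error ('\351.out') and the 
name of the file actually written ('é.out') are the same name in two 
different encodings, which would explain why Tup treats them as two 
different files.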

To quote bobince from the above-linked comment:

> Note there are serious implementation bugs in Windows's code page 65001 
> support which will break many applications that rely on the C standard 
> library IO methods, so this is very fragile. (Batch files also just stop 
> working in 65001.) Unfortunately UTF-8 is a second-class citizen in Windows.


But regardless of the code page, my experience remains that even when Tup 
does the right thing in the outside world -- which it can do for "unicode" 
filenames -- its internal behavior is not consistent with that.  So I can't 
quite conclude that this is just a Windows thing.

Thankful for any insights,
Gavin
