#4006: System.Process doesn't encode its arguments.
----------------------------------+-----------------------------------------
Reporter: Khudyakov | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 6.14.1
Component: libraries/process | Version: 6.12.1
Keywords: | Difficulty:
Os: Linux | Testcase:
Architecture: Unknown/Multiple | Failure: None/Unknown
----------------------------------+-----------------------------------------
Changes (by beroal):
* cc: rabes...@… (added)
Comment:
I tracked this issue down to 'hackage.base.Foreign.C.String.withCString*'.
These functions use
'hackage.base.Foreign.C.String.withCString.castCharToCChar', which is not
supposed to handle Unicode at all.
Test it with
{{{
import Foreign.C.String (withCStringLen)
import System.IO

main = withCStringLen "А здесь сломалось" $
  \(cs, sl) -> withBinaryFile "/tmp/user" WriteMode $
  \h -> hPutBuf h cs sl
}}}
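The truncation can also be seen without touching the file system. This is only a minimal sketch: 'viaCChar' and 'lostChars' are hypothetical helper names, and it relies on 'castCharToCChar' keeping just the low 8 bits of the code point, as the 'base' documentation describes.

```haskell
import Foreign.C.String (castCharToCChar, castCCharToChar)

-- Round-trip a character through CChar (hypothetical helper).
-- castCharToCChar keeps only the low 8 bits of the code point,
-- so anything above U+00FF comes back as a different character.
viaCChar :: Char -> Char
viaCChar = castCCharToChar . castCharToCChar

-- The characters of a string that do not survive the round trip.
lostChars :: String -> String
lostChars = filter (\c -> viaCChar c /= c)
```

In GHCi, 'lostChars' applied to an ASCII string should be empty, while every Cyrillic letter of the test string above should be reported as lost.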
So 'withCString*' does not implement the FFI addendum correctly. The
addendum says: "The
marshalling converts each Haskell character, representing a Unicode code
point, to one or more bytes in a manner that, by default, is determined by
the '''current locale'''."
See
http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi/ffise6.html#x10-420006.3
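For comparison, here is a rough sketch of marshalling with an explicit encoding. It assumes a 'base' that provides 'GHC.Foreign', which 6.12.1 does not appear to ship, and 'encodeWith' is a hypothetical helper, not an existing function.

```haskell
import qualified GHC.Foreign as GF
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)
import System.IO (TextEncoding, utf8)

-- Marshal a String with the given encoding and return the raw bytes
-- that a C function would see (hypothetical helper).
-- Assumption: GHC.Foreign.withCStringLen is available in this base.
encodeWith :: TextEncoding -> String -> IO [Word8]
encodeWith enc s =
  GF.withCStringLen enc s $ \(ptr, len) ->
    peekArray len (castPtr ptr)
```

With 'utf8' the result does not depend on the environment. Passing 'localeEncoding' from 'System.IO' instead should give the locale-dependent behaviour that the addendum describes.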
Replying to [comment:1 igloo]:
> It's not clear to me that UTF8 encoding is the right thing to do.
> Shouldn't there be a low level function which takes `[Word8]` rather than
> `String`, and then perhaps a `String` function on top of that which does
> UTF8 encoding?
If 'withCString*' worked properly, command arguments would be
encoded using the current locale. I (and, I guess, the ticket author) would
be satisfied with this solution. Whether a user should be allowed to pass
arbitrary binary data as command arguments is a broader question; the same
question arises for file names.
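The layered design suggested in the quoted comment might look roughly like the following. This is only a sketch under assumptions: 'withBytesLen', 'encodeChar' and 'withUtf8Len' are hypothetical names, not functions from 'base' or 'process', and the encoder does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)
import Foreign.C.String (CStringLen)
import Foreign.Marshal.Array (withArrayLen)
import Foreign.Ptr (castPtr)

-- Low level: hand raw bytes to a C-style buffer without interpreting them.
withBytesLen :: [Word8] -> (CStringLen -> IO a) -> IO a
withBytesLen bytes act =
  withArrayLen bytes $ \len ptr -> act (castPtr ptr, len)

-- Pure UTF-8 encoding of a single code point (1 to 4 bytes).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading bits of the code point, shifted down by k.
    hi k   = fromIntegral (n `shiftR` k)
    -- Continuation byte: 10xxxxxx carrying six bits starting at bit k.
    cont k = 0x80 .|. fromIntegral ((n `shiftR` k) .&. 0x3F)

-- High level: a String function built on top of the byte-level one.
withUtf8Len :: String -> (CStringLen -> IO a) -> IO a
withUtf8Len = withBytesLen . concatMap encodeChar
```

A caller who needs arbitrary binary arguments would use the byte-level function directly, and everyone else would go through the 'String' wrapper.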
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4006#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler