#5061: Implement FFI spec behaviour for *CString family
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |          Owner:
        Type:  bug               |         Status:  new
    Priority:  normal            |      Component:  Compiler
     Version:  7.0.3             |       Keywords:
    Testcase:                    |      Blockedby:
          Os:  Unknown/Multiple  |       Blocking:
Architecture:  Unknown/Multiple  |        Failure:  None/Unknown
---------------------------------+------------------------------------------
Although the FFI spec requires the *CString functions in Foreign.C.String
to use the locale encoding to interpret the supplied CString, the
implementation currently uses ASCII.
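
To make the distinction between the two families concrete, here is a minimal
sketch (not part of the patch) that inspects the bytes each family hands to C.
The helper name bytesVia is hypothetical. Only the CAString result is fixed;
what withCString produces for non-ASCII input depends on the implementation and
the locale, so the sketch passes it ASCII input only.

```haskell
import Data.Word (Word8)
import Foreign.C.String (CString, withCAString, withCString)
import Foreign.Marshal.Array (peekArray0)
import Foreign.Ptr (castPtr)

-- Hypothetical helper: marshal a String with the given "with" function
-- and read back the NUL-terminated bytes that C would see.
bytesVia :: (String -> (CString -> IO [Word8]) -> IO [Word8])
         -> String -> IO [Word8]
bytesVia with s = with s (peekArray0 0 . castPtr)

main :: IO ()
main = do
  -- CAString family: each Char is truncated to its low 8 bits.
  bytesVia withCAString "caf\233" >>= print
  -- CString family: ASCII input looks the same in any ASCII-compatible
  -- encoding; non-ASCII input is locale-dependent and is left out here.
  bytesVia withCString "cafe" >>= print
```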
I have implemented this feature, and at the same time have changed the
behaviour of the encoder upon encountering characters it cannot encode to
silently ignore them. The rationale is simply that the previously specified
behaviour (replace them with ?) was also unimplemented, and ignoring them
is marginally easier.
Here are some discussion points:
1. It seems a shame not to expose a general interface to peek CStrings
in any supported TextEncoding.
2. I'm not a fan of either the "ignore" error handling behaviour or the
"replace with ?" behaviour. In my opinion we should throw an exception
upon encoding failure, because how to recover in this situation will in
general depend on the user application.
3. I could implement the replace-with-? error handling behaviour with
modest extra effort, if it is deemed necessary.
4. To ensure that this patch does not change the behaviour of GHC in any
way, I replaced every instance of a *CString function with a call to the
CAString equivalent, and marked the source with a comment of the form "--
UNICODE". The intention is that if and when this patch is accepted I will
then go back and figure out what is really going on in each case and
choose the correct function to call.
5. Some of the occurrences of CString in my GHC repo came from projects
with a distinct upstream, such as Cabal. Should I be submitting these
patches upstream rather than here?
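
As an illustration of the marking convention in point 4, here is a sketch of
what a converted call site might look like. The function cLength and the choice
of strlen are hypothetical and are not taken from the GHC sources. The call
uses the CAString variant so behaviour stays the same, and the trailing comment
flags it for later review.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCAString)
import Foreign.C.Types (CSize(..))

foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical call site: previously written with withCString, now pinned
-- to the 8-bit CAString variant and marked for a later decision.
cLength :: String -> IO Int
cLength s = fromIntegral `fmap` withCAString s c_strlen -- UNICODE

main :: IO ()
main = cLength "hello" >>= print
```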
I note that if we did expose a version of the CString functions that took
a TextEncoding, it would then be easy for the user to decode ignoring
errors instead, because they could simply supply a TextEncoding with
different error-handling behaviour.
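
A sketch of what such an interface looks like in use. It relies on the
GHC.Foreign module and the "//IGNORE" suffix accepted by mkTextEncoding, both
of which exist only in versions of base later than the one shipped with 7.0.3.
Treat it as an assumption about the eventual shape of the API rather than as
part of this patch. The helper names roundTripWith and decodeBytes are
hypothetical.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (withArray0)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GHC
import System.IO (TextEncoding, mkTextEncoding, utf8)

-- Marshal a String to a C string and back using an explicit encoding.
roundTripWith :: TextEncoding -> String -> IO String
roundTripWith enc s = GHC.withCString enc s (GHC.peekCString enc)

-- Decode raw NUL-terminated bytes using an explicit encoding.
decodeBytes :: TextEncoding -> [Word8] -> IO String
decodeBytes enc bytes = withArray0 0 bytes (GHC.peekCString enc . castPtr)

main :: IO ()
main = do
  roundTripWith utf8 "caf\233" >>= print
  -- The caller chooses the error handling by choosing the encoding:
  -- this one drops undecodable input, where plain utf8 would raise
  -- an exception on the invalid byte 0xFF.
  lenient <- mkTextEncoding "UTF-8//IGNORE"
  decodeBytes lenient [0x41, 0xFF, 0x42] >>= print
```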
I have validated this patch on Windows and OS X, and not seen any
reproducible failures above and beyond the usual set.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5061>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler