#3341: encoding errors could be handled better
-------------------------------+--------------------------------------------
Reporter: judahj | Owner:
Type: bug | Status: new
Priority: high | Milestone: 6.12.1
Component: libraries/base | Version: 6.11
Severity: normal | Resolution:
Keywords: | Difficulty: Unknown
Testcase: | Os: MacOS X
Architecture: x86 |
-------------------------------+--------------------------------------------
Comment (by judahj):
Replying to [comment:1 simonmar]:
> What do you mean by a "Latin-1 non-ASCII character"? e.g. a byte
> between 0x80 and 0xBF should elicit an error immediately, whereas a byte
> between 0xC0 and 0xDF will require one extra byte to determine whether
> there is a decoding error or not. I do think there's a bug here though:
> if the bytes 0xE0 0x00 are received, then GHC will wait for one more byte
> before raising an error, even though the sequence is already erroneous.
>
OK, that's a better explanation of what I was seeing. As a further
example, if I enter into a UTF-8 terminal the invalid byte sequence [0xf7,
0x97] (which corresponds to a Latin-1 'ù' followed by an 'a'), then GHC
incorrectly waits for two more bytes before raising an error.
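To make the cases above concrete, here is a rough sketch of how a prefix of
received bytes could be classified as already erroneous, still incomplete,
or complete. This is not GHC's decoder: the names `Prefix`,
`continuationCount` and `classifyPrefix` are hypothetical, the lead-byte
ranges are the ones from RFC 3629 (so 0xC0, 0xC1 and 0xF5 and above are
rejected outright), and the narrower second-byte ranges that rule out
overlong forms and surrogates are deliberately left unchecked.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- | Hypothetical verdict on the bytes received so far.
data Prefix
  = Invalid       -- already erroneous; no further input can repair it
  | NeedMore Int  -- well-formed so far; this many more bytes are required
  | Complete      -- exactly one whole encoded code point
  deriving (Eq, Show)

-- | Continuation bytes called for by a lead byte (RFC 3629 ranges),
-- or Nothing if the byte cannot start a sequence.
continuationCount :: Word8 -> Maybe Int
continuationCount b
  | b < 0x80               = Just 0
  | b >= 0xC2 && b <= 0xDF = Just 1
  | b >= 0xE0 && b <= 0xEF = Just 2
  | b >= 0xF0 && b <= 0xF4 = Just 3
  | otherwise              = Nothing

-- | True for bytes of the form 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- | Classify a prefix without waiting for bytes that cannot change the answer.
classifyPrefix :: [Word8] -> Prefix
classifyPrefix [] = NeedMore 1
classifyPrefix (lead : rest) =
  case continuationCount lead of
    Nothing -> Invalid
    Just n
      | not (all isContinuation rest) -> Invalid
      | length rest > n               -> Invalid
      | length rest == n              -> Complete
      | otherwise                     -> NeedMore (n - length rest)
```

Under these assumptions, `classifyPrefix [0xE0, 0x00]` is `Invalid` as soon
as the second byte arrives, while `classifyPrefix [0xC3]` is `NeedMore 1`.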
However, from a UI perspective it seems to me that when `NoBuffering` is
set we should not be pausing to wait for more input. Instead, when a
partial byte sequence is read but `hReady` is false we should raise an
error. (This is the current behavior of Haskeline.) Thus, when a user
has the wrong LANG set and types a character they will immediately get
feedback that something's off.
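As an illustration of the policy I have in mind (a sketch only, not
Haskeline's or GHC's actual code), the reader blocks for the first byte of a
sequence but never for the rest: if a sequence has started and no input is
pending, it reports an error straight away. The `Source` record,
`readSequence` and `listSource` are hypothetical names; `pending` stands in
for `hReady` and `nextByte` for a blocking one-byte read.

```haskell
import Data.Bits ((.&.))
import Data.IORef (atomicModifyIORef', newIORef, readIORef)
import Data.Word (Word8)

-- | Hypothetical byte source: 'pending' plays the role of 'hReady',
-- 'nextByte' that of a blocking single-byte read.
data Source = Source
  { pending  :: IO Bool
  , nextByte :: IO Word8
  }

-- | Continuation bytes expected after a lead byte (RFC 3629 ranges).
expected :: Word8 -> Maybe Int
expected b
  | b < 0x80               = Just 0
  | b >= 0xC2 && b <= 0xDF = Just 1
  | b >= 0xE0 && b <= 0xEF = Just 2
  | b >= 0xF0 && b <= 0xF4 = Just 3
  | otherwise              = Nothing

-- | Read the bytes of one code point. Only the lead byte may block; once a
-- sequence has started, missing input is reported instead of waited for.
readSequence :: Source -> IO (Either String [Word8])
readSequence src = do
  lead <- nextByte src
  case expected lead of
    Nothing -> pure (Left "invalid lead byte")
    Just n  -> go n [lead]
  where
    go :: Int -> [Word8] -> IO (Either String [Word8])
    go 0 acc = pure (Right (reverse acc))
    go n acc = do
      more <- pending src
      if not more
        then pure (Left "incomplete byte sequence")
        else do
          b <- nextByte src
          if b .&. 0xC0 == 0x80
            then go (n - 1) (b : acc)
            else pure (Left "invalid continuation byte")

-- | In-memory source for experimenting: reports no pending input once the
-- list is exhausted.
listSource :: [Word8] -> IO Source
listSource bytes = do
  ref <- newIORef bytes
  let pop = atomicModifyIORef' ref step
      step []       = ([], 0)  -- exhausted; a real handle would block here
      step (x : xs) = (xs, x)
  pure (Source (not . null <$> readIORef ref) pop)
```

With this sketch, `readSequence =<< listSource [0xC3]` yields
`Left "incomplete byte sequence"` rather than waiting for a second byte.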
> In this example:
>
> {{{
> ghc -e "putStrLn \"\\249\"" | ./badchar
> }}}
>
> bear in mind that ghc is using the locale encoding to output '\249', and
> then decoding it again on input. I think you're seeing the correct result
> here.
Yes, you're correct, sorry.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3341#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs