#3341: encoding errors could be handled better
-------------------------------+--------------------------------------------
    Reporter:  judahj          |        Owner:         
        Type:  bug             |       Status:  new    
    Priority:  high            |    Milestone:  6.12.1 
   Component:  libraries/base  |      Version:  6.11   
    Severity:  normal          |   Resolution:         
    Keywords:                  |   Difficulty:  Unknown
    Testcase:                  |           Os:  MacOS X
Architecture:  x86             |  
-------------------------------+--------------------------------------------
Comment (by judahj):

 Replying to [comment:1 simonmar]:
 > What do you mean by a "Latin-1 non-ASCII character"?  e.g. a byte
 > between 0x80 and 0xBF should elicit an error immediately, whereas a byte
 > between 0xC0 and 0xDF will require one extra byte to determine whether
 > there is a decoding error or not.  I do think there's a bug here though:
 > if the bytes 0xE0 0x00 are received, then GHC will wait for one more byte
 > before raising an error, even though the sequence is already erroneous.
 >

 OK, that's a better explanation of what I was seeing.  As a further
 example, if I enter into a utf-8 terminal the invalid byte sequence [0xf7,
 0x97] (which corresponds to a latin-1 'ù' followed by an 'a'), then GHC
 incorrectly waits for two more bytes before raising an error.
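
 To make the "already erroneous" point concrete, here is a minimal sketch
 of a prefix classifier for a single UTF-8 sequence.  It is not GHC's
 decoder: the names `Status` and `classify` are made up for illustration,
 and the byte ranges follow the strict modern definition of UTF-8 (RFC
 3629), under which 0xC0, 0xC1 and 0xF5..0xFF are rejected as lead bytes
 outright.  A more lenient decoder would wait for one more byte after
 0xC0 or 0xC1, as described above.

```haskell
import Data.Word (Word8)

-- | What can be said about a byte prefix of one UTF-8 encoded character.
data Status
  = Complete      -- ^ the prefix is one whole, well-formed sequence
  | NeedMore Int  -- ^ well-formed so far; this many more bytes are required
  | Invalid       -- ^ no further input can make the prefix well-formed
  deriving (Eq, Show)

-- | Classify the first UTF-8 sequence in the given bytes.  Bytes after
-- the end of that first sequence are ignored.
classify :: [Word8] -> Status
classify [] = NeedMore 1
classify (b : rest)
  | b <= 0x7F = Complete          -- ASCII
  | b <= 0xC1 = Invalid           -- stray continuation byte, or overlong lead
  | b <= 0xDF = go 2 0x80 0xBF
  | b == 0xE0 = go 3 0xA0 0xBF    -- second byte restricted: no overlong forms
  | b == 0xED = go 3 0x80 0x9F    -- second byte restricted: no surrogates
  | b <= 0xEF = go 3 0x80 0xBF
  | b == 0xF0 = go 4 0x90 0xBF    -- second byte restricted: no overlong forms
  | b <= 0xF3 = go 4 0x80 0xBF
  | b == 0xF4 = go 4 0x80 0x8F    -- second byte restricted: max U+10FFFF
  | otherwise = Invalid           -- 0xF5..0xFF never occur in UTF-8
  where
    -- len is the total sequence length; (lo, hi) bounds the second byte.
    go :: Int -> Word8 -> Word8 -> Status
    go len lo hi = conts (len - 1) (lo, hi) (take (len - 1) rest)

    -- Check the continuation bytes seen so far, failing at the first bad one.
    conts :: Int -> (Word8, Word8) -> [Word8] -> Status
    conts 0 _ _ = Complete
    conts n _ [] = NeedMore n
    conts n (lo, hi) (c : cs)
      | c >= lo && c <= hi = conts (n - 1) (0x80, 0xBF) cs
      | otherwise          = Invalid
```

 With these rules `classify [0xE0, 0x00]` and `classify [0xF7, 0x97]` are
 both `Invalid` without reading anything further, while `classify [0xC3]`
 is `NeedMore 1`.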

 However, from a UI perspective it seems to me that when `NoBuffering` is
 set we should not be pausing to wait for more input.  Instead, when a
 partial byte sequence is read but `hReady` is false we should raise an
 error.  (This is the current behavior of Haskeline.)  Thus, when a user
 has the wrong LANG set and types a character they will immediately get
 feedback that something's off.
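
 A rough sketch of that policy, to show what I mean.  This is neither
 Haskeline's code nor a patch for the base library; `Action`, `decide`
 and `onPartialSequence` are hypothetical names.  Only `hGetBuffering`
 and `hReady` are real `System.IO` functions.

```haskell
import System.IO (BufferMode (..), Handle, hGetBuffering, hReady)

-- | What to do after reading an incomplete multi-byte sequence.
data Action = WaitForMore | RaiseError
  deriving (Eq, Show)

-- | The proposed rule: with NoBuffering, an incomplete sequence and no
-- further input ready is reported at once; otherwise keep reading.
decide :: BufferMode -> Bool -> Action
decide NoBuffering False = RaiseError
decide _           _     = WaitForMore

-- | Apply the rule to a handle on which a partial sequence was just read.
-- Note that hReady itself may fail with an EOF error at end of input;
-- a real implementation would have to treat that case as an error too.
onPartialSequence :: Handle -> IO Action
onPartialSequence h = do
  mode  <- hGetBuffering h
  ready <- hReady h
  return (decide mode ready)
```

 Keeping the decision in a pure function separates the policy from the
 handle queries, so the rule can be checked without a terminal.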

 > In this example:
 >
 > {{{
 > ghc -e "putStrLn \"\\249\"" | ./badchar
 > }}}
 >
 > bear in mind that ghc is using the locale encoding to output '\249', and
 > then decoding it again on input.  I think you're seeing the correct result
 > here.

 Yes, you're correct, sorry.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3341#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
