#7853: UTF encodings do not detect overlong forms
----------------------------------------+-----------------------------------
Reporter:  batterseapower               |          Owner:                  
    Type:  bug                          |         Status:  new             
Priority:  normal                       |      Component:  libraries/base  
 Version:  7.6.3                        |       Keywords:                  
      Os:  Unknown/Multiple             |   Architecture:  Unknown/Multiple
 Failure:  Incorrect result at runtime  |      Blockedby:                  
Blocking:                               |        Related:                  
----------------------------------------+-----------------------------------
 Overlong UTF-{8,16} sequences can have security implications
 (http://www.cl.cam.ac.uk/~mgk25/unicode.html). Decoders for these
 encodings should detect them and flag them as invalid characters. GHC's
 implementations of these decoders do not do so!
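
 As a rough illustration of the check being asked for (not GHC's actual
 decoder code), the sketch below decodes a single two-byte UTF-8 sequence
 with and without an overlong test. The names decode2Naive and
 decode2Strict are hypothetical and exist only for this example. A
 two-byte sequence is overlong when the code point it encodes is below
 U+0080, since such a code point fits in one byte.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical helper: decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx)
-- checking only the bit patterns, with no overlong test.
decode2Naive :: Word8 -> Word8 -> Maybe Int
decode2Naive b0 b1
  | b0 .&. 0xE0 == 0xC0 && b1 .&. 0xC0 == 0x80 =
      Just (((fromIntegral b0 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b1 .&. 0x3F))
  | otherwise = Nothing

-- Hypothetical helper: same decoding, but reject results below U+0080,
-- which must be encoded as a single byte and are therefore overlong here.
decode2Strict :: Word8 -> Word8 -> Maybe Int
decode2Strict b0 b1 =
  case decode2Naive b0 b1 of
    Just cp | cp >= 0x80 -> Just cp
    _                    -> Nothing
```

 With these definitions, decode2Naive 0xC0 0xB1 yields Just 0x31 (the
 character '1'), while decode2Strict 0xC0 0xB1 yields Nothing. A valid
 sequence such as 0xC3 0xA9 decodes to Just 0xE9 in both.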

 This problem has additional implications for GHC: because we do not
 reject overlong sequences, trying to roundtrip 0xC0 0xB1 through
 UTF-8//ROUNDTRIP results in 0x31 rather than the original two bytes. The
 roundtripping fails because the overlong sequence is not flagged as
 invalid by the UTF-8 decoder, so the surrogate escape mechanism never
 gets a chance to work.
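
 The roundtrip failure can be reproduced in miniature. The sketch below is
 a simplified model, not the code in base: it handles only one- and
 two-byte sequences, and it assumes that an undecodable byte b >= 0x80 is
 escaped to the lone surrogate U+DC00 + b, which is my understanding of
 the //ROUNDTRIP scheme. All function names are hypothetical. The Bool
 argument to decodeRT switches the overlong check on or off.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Assumed escape scheme: an invalid byte b (>= 0x80) becomes U+DC00 + b.
escapeByte :: Word8 -> Char
escapeByte b = chr (0xDC00 + fromIntegral b)

-- Inverse of escapeByte for characters in U+DC80..U+DCFF.
unescapeChar :: Char -> Maybe Word8
unescapeChar c
  | n >= 0xDC80 && n <= 0xDCFF = Just (fromIntegral (n - 0xDC00))
  | otherwise                  = Nothing
  where n = ord c

-- Simplified decoder: ASCII and two-byte sequences only; any byte that
-- cannot be decoded is escaped. Pass True to reject overlong forms.
decodeRT :: Bool -> [Word8] -> String
decodeRT rejectOverlong = go
  where
    go [] = []
    go (b : rest)
      | b < 0x80 = chr (fromIntegral b) : go rest
    go (b0 : b1 : rest)
      | b0 .&. 0xE0 == 0xC0, b1 .&. 0xC0 == 0x80
      , let cp = ((fromIntegral b0 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b1 .&. 0x3F)
      , not rejectOverlong || cp >= 0x80
      = chr cp : go rest
    go (b : rest) = escapeByte b : go rest

-- Simplified encoder: escaped characters turn back into their raw byte.
encodeRT :: String -> [Word8]
encodeRT = concatMap enc
  where
    enc c
      | Just b <- unescapeChar c = [b]
      | n < 0x80  = [fromIntegral n]
      | n < 0x800 = [ 0xC0 .|. fromIntegral (n `shiftR` 6)
                    , 0x80 .|. fromIntegral (n .&. 0x3F) ]
      | otherwise = error "sketch handles only code points below U+0800"
      where n = ord c
```

 In this model, encodeRT (decodeRT False [0xC0, 0xB1]) gives [0x31],
 matching the behaviour reported above, whereas
 encodeRT (decodeRT True [0xC0, 0xB1]) gives back [0xC0, 0xB1] because
 both bytes are escaped individually and then restored.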

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7853>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
ghc-tickets mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/ghc-tickets
