When decoding bytes to Unicode using the "latin1" scheme, there are three options for handling bytes that are not defined in the ISO-8859-1 standard.
1) Throw an error.
2) Insert the replacement character (U+FFFD), indicating an unknown character.
3) Insert the Unicode character with the same numeric value as the byte. With this option, even completely random bytes always decode successfully.

Python currently implements option three. Is this correct? Python does have an option to raise errors or insert replacements when an encoding leaves some bytes undefined, but its latin1 codec defines a character for all 256 byte values, so the option has no effect there. Restated: are the first 256 code points of Unicode intended to be exactly compatible with a latin1 codec? That would imply that Unicode has inserted character definitions into the ISO-8859-1 standard.
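For reference, the behavior described above can be reproduced with a short sketch. It assumes CPython 3 and its built-in "latin-1" and "ascii" codecs; the variable names are only illustrative, and the ascii line is included purely as a contrast with a codec that does leave bytes undefined.

```python
# All 256 possible byte values, in order.
all_bytes = bytes(range(256))

# errors="strict" would raise UnicodeDecodeError if any byte were undefined
# in the codec. With latin-1 this call completes without raising.
decoded = all_bytes.decode("latin-1", errors="strict")

# Each byte maps to the code point with the same numeric value (option 3).
same_values = [ord(ch) for ch in decoded] == list(range(256))

# The error handler has nothing to act on, so "replace" gives the same result.
replace_is_noop = all_bytes.decode("latin-1", errors="replace") == decoded

# Contrast: ascii leaves bytes >= 0x80 undefined, so the handler is used
# and the byte becomes the replacement character U+FFFD.
ascii_replaced = b"\x80".decode("ascii", errors="replace")

print(same_values, replace_is_noop, ascii_replaced == "\ufffd")
```

So the errors option does work in general; it simply never triggers for latin1 because that codec has no undefined bytes.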