On 1/5/2016 8:26 AM, Markus Scherer wrote:
I would specify that UTF-8 must be used, without mapping.
US-ASCII is a proper subset, so need not be mentioned explicitly, nor distinguished in the protocol. Mappings would require that all implementations carry relevant data, and are up to date to recent versions of Unicode, or else previously-unassigned code points will cause failures. As long as a user types the same password the same way, or with IMEs that produce the same output, they are fine. Strange variants might improve password security.

Right.

In PRECIS, UTF-8 is enforced. However, as you point out, "strange variants" exist, as do different IMEs and different keyboard/keystroke combinations. A case in point: 0xFF is never a valid octet in UTF-8. Nothing, however, prevents the underlying technology from using 0xFF in a password, so there should be a way for a user (or process) to force specific octet strings as input. That is why the "password-mapping" parameter is proposed as a hint rather than a strict rule.
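
To make the "hint, not rule" idea concrete, here is a rough Python sketch. The function names and the hint values "utf8" and "raw-hex" are invented for illustration only; nothing here is taken from the draft or from PRECIS.

```python
def password_octets(password, mapping_hint="utf8"):
    """Return the octet string that would be fed to the key derivation.

    mapping_hint is a hypothetical stand-in for the proposed
    "password-mapping" parameter; its values are made up for this sketch.
    """
    if mapping_hint == "utf8":
        # No normalization or case mapping: encode exactly what was typed.
        return password.encode("utf-8")
    if mapping_hint == "raw-hex":
        # The caller forces specific octets, e.g. "ff00" -> b"\xff\x00".
        return bytes.fromhex(password)
    raise ValueError("unknown mapping hint: " + mapping_hint)


def is_valid_utf8(octets):
    """True if the octets decode as strict UTF-8."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

With this, is_valid_utf8(b"\xff") is False, yet password_octets("ff00", "raw-hex") still lets a process supply that octet. Note also that password_octets("\u00e9") and password_octets("e\u0301") differ, which is the "types the same password the same way" condition above.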

Also, as pointed out, PKCS #8 encrypted blobs are used within PKCS #12, which has its own Unicode mapping (based on big-endian UTF-16, i.e. a BMPString with a NULL terminator).
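
For comparison, a small Python sketch of how the same password becomes different octets under the two conventions. As far as I can tell from RFC 7292 Appendix B.1, the PKCS #12 form is a BMPString: two octets per character, most significant octet first, followed by a two-octet NULL terminator. Treat that reading, and the function name, as assumptions to check against the RFC.

```python
def pkcs12_password_octets(password):
    """Encode a password the way PKCS #12 appears to expect (assumed
    from RFC 7292 Appendix B.1): big-endian, two octets per character,
    with a trailing 00 00 terminator."""
    # A BMPString cannot represent characters outside the Basic
    # Multilingual Plane, so reject them rather than emit surrogates.
    if any(ord(ch) > 0xFFFF for ch in password):
        raise ValueError("character outside the BMP")
    return password.encode("utf-16-be") + b"\x00\x00"
```

For the password "ab", UTF-8 gives the octets 61 62, while the function above gives 00 61 00 62 00 00, so the two containers would derive different keys from the same typed password unless the mapping is known.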

Sean
