I would specify that UTF-8 must be used, without any mapping or normalization. US-ASCII is a proper subset of UTF-8, so it need not be mentioned explicitly, nor distinguished in the protocol. Mappings would require that every implementation carry the relevant data tables and keep them up to date with recent versions of Unicode, or else previously unassigned code points will cause failures. As long as a user types the same password the same way, or uses IMEs that produce the same output, they are fine. Unusual variants might even improve password security.
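
To make the trade-off concrete, here is a minimal sketch in Python of what "UTF-8, no mapping" would mean in practice. The helper name encode_password is hypothetical and only for illustration; it is not taken from any protocol draft. The sketch shows that an ASCII password yields the same bytes either way, and that composed and decomposed forms of the same visible text yield different bytes when nothing is mapped.

```python
import unicodedata


def encode_password(password: str) -> bytes:
    """Hypothetical helper: encode as UTF-8, with no mapping or normalization."""
    return password.encode("utf-8")


# US-ASCII is a subset of UTF-8: the bytes are identical under both encodings.
ascii_pw = "hunter2"
assert encode_password(ascii_pw) == ascii_pw.encode("ascii")

# The same visible text can be typed as different code point sequences.
composed = "caf\u00e9"     # 'e with acute' as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# A normalization step (not proposed here) would make them equal...
assert unicodedata.normalize("NFC", decomposed) == composed

# ...but without mapping they are different byte strings, so they count
# as different passwords.
assert encode_password(composed) != encode_password(decomposed)
```

Note that unicodedata.normalize depends on the Unicode tables shipped with the Python build in use, which is exactly the kind of versioned data an implementation would have to carry if mapping were required.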
markus

