I assume that the JSON spec deliberately allows anything that Java and JavaScript allow. In particular, there is no requirement for a Java String or JavaScript string to contain "text", or well-formed UTF-16, or only assigned characters. Some code stores binary data (a sequence of arbitrary 16-bit unsigned integers) in a "string", simply because that is easy and fairly efficient to transport.
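
To make that concrete, here is a minimal Java sketch (my own illustration, not anything the JSON spec prescribes). The class and method names `Utf16Check` and `isWellFormedUtf16` are made up for this example; the `Character` and `Arrays` calls are standard JDK methods. It shows a String carrying arbitrary 16-bit values unchanged, and a check that tells such a payload apart from well-formed UTF-16.

```java
import java.util.Arrays;

final class Utf16Check {
    // Hypothetical helper: returns true only if every surrogate code unit
    // in s belongs to a high-surrogate + low-surrogate pair.
    static boolean isWellFormedUtf16(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed directly by a low surrogate.
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // skip the low half of the pair
                } else {
                    return false;
                }
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Arbitrary 16-bit values, including a lone surrogate and U+FFFF.
        char[] payload = {0xD800, 0x0000, 0xFFFF, 0x0041};
        String s = new String(payload);

        // The String keeps the code units as-is; nothing is validated or replaced.
        System.out.println(Arrays.equals(payload, s.toCharArray()));
        System.out.println(isWellFormedUtf16(s));
        System.out.println(isWellFormedUtf16("\uD83D\uDE00"));
    }
}
```

Note that the check only looks at surrogate pairing. It says nothing about whether the code points are assigned, which would be a separate and version-dependent question.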
You should "validate" *text* only when you are certain that it is indeed text. And when you do validate, you might want to be narrower than "assigned character"; for example, you might require Unicode identifiers or XML NMTOKENS or whatever. Also remember that "assigned" and "identifier" and such depend on the version of Unicode your library currently implements.

markus

