I already use something like String.isValid in a transcoder that converts
Strings to and from UTF-8, HTML form data (MIME type
application/x-www-form-urlencoded, which is not the same as URI encoding!),
and Base64.

Admittedly, these currently rely on 'encodeURI' to do the checking (it
rejects unpaired surrogates), or the check just drops out naturally while
generating the UTF-8 sequences.
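
As a rough sketch of the encodeURI route: encodeURI throws a URIError when it
meets a lone surrogate, so a validity check can be built on top of it. The
helper name isValidViaEncodeURI is hypothetical, not an existing API.

```javascript
// Hypothetical helper (not a built-in): reports whether every surrogate in
// `str` is correctly paired, by letting encodeURI do the checking.
function isValidViaEncodeURI(str) {
  try {
    encodeURI(str);       // throws URIError on an unpaired surrogate
    return true;
  } catch (e) {
    if (e instanceof URIError) {
      return false;       // malformed UTF-16
    }
    throw e;              // anything else is unexpected; do not swallow it
  }
}
```

This does more work than strictly needed, since the encoded result is built
and then thrown away, but it needs no knowledge of the surrogate ranges.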

(I considered testing the regexp
/^(?:[\u0000-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF])*$/
against the input string.)
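
A minimal sketch of that regexp test, wrapped in a function. The names
WELL_FORMED_UTF16 and isValidUTF16 are made up for illustration.

```javascript
// Matches a string made only of non-surrogate code units and
// high-surrogate + low-surrogate pairs. No `u` flag, so the pattern is
// applied to individual UTF-16 code units.
var WELL_FORMED_UTF16 =
  /^(?:[\u0000-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF])*$/;

// Hypothetical helper: true when surrogates are used correctly in `str`.
function isValidUTF16(str) {
  return WELL_FORMED_UTF16.test(str);
}

isValidUTF16("\uD83D\uDE00");  // high surrogate followed by low surrogate
isValidUTF16("\uDE00\uD83D");  // same code units in the wrong order
```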

Maybe the function is too obscure for general use, although its presence does flag up the surrogate-pair issue to developers.

--------------------------------------------------
From: "Norbert Lindenberg"

It's easy to provide this function, but in which situations would it be
useful? In most cases that I can think of you're interested in far more
constrained definitions of validity:
- what are valid ECMAScript identifiers?
- what are valid BCP 47 language tags?
- what are the characters allowed in a certain protocol?
- what are the characters that my browser can render?

Thanks,
Norbert


On Mar 24, 2012, at 12:12 , David Herman wrote:

On Mar 23, 2012, at 11:45 AM, Roger Andrews wrote:

Concerning UTF-16 surrogate pairs, how about a function like:
    String.isValid( str )
to discover whether surrogates are used correctly in 'str'?

Something like Array.isArray().

No need for it to be a class method, since it only operates on strings.
We could simply have String.prototype.isValid(). Note that it would work
for primitive strings as well, thanks to JS's automatic promotion
semantics.
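
To illustrate the promotion point, here is a sketch of what a prototype
method could look like. isValid is only the name proposed in this thread,
not a standard method, and the regexp-based body is just one possible way to
implement it.

```javascript
// Hypothetical String.prototype.isValid, defined as non-enumerable so it
// does not show up in for-in loops.
Object.defineProperty(String.prototype, "isValid", {
  value: function () {
    // `this` may be a String wrapper object; String(this) yields the
    // primitive string value in either case.
    return /^(?:[\u0000-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF])*$/
      .test(String(this));
  },
  writable: true,
  configurable: true,
  enumerable: false
});

// Callable on a primitive string as well as on a String object.
var onPrimitive = "abc".isValid();
var onObject = new String("abc").isValid();
```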

Dave


_______________________________________________
es-discuss mailing list
https://mail.mozilla.org/listinfo/es-discuss
