The definition of UTF-32 (and the modifications to UTF-8 for Unicode 3.1) makes it clear that conformant processes shall not generate irregular sequences. However, it does not (and perhaps it shouldn't) say what a process should do when it encounters an irregular sequence, and I'm curious what people are doing in practice.
One could apply the traditional Internet aphorism of being liberal in what one accepts, but that didn't pan out so well for non-shortest-form UTF-8. So in addition to wondering what people are doing in practice, I'm also curious about the following theoretical issue. It doesn't seem very likely to me that someone would write a security check that depends on, say, passing Deseret code points but blocking musical-notation code points; however, I wouldn't say it's impossible, and moreover a security check that wants to disallow all non-BMP characters doesn't seem quite so outlandish. If someone did write such a check, it seems to me that the attack described in UAX #27 would apply, substituting "irregular sequence" for "non-shortest form":

Process A performs security checks, but does not check for irregular sequences. Process B accepts the byte sequence from process A and transforms it into UTF-16 while interpreting irregular sequences. The UTF-16 text may then contain characters that should have been filtered out by process A. (A concrete sketch of this is below.)

Even if I'm mistaken about this, is there a specific argument *for* accepting irregular sequences? --deh!
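
To make the scenario concrete, here is a minimal sketch in Python -- my own illustration, not anybody's real filter, and Python's "surrogatepass" error handler is used only to mimic a process that interprets irregular sequences. Process A tries to block non-BMP characters by rejecting 4-byte UTF-8 lead bytes, but it never sees one: the irregular six-byte encoding of U+1D11E MUSICAL SYMBOL G CLEF slips through, and process B's lenient conversion to UTF-16 recovers exactly the character the check was meant to block.

# Hypothetical sketch: process A's filter and process B's converter are
# stand-ins, not real implementations; "surrogatepass" is used only to
# mimic a decoder that interprets irregular sequences.

def process_a_filter(data: bytes) -> bytes:
    """Naive check: block non-BMP characters by rejecting any byte in
    0xF0-0xF4, the lead bytes of well-formed 4-byte UTF-8 sequences."""
    if any(0xF0 <= b <= 0xF4 for b in data):
        raise ValueError("non-BMP character rejected")
    return data

def process_b_convert(data: bytes) -> str:
    """Lenient converter: decodes the bytes while interpreting irregular
    sequences, i.e. a surrogate pair encoded as two 3-byte sequences is
    turned into the supplementary character it designates."""
    with_surrogates = data.decode("utf-8", "surrogatepass")
    # Re-encoding to UTF-16 and decoding again pairs the surrogates up.
    return with_surrogates.encode("utf-16", "surrogatepass").decode("utf-16")

# U+1D11E MUSICAL SYMBOL G CLEF
#   regular UTF-8:  F0 9D 84 9E         -- caught by process A
#   irregular form: ED A0 B4 ED B4 9E   -- surrogates D834 DD1E, each
#                                          encoded as a 3-byte sequence
irregular = b"\xed\xa0\xb4\xed\xb4\x9e"

checked = process_a_filter(irregular)   # passes the "no non-BMP" check
print(process_b_convert(checked))       # prints U+1D11E anyway

The byte values above just follow from the UTF-8 and UTF-16 definitions; the point is only that the two processes disagree about what the sequence means.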

