@Araq 19:17:16: That makes the distinction between rule 9 ('underscores are
removed') and a part of rule 11 ('en-dashes are not removed but ignored')
irrelevant indeed. Most of my rules are based on the behaviour of the
executables or the compilability of the source, however.
@Araq 19:26:26: Those four rules don't explain everything. I'm not quite sure
what you mean by your last paragraph. In general, different representations of
essentially the same character don't make life easier when you want to search
through the source for occurrences of them, but that topic has been discussed
elsewhere. I was just overwhelmed by the complexity of the whole thing, that's
all. Why are `{`}!_:!{`}` and `{`}!:_!{`}` okay, but `{`}!_:_!{`}` is not?
It makes me curious about the underlying mechanisms, in case I want to play
with it. In most cases, letters and digits would suffice for me, and I would
certainly avoid the pathological ones like this example.