Your newly found edge cases have all been fixed, thanks. And the 4 rules that I gave still apply. ;)
PS: I really dislike this em-dash special-casing and might remove it from the language again. I never understood why fonts cannot be patched instead so that the underscore looks more like a dash...
