> I think ConTeXt already does this (or parts of it). I also wrote normalisation code as part of my Google Summer of Code project in 2008 (independently of Hans); it's really very short. The necessary Unicode data is taken from ConTeXt's char-def.lua, so you can extract the 200-300 lines of Lua code from that project and pair them with a more recent char-def.lua from the ConTeXt distribution.
See http://code.google.com/p/google-summer-of-code-2008-tex/downloads/list

Arthur
