... If I had to do that I would need to use a complete implementation found in ICU (but ICU is quite large for some projects).
ICU carries a lot of data and a good chunk of code because it supports many features, locales, and codepage conversions. After the conversion tables, the collation tailorings are the second-largest set of data.
However, you can make ICU smaller by dropping features and/or data: http://oss.software.ibm.com/icu/userguide/packaging.html
About Japanese collation, I forgot to point out that we have an online demo: http://oss.software.ibm.com/cgi-bin/icu/lx/en_US/utf-8/?_=ja&EXPLORE_CollationElements=
markus

