> > That's an amazing number of changes, even when you ignore name changes.
>
DM, for your reference, I created another diff, Unicode 4.0->5.1, showing what will happen with JDK7, here: http://people.apache.org/~rmuir/unicodeDiff2.txt

The problem is that, as a search engine library, Lucene cares about properties and other semantics of characters, and those will change across Unicode versions. So if we leave this up to the JDK, then when they upgrade Unicode it breaks back compat.

Databases and other things don't much care about these properties; it's just UTF-8 bytes for the most part, so it doesn't matter for them.

-- 
Robert Muir
rcm...@gmail.com
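
To make the dependency concrete, here is a minimal sketch (not Lucene code) that dumps the properties the running JDK reports for a code point. The class name `CharPropsSnapshot` and the method `describe` are hypothetical names made up for illustration; the only real API used is `java.lang.Character`. Running it on two JDKs with different Unicode versions and diffing the output would show the kind of drift described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: not part of Lucene or the JDK.
class CharPropsSnapshot {

    // Collects the properties of one code point as reported by the running JDK.
    // The values come from the JDK's own Unicode tables, so they can differ
    // between JDK releases that ship different Unicode versions.
    static Map<String, Object> describe(int codePoint) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("defined", Character.isDefined(codePoint));
        props.put("type", Character.getType(codePoint));
        props.put("letter", Character.isLetter(codePoint));
        props.put("digit", Character.isDigit(codePoint));
        props.put("whitespace", Character.isWhitespace(codePoint));
        props.put("lowercase", Character.toLowerCase(codePoint));
        return props;
    }

    public static void main(String[] args) {
        // Code points are given as hex on the command line; default is U+0041.
        String[] inputs = args.length > 0 ? args : new String[] { "0041" };
        for (String hex : inputs) {
            int cp = Integer.parseInt(hex, 16);
            System.out.printf("U+%04X %s%n", cp, describe(cp));
        }
    }
}
```

An analyzer that tokenizes or lowercases via these calls inherits whatever answers the JDK gives, which is why an index built on one JDK may not match queries analyzed on another.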