On Wed, Dec 7, 2011 at 8:22 PM, Andi Vajda <[email protected]> wrote:

>> JavaError: java.lang.UnsupportedOperationException: This JRE does not have
>> support for Thai segmentation
>> Java stacktrace:
>> java.lang.UnsupportedOperationException: This JRE does not have support for
>> Thai segmentation
>>     at org.apache.lucene.analysis.th.ThaiWordFilter.<init>(ThaiWordFilter.java:85)
>>     at org.apache.lucene.analysis.th.ThaiAnalyzer.createComponents(ThaiAnalyzer.java:64)
>>     at org.apache.lucene.analysis.ReusableAnalyzerBase.tokenStream(ReusableAnalyzerBase.java:92)
>>
>
> That's a Java error. Your JVM doesn't do Thai. I didn't know this was
> possible.
>
> A patch to silence this could be written and is welcome. Not a new issue and
> not a release stopper, imho.
>
Hi Andi,

I added this check (I think a few releases back) when I found out that some
JVMs, such as IBM's, don't return a real Thai word breaker for the "th" Locale.
It could also be that even a Sun/Oracle JRE doesn't have support for this
(if it's not the "international" version).

http://www.oracle.com/technetwork/java/javase/locales-137662.html

There is a public boolean constant available if you want to check whether
it's working, ThaiWordFilter.DBBI_AVAILABLE:

  /**
   * True if the JRE supports a working dictionary-based breakiterator for Thai.
   * If this is false, this filter will not work at all!
   */
  public static final boolean DBBI_AVAILABLE;

In our unit tests for Thai we don't fail the test if this is false:

  assumeTrue("JRE does not support Thai dictionary-based BreakIterator",
      ThaiWordFilter.DBBI_AVAILABLE);

(though now that you brought it up, I see I missed adding this assume to one
of our tests... thanks)

--
lucidimagination.com
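
For anyone who wants to check a JRE without pulling in Lucene, here is a
minimal sketch of such a probe using only java.text.BreakIterator. The class
and method names are hypothetical, and the probe text and boundary offset are
my assumption of a reasonable check (a two-word Thai phrase with no space
between the words), not necessarily what ThaiWordFilter does internally.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper, not part of Lucene.
class ThaiBreakProbe {

    // Thai for "Thai language": a 4-character word followed by a
    // 3-character word, written without a space. Unicode escapes are used
    // so the source compiles regardless of file encoding.
    private static final String PROBE =
        "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";

    // Returns true if the JRE's word BreakIterator for the "th" locale
    // reports a boundary between the two words. A JRE without a
    // dictionary-based Thai breaker is assumed to treat the whole run of
    // Thai letters as a single word and so report no boundary there.
    static boolean thaiDictionaryBreakingAvailable() {
        BreakIterator words =
            BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        words.setText(PROBE);
        return words.isBoundary(4);
    }

    public static void main(String[] args) {
        System.out.println("Thai dictionary-based word breaking available: "
            + thaiDictionaryBreakingAvailable());
    }
}
```

The result depends on the JRE, so treat false as "skip Thai analysis here"
rather than as an error.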
