Hi,
I could not find detailed information on supporting full-text search for
double-byte languages such as CJK (Chinese, Japanese and Korean).
1) Does anybody know of a library that provides this? and
2) How do I configure it in Jackrabbit? Should I replace all the extractors in
the current configuration:
<SearchIndex .....
  <param name="textFilterClasses"
         value="org.apache.jackrabbit.extractor.PlainTextExtractor,
                org.apache.jackrabbit.extractor.MsWordTextExtractor,
                org.apache.jackrabbit.extractor.MsExcelTextExtractor,
                org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
                org.apache.jackrabbit.extractor.PdfTextExtractor,
                org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
                org.apache.jackrabbit.extractor.RTFTextExtractor,
                org.apache.jackrabbit.extractor.HTMLTextExtractor,
                org.apache.jackrabbit.extractor.XMLTextExtractor" />
</SearchIndex>
rgds,
canal