Yes, Lucene can handle that. You have to choose which stemmer to use for each language, and you may have to improve the German and Danish stemmers a little.
You may also run into a weighting problem. If Danish is 5% of your corpus, then words that occur in 100% of your Danish documents will tend to get weights that are too high, because they occur in only 5% of your documents overall. Any term that occurs in more than 20% of a sub-corpus should generally be discarded from your query. This can be difficult in multilingual situations. For a first pass, however, I would ignore this issue.

On Mon, May 11, 2009 at 4:07 AM, uday kumar maddigatla <u...@mach.com> wrote:

> what if my database data contains other language (like danish, german).
>
> Is Lucene will handle that .
>
> If yes How?
>

--
Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
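
P.S. To make the 20% rule from the reply concrete, here is a minimal sketch in plain Python. It is not Lucene code. The function names, the (language, tokens) input shape, and the plain log(N/df) weight are all assumptions made for illustration; a real scorer may use a different formula. It counts document frequency separately for each language sub-corpus, drops query terms that occur in more than 20% of that sub-corpus, and shows why a word found in every Danish document can still look rare when measured against the whole collection.

```python
import math
from collections import Counter, defaultdict


def subcorpus_doc_freq(docs):
    """Count documents and per-term document frequency for each language.

    docs is an iterable of (language, tokens) pairs. This input shape is a
    hypothetical choice for the sketch, not anything Lucene prescribes.
    """
    n_docs = Counter()
    df = defaultdict(Counter)
    for lang, tokens in docs:
        n_docs[lang] += 1
        # set() so a term is counted once per document, however often it repeats
        df[lang].update(set(tokens))
    return n_docs, df


def filter_query(terms, lang, n_docs, df, max_ratio=0.20):
    """Keep only terms found in at most max_ratio of the language's documents."""
    total = n_docs[lang]
    if total == 0:
        # Unknown language: nothing to measure against, so keep every term.
        return list(terms)
    return [t for t in terms if df[lang][t] / total <= max_ratio]


def global_idf(term, n_docs, df):
    """Plain log(N / df) over the whole collection (an illustrative weight)."""
    total = sum(n_docs.values())
    hits = sum(counts[term] for counts in df.values())
    if hits == 0:
        return None  # the term never occurs, so no weight is defined here
    return math.log(total / hits)


# Toy corpus: 95 English documents and 5 Danish ones (Danish is 5% of the total).
docs = [("en", ["the", "report"]) for _ in range(95)]
docs += [("da", ["og", "fisk"] if i == 0 else ["og"]) for i in range(5)]

n_docs, df = subcorpus_doc_freq(docs)
print(global_idf("og", n_docs, df))   # large: "og" is in only 5 of 100 documents
print(global_idf("the", n_docs, df))  # small: "the" is in 95 of 100 documents
print(filter_query(["og", "fisk"], "da", n_docs, df))
```

In this toy corpus "og" appears in all five Danish documents, so it is dropped from a Danish query, while "fisk" appears in one of five (exactly 20%) and is kept. The hard part in practice is knowing which sub-corpus a query belongs to, which is why this is awkward in multilingual collections.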