Would it be ok to add an extra addDocument method to IndexWriter that would take an analyzer in addition to the document?
I am going to be indexing documents for multiple languages and I would prefer to not have to reopen a writer for each document that we are going to index.
I took a look at the code; the change looks pretty straightforward, and it doesn't look like it would break anything.
I had the same problem, but I came up with a workaround which might be helpful to you. I wrote a facade analyzer that wraps several language-specific analyzers, and I select the appropriate one just before I call addDocument. Something like:
SwitchLangAnalyzer sla = new SwitchLangAnalyzer(new Analyzer[] {new GermanAnalyzer(), new RussianAnalyzer(), new SwedishAnalyzer()});
IndexWriter iw = new IndexWriter(dir, sla, true);
// add German doc
sla.select(0);
iw.addDocument(doc);
// add Russian doc
sla.select(1);
iw.addDocument(doc);
..and so on...
You need to be extra careful, though, about how you use such an index afterwards, especially if you use stemming or stop words. I also store a "lang" field, which I use to limit the search to documents in a given language, and I use the same sub-analyzer for queries.
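For anyone who wants to see the shape of such a facade, here is a minimal sketch of the idea. It is not my actual class. To keep it runnable without Lucene on the classpath, it uses a made-up one-method interface, SimpleAnalyzer, in place of Lucene's Analyzer, and a toy StopWordAnalyzer in place of the real language analyzers. Both of those names are assumptions for illustration only; a real version would extend Lucene's Analyzer and forward its token stream method to the selected delegate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for Lucene's Analyzer, reduced to a single
// method so that this sketch compiles and runs on its own.
interface SimpleAnalyzer {
    List<String> analyze(String text);
}

// Toy analyzer: lowercases, splits on whitespace, drops stop words.
// It only exists to make the switching behaviour visible.
class StopWordAnalyzer implements SimpleAnalyzer {
    private final Set<String> stopWords;

    StopWordAnalyzer(String... stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    @Override
    public List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}

// The facade: holds several analyzers and forwards every call to
// whichever one was selected last.
class SwitchLangAnalyzer implements SimpleAnalyzer {
    private final SimpleAnalyzer[] delegates;
    private int current = 0;

    SwitchLangAnalyzer(SimpleAnalyzer[] delegates) {
        if (delegates.length == 0) {
            throw new IllegalArgumentException("need at least one analyzer");
        }
        this.delegates = delegates.clone();
    }

    // Choose the delegate used by all subsequent analyze() calls.
    void select(int index) {
        if (index < 0 || index >= delegates.length) {
            throw new IndexOutOfBoundsException("no analyzer at index " + index);
        }
        current = index;
    }

    @Override
    public List<String> analyze(String text) {
        return delegates[current].analyze(text);
    }
}
```

Note that the selection is mutable state shared by everything that uses the facade, so in this form the select-then-add sequence has to be done from a single thread or guarded by a lock.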
-- Best regards, Andrzej Bialecki
------------------------------------------------- Software Architect, System Integration Specialist CEN/ISSS EC Workshop, ECIMF project chair EU FP6 E-Commerce Expert/Evaluator ------------------------------------------------- FreeBSD developer (http://www.freebsd.org)
