Hi, I need to index a Wikipedia dump. I know there is code in contrib/benchmark for indexing *English* Wikipedia for benchmarking purposes. However, I'd like to index a non-English dump, and I don't actually need it for benchmarking; I just want to end up with a Lucene index.
Any suggestions on where I should start? That is, can anything in contrib/benchmark already do this, or is there anything there that I should use as a starting point, as opposed to writing my own Wikipedia XML dump parser+indexer?

Thanks,
Otis