Yep, we replaced the JavaCC-generated tokenizer with our own home-grown tokenizer. I think we gained almost 100% in indexing speed, because our documents are rather large.
Rajive

--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I decided to run a little Lucene app that does some indexing under a
> profiler. (I used JMP, http://www.khelekore.org/jmp/, a rather simple one).
>
> The app uses StandardAnalyzer.
> I've noticed that a lot of time is spent in StandardTokenizer and
> various JavaCC-generated methods.
> I am wondering if anyone tried replacing StandardTokenizer.jj with
> something more efficient?
>
> Also, StopFilter is using a Hashtable to store the list of stop words.
> Has anyone tried using HashMap instead?
>
> Thanks,
> Otis
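
For readers following the thread, here is a rough sketch of what a hand-written tokenizer with an unsynchronized stop-word lookup might look like. It is not the tokenizer mentioned in the reply, and it does not use or reproduce Lucene's API: the class name SimpleTokenizer, the rule of splitting on anything that is not a letter or digit, and the use of HashSet (standing in for the HashMap idea from the question) are all assumptions made for illustration. It is also far simpler than what StandardTokenizer recognizes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical hand-written tokenizer; not Lucene's API and not the code from this thread. */
final class SimpleTokenizer {
    // Unsynchronized set for stop-word lookups, in place of a synchronized Hashtable.
    private final Set<String> stopWords;

    SimpleTokenizer(Set<String> stopWords) {
        this.stopWords = new HashSet<>(stopWords);
    }

    /** Splits text into lowercased runs of letters and digits, skipping stop words. */
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // index where the current token began, or -1 when between tokens
        // Run one position past the end so a token at the end of the text is flushed.
        for (int i = 0; i <= text.length(); i++) {
            boolean inToken = i < text.length() && Character.isLetterOrDigit(text.charAt(i));
            if (inToken) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                String token = text.substring(start, i).toLowerCase(Locale.ROOT);
                if (!stopWords.contains(token)) {
                    tokens.add(token);
                }
                start = -1;
            }
        }
        return tokens;
    }
}
```

A single pass over the characters with no generated parser machinery is the kind of change that could plausibly help on large documents, but any actual gain would have to be measured under a profiler against the real analyzer.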
