Yep, we replaced JavaCC with our home-grown tokenizer.
I think indexing got almost 100% faster, largely
because our documents are rather large.
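
For illustration only, and not our actual code: a
hand-written tokenizer of this kind can be quite small.
The sketch below assumes a hypothetical SimpleTokenizer
class whose only rule is "a token is a run of letters or
digits, lowercased". StandardTokenizer recognizes more
token types than this, so treat it as a starting point
rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written tokenizer; not Lucene's API and not the
// code described in this message.
final class SimpleTokenizer {
    private SimpleTokenizer() {}

    // Single pass over the input: a token is a maximal run of letters
    // or digits, lowercased as it is collected.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush a token that runs to the end of the input.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Calling SimpleTokenizer.tokenize(text) returns the tokens
in document order. Whether a loop like this beats the
generated code for a given corpus is something to check
under a profiler.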

Rajive

--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Hello,
> 
> I decided to run a little Lucene app that does some
> indexing under a
> profiler. (I used JMP,
> http://www.khelekore.org/jmp/, a rather simple
> one).
> 
> The app uses StandardAnalyzer.
> I've noticed that a lot of time is spent in
> StandardTokenizer and
> various JavaCC-generated methods.
> I am wondering if anyone has tried replacing
> StandardTokenizer.jj with
> something more efficient?
> 
> Also, StopFilter is using a Hashtable to store the
> list of stop words. 
> Has anyone tried using HashMap instead?
> 
> Thanks,
> Otis
> 
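
On the Hashtable question: Hashtable synchronizes every
call, while HashMap and HashSet do not, so for a stop list
that is built once and then only read, an unsynchronized
set is the natural thing to try. A minimal sketch, assuming
a hypothetical StopWords class (not Lucene's StopFilter)
and with no measured numbers behind it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a stop-word filter; not Lucene's StopFilter.
final class StopWords {
    private final Set<String> stopWords;

    StopWords(String... words) {
        // HashSet is backed by an unsynchronized HashMap, so lookups
        // do not take a lock the way Hashtable.get() does.
        this.stopWords = new HashSet<>(Arrays.asList(words));
    }

    // Returns the tokens that are not in the stop list, in order.
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

The set is never modified after construction, which is
what makes dropping the synchronization reasonable here.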


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
