Re: Proposal: extracting term-level stats from query process

markharw00d Thu, 11 Mar 2004 15:29:21 -0800

I just re-ran the same tests, this time using SimpleAnalyzer (a lowercase filter only).


This time round the timings were:
Tokenizing: 5 ms avg per doc
Highlighting: 11 ms avg per doc
RAM indexing docs: 39 ms avg per doc

RAM indexing still looks to add more than I would like.

Having reviewed my previous choice of analyzer, the main offender in its list of 
filters looks to be "StandardTokenizer".
On its own it clocks up an avg of 73 ms per doc.
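
A rough sketch of how a per-stage average like this can be measured, for anyone wanting to compare stages themselves. It is not the harness behind the figures above. StageTimer, tokenize and avgMillisPerDoc are hypothetical names, and the tokenizer is a plain-Java stand-in for a lowercasing letter tokenizer, not a Lucene class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Lucene: times one processing stage over a set of docs.
final class StageTimer {

    // Stand-in tokenizer (assumption): splits on non-letters and lowercases each token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the token being built.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Runs the stage once per doc and returns the mean wall-clock time in milliseconds.
    static double avgMillisPerDoc(List<String> docs, Consumer<String> stage) {
        if (docs.isEmpty()) {
            throw new IllegalArgumentException("need at least one doc");
        }
        long start = System.nanoTime();
        for (String doc : docs) {
            stage.accept(doc);
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / docs.size();
    }
}
```

Calling StageTimer.avgMillisPerDoc(docs, doc -> StageTimer.tokenize(doc)) gives the average for that stage alone. Swapping in a different stage on the same docs shows where the time goes, though a few warm-up passes are worth running first so JIT compilation does not skew the result.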

To be honest, at first glance I don't know what it is trying to do - it's JavaCC 
generated code and it's not immediately obvious to me.
I do see it's using Vectors internally, and their synchronized methods are not going to help matters.

Cheers
Mark


