[jira] Commented: (LUCENE-937) Make CachingTokenFilter faster

Mark Miller (JIRA) Thu, 21 Jun 2007 18:03:46 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507080 ]


Mark Miller commented on LUCENE-937:
------------------------------------

My earlier tests must have gotten out of whack. I was measuring a much bigger
difference than I see now.

As a result, I started from scratch, carefully creating and labeling a new
Lucene core jar for each case and averaging the performance over 15,000 calls
that create and read TokenStreams off the Reuters data.

After very thorough testing (I was in quite a hurry this morning), I have come 
up with the following:

LinkedList() using get, LinkedList() using iterator, and ArrayList() are 
practically identical in speed.

ArrayList(30) gave a 47% increase in speed. Initial capacities above the 30-60
range gave no further improvement.
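
For reference, the variants compared above can be sketched roughly as follows.
This is only an illustration, not the code from the patch: plain strings stand
in for Lucene Tokens, and the class name, method names, and the stream length
of 40 are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch; none of these names come from the patch.
class TokenCacheSketch {

    // Fill a cache with n placeholder tokens (strings stand in for Tokens).
    static List<String> fill(List<String> cache, int n) {
        for (int i = 0; i < n; i++) {
            cache.add("token" + i);
        }
        return cache;
    }

    // Replay by index. ArrayList.get is constant time; LinkedList.get walks
    // from the nearer end of the list on every call.
    static int replayWithGet(List<String> cache) {
        int chars = 0;
        for (int i = 0; i < cache.size(); i++) {
            chars += cache.get(i).length();
        }
        return chars;
    }

    // Replay with an iterator. Each step is constant time on both list types.
    static int replayWithIterator(List<String> cache) {
        int chars = 0;
        for (Iterator<String> it = cache.iterator(); it.hasNext();) {
            chars += it.next().length();
        }
        return chars;
    }

    public static void main(String[] args) {
        int tokensPerStream = 40; // assumed stream length, for illustration

        List<String> linked = fill(new LinkedList<String>(), tokensPerStream);
        List<String> arrayDefault = fill(new ArrayList<String>(), tokensPerStream);
        List<String> arraySized = fill(new ArrayList<String>(30), tokensPerStream);

        System.out.println("LinkedList, get:      " + replayWithGet(linked));
        System.out.println("LinkedList, iterator: " + replayWithIterator(linked));
        System.out.println("ArrayList():          " + replayWithGet(arrayDefault));
        System.out.println("ArrayList(30):        " + replayWithGet(arraySized));
    }
}
```

All four variants hold the same tokens and replay them identically; only the
backing structure and the initial capacity differ. The sketch does no timing.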

This patch should not go through as is. What do you think given these results? 
I assumed that an ArrayList would be faster as all of the data is guaranteed 
contiguous, but it surprised me that the resizing was not enough to slow things 
down to LinkedList speed (unless you start with too low an initial size -- 
default is 10).
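
To make the resizing point concrete, here is a rough count of how often a
growable array would need to reallocate while filling up. It assumes the
backing array grows by about 1.5x each time it runs out of room; the exact
growth rule differs between JDK versions, so treat the rule, the class name,
and the stream length of 40 as assumptions rather than what ArrayList actually
does on any given JVM.

```java
// Hypothetical sketch; the 1.5x growth rule is an assumption.
class ResizeCountSketch {

    // Count the reallocations needed to append n elements one at a time,
    // starting from the given initial capacity.
    static int countResizes(int initialCapacity, int n) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 1; size <= n; size++) {
            if (size > capacity) {
                // Assumed growth rule: add half the current capacity, but
                // never end up smaller than the size that is needed.
                capacity = Math.max(capacity + (capacity >> 1), size);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        int tokensPerStream = 40; // assumed stream length, for illustration
        System.out.println("capacity 10: " + countResizes(10, tokensPerStream));
        System.out.println("capacity 30: " + countResizes(30, tokensPerStream));
        System.out.println("capacity 60: " + countResizes(60, tokensPerStream));
    }
}
```

Under that assumed rule, caching 40 tokens costs four reallocations from a
capacity of 10, one from a capacity of 30, and none from a capacity of 60.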

- Mark

> Make CachingTokenFilter faster
> ------------------------------
>
>                 Key: LUCENE-937
>                 URL: https://issues.apache.org/jira/browse/LUCENE-937
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: CachingTokenFilter.patch
>
>
> The wrong data structure was used for the CachingTokenFilter. It should be an 
> ArrayList rather than a LinkedList. There is a noticeable difference in speed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
