[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507080 ]
Mark Miller commented on LUCENE-937:
------------------------------------

My earlier tests must have gotten out of whack. I was measuring a much bigger difference than I see now. As a result, I started from scratch, carefully creating and labeling a new Lucene core jar for each case and averaging the performance over 15,000 calls creating and reading TokenStreams off the Reuters data.

After very thorough testing this time (I was in quite a hurry this morning), I have come up with the following:

LinkedList() using get, LinkedList() using iterator, and ArrayList() are practically identical in speed.

ArrayList(30) gave a 47% increase in speed. Initial sizes above 30-60 gave no further returns.

This patch should not go through as is. What do you think, given these results?

I assumed that an ArrayList would be faster because all of its data is guaranteed contiguous, but it surprised me that the resizing was not enough to slow things down to LinkedList speed (unless you start with too low an initial size; the default is 10).

- Mark

> Make CachingTokenFilter faster
> ------------------------------
>
>                 Key: LUCENE-937
>                 URL: https://issues.apache.org/jira/browse/LUCENE-937
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: CachingTokenFilter.patch
>
>
> The wrong data structure was used for the CachingTokenFilter. It should be an
> ArrayList rather than a LinkedList. There is a noticeable difference in speed.
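
For anyone who wants to try the comparison from the comment above, here is a minimal sketch, assuming a plain list of strings as a stand-in for the token cache. The class and method names (TokenCacheSketch, fillAndRead, timeNanos) and the token count of 40 are hypothetical; this is not the CachingTokenFilter code or the test harness used for the numbers above. It uses naive System.nanoTime() timing with no JIT warm-up control, so treat any output as a rough indication only.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a token cache: fill a list, then read it back.
class TokenCacheSketch {

    // Adds tokenCount strings to the cache, then iterates over them.
    // Returns the summed string lengths so the work cannot be optimized away.
    static long fillAndRead(List<String> cache, int tokenCount) {
        for (int i = 0; i < tokenCount; i++) {
            cache.add("token" + i);
        }
        long totalLength = 0;
        for (String token : cache) {
            totalLength += token.length();
        }
        return totalLength;
    }

    // Times `calls` fill-and-read rounds, each on a fresh list from the factory.
    static long timeNanos(Supplier<List<String>> factory, int calls, int tokenCount) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += fillAndRead(factory.get(), tokenCount);
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            // Never true in practice; keeps `sink` observably used.
            System.out.println(sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int calls = 15_000;   // same call count as in the comment
        int tokenCount = 40;  // assumed tokens per stream, chosen for illustration

        System.out.println("LinkedList():  " + timeNanos(() -> new LinkedList<>(), calls, tokenCount) + " ns");
        System.out.println("ArrayList():   " + timeNanos(() -> new ArrayList<>(), calls, tokenCount) + " ns");
        System.out.println("ArrayList(30): " + timeNanos(() -> new ArrayList<>(30), calls, tokenCount) + " ns");
    }
}
```

The only difference between the three cases is the list handed to fillAndRead, which mirrors the idea of swapping the backing list while leaving everything else unchanged. A harness such as JMH would be a more reliable way to measure this than the loop above.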