[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463729 ]
Chuck Williams commented on LUCENE-769:
---------------------------------------
The test case uses only tiny documents, and the reported timings for multiple
searches with FieldCache suggest that the version of Lucene used contains the
bug that caused FieldCaches to be recomputed unnecessarily often.
I suggest rerunning the test with much larger documents, of realistic size,
against current Lucene source. I'm sure the patch will make things much slower
with the current implementation. As Hoss suggests, performance would improve
considerably by using a FieldSelector to obtain just the sort field, but even
so it will be slow unless the sort field is arranged to be early in the
documents, ideally the first field, and a LOAD_AND_BREAK FieldSelector is used.
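To illustrate why field position matters, here is a toy model in plain Java. It does not use Lucene's classes: the Decision enum, the Selector interface, and the load() method are hypothetical stand-ins that only mimic the idea of a selector that can stop the reader early. It assumes stored fields are read sequentially, and it counts how many fields are visited before the sort field is found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SelectorSketch {
    // Hypothetical stand-in for the decisions a field selector can return.
    enum Decision { LOAD, NO_LOAD, LOAD_AND_BREAK }

    // Hypothetical stand-in for a field selector: one decision per field name.
    interface Selector { Decision accept(String fieldName); }

    // Number of stored fields visited by the most recent load() call.
    static int visited;

    // Walks the stored fields in order, as a sequential reader would.
    // Each entry is a {name, value} pair.
    static List<String> load(List<String[]> storedFields, Selector selector) {
        List<String> loaded = new ArrayList<>();
        visited = 0;
        for (String[] field : storedFields) {
            visited++;
            Decision d = selector.accept(field[0]);
            if (d == Decision.NO_LOAD) {
                continue;               // skipped, but the reader still had to pass over it
            }
            loaded.add(field[1]);
            if (d == Decision.LOAD_AND_BREAK) {
                break;                  // stop reading the rest of the document
            }
        }
        return loaded;
    }

    // Selector that loads only the named field and then stops.
    static Selector onlyField(String name) {
        return f -> f.equals(name) ? Decision.LOAD_AND_BREAK : Decision.NO_LOAD;
    }

    public static void main(String[] args) {
        List<String[]> sortFieldFirst = Arrays.asList(
            new String[] {"sortKey", "2007-01-10"},
            new String[] {"title", "some title"},
            new String[] {"body", "a large body"});
        List<String[]> sortFieldLast = Arrays.asList(
            new String[] {"title", "some title"},
            new String[] {"body", "a large body"},
            new String[] {"sortKey", "2007-01-10"});

        load(sortFieldFirst, onlyField("sortKey"));
        System.out.println("sort field first, fields visited: " + visited);
        load(sortFieldLast, onlyField("sortKey"));
        System.out.println("sort field last, fields visited: " + visited);
    }
}
```

In this model the reader stops after one field when the sort field comes first, and walks all three fields when it comes last. With realistic documents the skipped fields are the large ones, which is where the cost would go.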
Another important performance variable will be the number of documents
retrieved in the test query. If the number of documents satisfying the query
is a sizable percentage of the total collection size, I'm pretty sure the patch
will be much slower than using FieldCache.
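As a rough back-of-the-envelope sketch of that trade-off, the following Java model compares the two approaches for a single sorted search. The unit costs are arbitrary placeholders that I made up for illustration, not measurements, and the method names are hypothetical. It assumes FieldCache pays once per document in the index to fill the cache, while the patch pays one stored-document read per hit.

```java
class SortCostSketch {
    // Placeholder unit costs, chosen only for illustration; not measured values.
    static final double COST_PER_CACHE_ENTRY = 1.0;  // filling one FieldCache slot
    static final double COST_PER_DOC_LOAD = 20.0;    // reading one stored document

    // FieldCache-style: fill one entry per document in the index.
    // Lookups for hits are then array reads, treated as free in this model.
    static double fieldCacheCost(long indexSize) {
        return indexSize * COST_PER_CACHE_ENTRY;
    }

    // Patch-style: read one stored document per hit in the result set.
    static double perHitLoadCost(long hits) {
        return hits * COST_PER_DOC_LOAD;
    }

    public static void main(String[] args) {
        long indexSize = 1_000_000L;
        // Hit fractions expressed in tenths of a percent to keep the arithmetic integral.
        int[] hitsPerThousand = {1, 10, 50, 250};
        for (int perThousand : hitsPerThousand) {
            long hits = indexSize * perThousand / 1000;
            System.out.println("hits=" + hits
                + " fieldCache=" + fieldCacheCost(indexSize)
                + " perHitLoad=" + perHitLoadCost(hits));
        }
    }
}
```

Under these assumed costs, per-hit loading is cheaper only while the hits are a small fraction of the index. The real crossover point depends on document size, field order, and how often the FieldCache has to be rebuilt, which is why the test should vary the number of matching documents.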
> [PATCH] Performance improvement for some cases of sorted search
> ---------------------------------------------------------------
>
> Key: LUCENE-769
> URL: https://issues.apache.org/jira/browse/LUCENE-769
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Artem Vasiliev
> Attachments: DocCachingSorting.patch, DocCachingSorting.patch
>
>
> It's a small addition to Lucene that significantly lowers memory consumption
> and improves performance for sorted searches in scenarios with frequent index
> updates and relatively big indexes (>1 million docs). This solution currently
> supports only single-field sorting (which seems to be quite a popular use
> case). Multiple-field support can be added without much trouble.
> The solution is this: documents from the sorting set (instead of the given
> field's values from the whole index, which is the current FieldCache approach)
> are cached in a WeakHashMap so the cached items are candidates for GC. Their
> field values are then fetched from the cache and compared while sorting.