[jira] Issue Comment Edited: (SOLR-1111) fix FieldCache usage in Solr

Yonik Seeley (JIRA) Thu, 16 Apr 2009 07:48:36 -0700

    [ https://issues.apache.org/jira/browse/SOLR-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698399#action_12698399 ]


Yonik Seeley edited comment on SOLR-1111 at 4/16/09 7:46 AM:
-------------------------------------------------------------

The major issue is that Lucene now creates scorers per-segment, and if you use 
Lucene's searcher.search(...,sort) then the FieldCache populations will also be 
per-segment.

The biggest issue:  If the FieldCache gets populated at both the top-level 
reader and per-segment, memory usage doubles (as does un-inversion time).
 - Faceting on single-valued fields uses the FieldCache at the top-level (and 
would be
   - This is non-trivial to change...  if we started counting per-segment, 
counts would somehow have to be merged across segments.
 - Sorting in Solr currently uses the FieldCache at the top level
   - This can't easily be changed to use Lucene's searcher.search(...,sort) 
since we are using a hit collector (which can be wrapped in a time limited 
collector).
 - Distributed search uses the top-level FieldCache to retrieve sort field 
values.
 - FunctionQuery now derives values at the segment level
   - This also applies to the function range query
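
To make the faceting point above concrete, here is a minimal sketch of what
"merging counts across segments" could look like.  It uses no Lucene or Solr
API: the class SegmentFacetMerge, the methods countSegment and merge, and the
arrays standing in for a segment's un-inverted field are all hypothetical.
Because each segment numbers its terms independently, the sketch assumes the
merge has to be keyed on the term value rather than on the per-segment index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SegmentFacetMerge {

    // Hypothetical stand-in for one segment's un-inverted field:
    // termsByOrd holds the segment's distinct terms in sorted order,
    // docToOrd maps each document in the segment to an index into termsByOrd.
    static Map<String, Integer> countSegment(String[] termsByOrd, int[] docToOrd) {
        int[] counts = new int[termsByOrd.length];
        for (int ord : docToOrd) {
            counts[ord]++;
        }
        Map<String, Integer> byTerm = new TreeMap<>();
        for (int ord = 0; ord < counts.length; ord++) {
            if (counts[ord] > 0) {
                byTerm.put(termsByOrd[ord], counts[ord]);
            }
        }
        return byTerm;
    }

    // Merge per-segment counts by term value, since the same term can sit
    // at a different index in each segment.
    static Map<String, Integer> merge(List<Map<String, Integer>> perSegment) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> segment : perSegment) {
            for (Map.Entry<String, Integer> e : segment.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> segA =
            countSegment(new String[] {"apple", "cherry"}, new int[] {0, 1, 1});
        Map<String, Integer> segB =
            countSegment(new String[] {"banana", "cherry"}, new int[] {0, 0, 1});
        // prints {apple=1, banana=2, cherry=3}
        System.out.println(merge(Arrays.asList(segA, segB)));
    }
}
```

The extra term-keyed merge step is the cost that top-level counting avoids,
which is one way to read "non-trivial to change" above.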

Another issue for function query is the use of ord()... it won't be valid in 
multi-segment indexes if evaluated at the segment level.
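
A small illustration of why a segment-level ord() breaks down, again without
any Lucene or Solr API.  The class OrdPerSegment and its ord helper are
hypothetical, and the helper uses a 0-based position among sorted distinct
terms purely as this sketch's convention.  The point is only that a position
computed within one segment's term list is not comparable with a position
computed in another segment or at the top level.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class OrdPerSegment {

    // Position of value among the sorted distinct terms (0-based in this sketch).
    static int ord(SortedSet<String> terms, String value) {
        if (!terms.contains(value)) {
            throw new IllegalArgumentException("term not present: " + value);
        }
        // headSet(value) holds every term strictly less than value.
        return terms.headSet(value).size();
    }

    public static void main(String[] args) {
        SortedSet<String> segA = new TreeSet<>(Arrays.asList("apple", "cherry"));
        SortedSet<String> segB = new TreeSet<>(Arrays.asList("banana", "cherry"));

        // The top-level view sees the union of both segments' terms.
        SortedSet<String> topLevel = new TreeSet<>(segA);
        topLevel.addAll(segB);

        // "apple" and "banana" get the same per-segment position...
        System.out.println("segment A ord(apple)  = " + ord(segA, "apple"));
        System.out.println("segment B ord(banana) = " + ord(segB, "banana"));
        // ...but different positions in the top-level term list.
        System.out.println("top-level ord(apple)  = " + ord(topLevel, "apple"));
        System.out.println("top-level ord(banana) = " + ord(topLevel, "banana"));
    }
}
```

In this toy data two different values collide at the same per-segment
position, so any score or sort derived from that position would treat them as
equal even though they differ at the top level.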

We also need to evaluate custom sorters (like query elevation, etc.) to ensure 
that they still work at the segment level.  Solr doesn't currently do 
segment-level sorting like Lucene now does, but perhaps we should switch for 
better near-real-time support.


      was (Author: [email protected]):
    The major issue is that Lucene now creates scorers per-segment, and if you 
use Lucene's searcher.search(...,sort) then the FieldCache populations will 
also be per-segment.

The biggest issue:  If the FieldCache gets populated at both the top-level 
reader and per-segment, memory usage doubles (as does un-inversion time).
 - Faceting on single-valued fields uses the FieldCache at the top-level (and 
would be
   - This is non-trivial to change...  if we started counting per-segment, 
counts would somehow have to be merged across segments.
 - Sorting in Solr currently uses the FieldCache at the top level
   - This can't easily be changed to use Lucene's searcher.search(...,sort) 
since we are using a hit collector (which can be wrapped in a time limited 
collector).
 - Distributed search uses the top-level FieldCache to retrieve sort field 
values.
 - FunctionQuery now derives values at the segment level
   - This also applies to the function range query

Another issue for function query is the use of ord()... it won't be valid in 
multi-segment indexes if evaluated at the segment level.
  
> fix FieldCache usage in Solr
> ----------------------------
>
>                 Key: SOLR-1111
>                 URL: https://issues.apache.org/jira/browse/SOLR-1111
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Yonik Seeley
>             Fix For: 1.4
>
>
> Recent changes in Lucene have altered how the FieldCache is used and, as-is, 
> could lead to previously working Solr installations blowing up when they 
> upgrade to 1.4.  We need to fix, or document the effects of, these changes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
