I have an idea for a Solr feature, but to know whether it's at all
viable, I need an answer to a question about how Lucene operates.

In recent versions of Solr, if a field is not stored and not indexed,
but does have docValues, the value originally sent for that field will
be returned in search results.  In older versions (not sure which
ones) a field had to be stored in order to be returned.

Let's say that such a field contains a very large amount of data in
every document.  Normally, this would affect OS disk cache efficiency
for general queries, because the docValues data for the field would need
to be read in order to be included in search results.  Reading that
large amount of data can push more useful data out of the disk cache.
If the system is in a low-memory situation, that can hurt performance.

What happens if every query has an explicit list of fields to return in
results, and the list of fields does NOT include this field that
contains a large amount of data in docValues?  Does this mean that the
docValues data for the field I've mentioned is never read, and has no
effect on OS disk cache efficiency?  Or would Lucene read the docValues
data even though it doesn't include it in results?
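
To make the scenario concrete, here is a minimal sketch of the kind of
request I have in mind.  The field names are hypothetical; "bigfield"
is just a stand-in for the large docValues-only field, and it is
deliberately left out of the fl list.  The sketch only builds the
request parameters and says nothing about what Lucene does internally.

```python
from urllib.parse import urlencode

# Hypothetical field names, chosen only for illustration.
# "bigfield" (the large docValues-only field) is intentionally absent.
RETURNED_FIELDS = ["id", "title", "price"]


def build_select_params(query, fields=RETURNED_FIELDS, rows=10):
    """Build the query string for a Solr /select request with an explicit fl list."""
    return urlencode({"q": query, "fl": ",".join(fields), "rows": rows})


params = build_select_params("title:lucene")
print(params)
```

The resulting string would be appended to a /select URL on whatever
core or collection is being queried.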

Thanks,
Shawn

