[
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072281#comment-15072281
]
Yonik Seeley commented on SOLR-8220:
------------------------------------
bq. copyField target that isn't stored, isn't indexed, but has docValues for
faceting or sorting.
Going forward, why wouldn't one just use docValues on the original field?
Anyway, the copyField thing presents ambiguities for atomic updates as well...
it's not specific to "fl". Whatever we support for that can be used for "fl"
as well (for instance, determining that the copyField target isn't a "real"
field because the source is already stored/docValued or something).
It seems like going forward, people will benefit by adjusting their mental
model to "it's just a different way of storing... row stored or column stored."
To a new user, why would one not get all stored fields back? If a column-stored
option had been present in Lucene from the start, that's probably how it would
have been implemented in Solr from the start.
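One way to read the copyField suggestion above is as a schema-level check. The following is only a minimal sketch under stated assumptions: the function name `is_derived_copy_target`, the dict-based schema, and the `(source, dest)` rule tuples are all hypothetical illustrations, not Solr APIs, and the rule "every source is itself stored or docValued" is just one possible interpretation of "isn't a real field".

```python
def is_derived_copy_target(target, copy_rules, schema):
    """Return True if `target` is fed by copyField rules and every source
    feeding it is itself retrievable (stored or docValues).

    Hypothetical helper: `schema` maps field name -> {"stored": bool,
    "doc_values": bool}; `copy_rules` is a list of (source, dest) pairs.
    """
    sources = [src for src, dest in copy_rules if dest == target]
    if not sources:
        # Not a copyField target at all, so it is a "real" field.
        return False
    return all(
        schema[src]["stored"] or schema[src]["doc_values"] for src in sources
    )
```

Under this reading, a docValues-only sort field copied from a stored field would be treated as derived and could be skipped both for "fl" and when rebuilding a document for an atomic update.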
> Read field from docValues for non stored fields
> -----------------------------------------------
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
> Issue Type: Improvement
> Reporter: Keith Laban
> Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-8220-5x.patch, SOLR-8220-branch_5x.patch,
> SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, SOLR-8220-ishan.patch,
> SOLR-8220-ishan.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch
>
>
> Many times a value will be both stored="true" and docValues="true" which
> requires redundant data to be stored on disk. Since reading from docValues is
> both efficient and a common practice (facets, analytics, streaming, etc),
> reading values from docValues when a stored version of the field does not
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as
> they would always be returned sorted in the docValues approach. I believe
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think
> it should live closer to where stored fields are loaded in the
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values from docValues;
> facets, analytics, streaming, etc. all seem to do it their own way. Perhaps
> some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
> -- return field from docValue if the field is not stored and in docValues,
> if the field is stored return it from stored fields
> - fl="*"
> -- return only stored fields
> - fl="+"
> -- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first
> pass. 2b - is current behavior
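The proposed "fl" behavior in the issue description can be sketched as a small resolver. This is an illustration only, assuming a hypothetical `FieldInfo` record and `resolve_fl` function (neither is a Solr class or method), and it handles just the three cases listed in the description: a single field name, "*", and "+".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical stand-in for a schema field; not a Solr class."""
    name: str
    stored: bool
    doc_values: bool


def resolve_fl(fl, schema):
    """Map an fl value to {field name: "stored" | "docValues"}."""
    if fl == "*":
        # Proposed: "*" returns only stored fields.
        return {f.name: "stored" for f in schema if f.stored}
    if fl == "+":
        # Proposed: "+" returns stored fields plus docValues-only fields.
        result = {}
        for f in schema:
            if f.stored:
                result[f.name] = "stored"
            elif f.doc_values:
                result[f.name] = "docValues"
        return result
    # Explicit field name: prefer the stored value, else fall back to docValues.
    for f in schema:
        if f.name == fl:
            if f.stored:
                return {fl: "stored"}
            if f.doc_values:
                return {fl: "docValues"}
    return {}
```

For a field that is both stored and docValued, the sketch reads from stored fields, matching the "if the field is stored return it from stored fields" rule in the proposal.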