[
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068075#comment-15068075
]
Ishan Chattopadhyaya commented on SOLR-8220:
--------------------------------------------
bq. Is there a test which creates a new field with useDocValuesAsStored as true
and separately as false using the schema API?
I had added SchemaVersionSpecificBehaviorTest to test these various
true/false cases. However, it has no useDocValuesAsStored=false case that
checks the output. I'll add such a test.
bq. I'm assuming you will address Erick's concern above about multi-valued
fields.
I'm working through them. As far as I can see, the current loop with
values.getValueCount() and the loop Erick suggested behave identically,
i.e., values.getValueCount() is indeed returning the count of values
per document. But I am adding a test to prove it.
As for the {{DocValues.getDocsWithField(atomicReader, fieldName).get(docid)}}
check: without it, empty fields were being returned for documents that
weren't supposed to have a docValue (the user never added a docValue for that
document during indexing). Again, I think I should add a specific test for
that, testing for the number of fields returned (maybe there already is one
from Keith, but I'll check again).
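To illustrate why the guard matters, here is a minimal sketch in plain Java. It is a toy model, not the Lucene or Solr API: the class DocValuesGuardSketch, its methods, and the map-plus-BitSet storage are all hypothetical names made up for this example. It only shows that a read without a "does this document have the field" check yields an empty field entry, while the guarded read yields no entry at all.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy model of per-document docValues for one field; NOT the Lucene API. */
public class DocValuesGuardSketch {
    // docid -> values indexed for the field (hypothetical storage, kept sorted)
    private final Map<Integer, SortedSet<String>> valuesByDoc = new HashMap<>();
    // Stand-in for the "docs with field" bits: set only for docs that got a value
    private final BitSet docsWithField = new BitSet();

    /** Record values for a document and mark it as having the field. */
    public void add(int docid, String... values) {
        valuesByDoc.computeIfAbsent(docid, k -> new TreeSet<>())
                   .addAll(Arrays.asList(values));
        docsWithField.set(docid);
    }

    /** No guard: a document without values still gets an (empty) field entry. */
    public Map<String, Collection<String>> readUnguarded(String field, int docid) {
        Map<String, Collection<String>> doc = new LinkedHashMap<>();
        doc.put(field, valuesByDoc.getOrDefault(docid, Collections.emptySortedSet()));
        return doc;
    }

    /** Guarded: the field is added only when the document actually has it. */
    public Map<String, Collection<String>> readGuarded(String field, int docid) {
        Map<String, Collection<String>> doc = new LinkedHashMap<>();
        if (docsWithField.get(docid)) {
            doc.put(field, valuesByDoc.get(docid));
        }
        return doc;
    }
}
```

A test along these lines could then assert on the number of fields returned for a document that was indexed without a value. The model also keeps values in a sorted set, which mirrors the ordering caveat for multiValued fields noted in the issue description.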
> Read field from docValues for non stored fields
> -----------------------------------------------
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
> Issue Type: Improvement
> Reporter: Keith Laban
> Attachments: SOLR-8220-5x.patch, SOLR-8220-ishan.patch,
> SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, SOLR-8220-ishan.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch,
> SOLR-8220.patch, SOLR-8220.patch
>
>
> Many times a value will be both stored="true" and docValues="true" which
> requires redundant data to be stored on disk. Since reading from docValues is
> both efficient and a common practice (facets, analytics, streaming, etc),
> reading values from docValues when a stored version of the field does not
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as
> they would always be returned sorted in the docValues approach. I believe
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think
> it should live closer to where stored fields are loaded in the
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values from docValues;
> facets, analytics, streaming, etc. each do it their own way. Perhaps some of
> this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
> -- return field from docValue if the field is not stored and in docValues,
> if the field is stored return it from stored fields
> - fl="*"
> -- return only stored fields
> - fl="+"
> -- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first
> pass. 2b - is current behavior
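The fl rules proposed in the description above can be sketched as a small resolver. This is only an illustration of the proposal as written, not the behavior that was committed and not Solr code: FlSketch, Field, Source, and resolve are hypothetical names, and fl is assumed to be a single token ("*", "+", or one field name).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the proposed fl semantics; all names here are hypothetical. */
public class FlSketch {
    /** Minimal schema entry carrying only the two flags the proposal uses. */
    public static final class Field {
        final String name;
        final boolean stored;
        final boolean docValues;

        public Field(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /** Where a returned field would be read from. */
    public enum Source { STORED, DOC_VALUES }

    /** Map each returned field name to its source, in schema order. */
    public static Map<String, Source> resolve(String fl, List<Field> schema) {
        Map<String, Source> out = new LinkedHashMap<>();
        boolean star = fl.equals("*");
        boolean plus = fl.equals("+");
        for (Field f : schema) {
            if (!(star || plus || fl.equals(f.name))) {
                continue; // field was not requested
            }
            if (f.stored) {
                out.put(f.name, Source.STORED);      // stored copy always wins
            } else if (f.docValues && !star) {
                out.put(f.name, Source.DOC_VALUES);  // "*" stays stored-only
            }
        }
        return out;
    }
}
```

With a schema of id (stored only), price (docValues only), and title (both), fl="*" would return id and title from stored fields, fl="+" would additionally return price from docValues, and fl="price" would return price from docValues.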