[
https://issues.apache.org/jira/browse/SOLR-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352666#comment-17352666
]
Christine Poerschke commented on SOLR-12697:
--------------------------------------------
Hi [~TomGilke], thanks for the questions on the idea of connecting the stored
fields!
Yes, the idea was to achieve connectivity via a document cache, and I agree
with you that the example configuration is error-prone, e.g. because of
repeated information.
So there are two "connecting" things that are needed:
# field values: When computing scores, we want the second and subsequent
features to benefit from the fetch work done by the first feature.
{{FieldValueFeature}} has access to a/the {{SolrIndexSearcher}}, which has a
{{SolrDocumentFetcher}}, which in turn has a {{DocumentCache}}. So from looking
(without trying), there at least _appears_ to be a way to connect things if the
first feature adds to that document cache. Whether using that document fetcher
and adding to that document cache is actually a good idea, or whether some
special-purpose document fetcher instance and corresponding document cache
should be used instead, is of course a different question. It is a question to
explore further if one had a use case and were to start implementing a
{{PrefetchingFieldValueFeature}} class (outside the scope of this JIRA task).
** edit: I note that [~slivotov]'s original patch obtains a fetcher from a
searcher and the searcher from the request, i.e. that's an alternative to
IndexSearcher-to-SolrIndexSearcher casting.
# field names: When fetching fields, the first feature must "look ahead" and
ask for the return of not only the field that it itself needs but also the
fields that other features will subsequently need. It could do so at scoring
time, i.e. "ask around" what fields its fellow features need, or it could do so
at construction time. Both approaches have pros and cons, of course, but if it
happens at construction time then our feature object would have a "state" which
conceptually looks like
{code:java}
private String field;
private Set<String> fieldAsSet;
+ private Set<String> prefetchFields; // own field plus other PrefetchingFieldValueFeature objects' fields
{code}
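To make the scoring-time half of this a bit more concrete, here is a minimal self-contained sketch. It is not Solr code: {{StoredFieldReader}} and {{PrefetchCache}} are hypothetical stand-ins for the document fetcher and document cache, and {{fetchCount}} exists only to show that the fetch happens once per document. The assumption is that whichever feature touches a document first fetches the joint field set, and later features read from the cached copy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a stored-field fetcher; NOT Solr's SolrDocumentFetcher API. */
interface StoredFieldReader {
  Map<String, Object> fetch(int docId, Set<String> fields);
}

/** Hypothetical per-request cache shared by all prefetching features. */
class PrefetchCache {
  private final StoredFieldReader reader;
  private final Set<String> prefetchFields; // joint set of fields for all prefetching features
  private final Map<Integer, Map<String, Object>> docs = new HashMap<>();
  int fetchCount = 0; // illustration only: counts real fetches

  PrefetchCache(StoredFieldReader reader, Set<String> prefetchFields) {
    this.reader = reader;
    this.prefetchFields = prefetchFields;
  }

  /** Returns one field's value, fetching the whole joint field set on first access to a doc. */
  Object fieldValue(int docId, String field) {
    Map<String, Object> doc = docs.get(docId);
    if (doc == null) {
      doc = reader.fetch(docId, prefetchFields); // first feature pays for everyone
      fetchCount++;
      docs.put(docId, doc);
    }
    return doc.get(field); // subsequent features hit the cached copy
  }
}
```

With features on "aaa" and "bbb" scoring the same document, only the first {{fieldValue}} call triggers a fetch. Whether such a cache should be Solr's existing document cache or a special-purpose one is exactly the open question above.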
So that then leads to the question of "where does prefetchFields come from?",
given that having it come from configuration is error-prone and impractical.
At present feature construction is a single step e.g.
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java#L121]
but one might imagine a two-pass approach, e.g. after all the features in a
store are constructed, in a second pass we (a) determine which features are
prefetching and (b) tell all the prefetching features what the joint set of
fields is.
* There probably is a proper name for that sort of approach but I don't know
what it is, sorry!
* Model construction already has a "sometimes two pass" approach. Note the
similarity between
[Feature.getInstance|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java#L209]
and
[LTRScoringModel.getInstance|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java#L239]
but also the difference in [if (ltrScoringModel instanceof AdapterModel)
initAdapterModel(solrResourceLoader, (AdapterModel)ltrScoringModel,
managedFeatureStore)|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java#L248-L250]
i.e. for most models initialisation is completed in the constructor, but for
adapter models initialisation is completed only via an init method call.
Oops, slightly more words above than anticipated! Perhaps one short pseudocode
bit to try to sum it up:
{code:java}
// configuration illustration
[ ...
  { "name": "a", "class": "PrefetchingFieldValueFeature", "params": { "field" : "aaa" } },
  ...
  { "name": "b", "class": "PrefetchingFieldValueFeature", "params": { "field" : "bbb" } },
  ...
  { "name": "c", "class": "PrefetchingFieldValueFeature", "params": { "field" : "ccc" } },
  ...
  { "name": "d", "class": "FieldValueFeature", "params": { "field" : "xyz" } },
... ]
// code fragment 1
class PrefetchingFieldValueFeature extends FieldValueFeature {
  // own field plus other PrefetchingFieldValueFeature objects' fields
  private Set<String> prefetchFields;
  public void setPrefetchFields(Set<String> prefetchFields) {
    this.prefetchFields = prefetchFields;
  }
  ...
}
// code fragment 2 (needs java.util.HashSet and java.util.Set)
// first pass: collect the fields of all prefetching features
Set<String> prefetchFields = new HashSet<>();
for (Feature feature : featureStore.getFeatures()) {
  if (feature instanceof PrefetchingFieldValueFeature) {
    prefetchFields.add(((PrefetchingFieldValueFeature)feature).getField());
  }
}
// prefetchFields contains "aaa" and "bbb" and "ccc" at this point
// second pass: tell every prefetching feature about the joint set
for (Feature feature : featureStore.getFeatures()) {
  if (feature instanceof PrefetchingFieldValueFeature) {
    ((PrefetchingFieldValueFeature)feature).setPrefetchFields(prefetchFields);
  }
}
{code}
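For anyone wanting to try the two-pass wiring outside Solr, here is a minimal self-contained sketch of the same idea. The classes below are simplified stand-ins that only share names with the real LTR classes, and {{PrefetchWiring.wire}} is a hypothetical helper; where such a second pass would actually live (e.g. in the feature store) is left open.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified stand-ins; NOT the org.apache.solr.ltr classes of the same names.
class Feature {
  final String name;
  Feature(String name) { this.name = name; }
}

class FieldValueFeature extends Feature {
  private final String field;
  FieldValueFeature(String name, String field) { super(name); this.field = field; }
  String getField() { return field; }
}

class PrefetchingFieldValueFeature extends FieldValueFeature {
  // own field plus other PrefetchingFieldValueFeature objects' fields
  private Set<String> prefetchFields = Collections.emptySet();
  PrefetchingFieldValueFeature(String name, String field) { super(name, field); }
  void setPrefetchFields(Set<String> prefetchFields) { this.prefetchFields = prefetchFields; }
  Set<String> getPrefetchFields() { return prefetchFields; }
}

/** Hypothetical helper performing the second pass after all features are constructed. */
class PrefetchWiring {
  static Set<String> wire(List<Feature> features) {
    // (a) determine which features are prefetching and collect their fields
    Set<String> joint = new TreeSet<>();
    for (Feature feature : features) {
      if (feature instanceof PrefetchingFieldValueFeature) {
        joint.add(((PrefetchingFieldValueFeature) feature).getField());
      }
    }
    // (b) tell all the prefetching features what the joint set of fields is
    Set<String> shared = Collections.unmodifiableSet(joint);
    for (Feature feature : features) {
      if (feature instanceof PrefetchingFieldValueFeature) {
        ((PrefetchingFieldValueFeature) feature).setPrefetchFields(shared);
      }
    }
    return shared;
  }
}
```

The shared set is wrapped as unmodifiable so that no single feature can change what its fellow features prefetch; that is a design choice of this sketch, not something the comment above prescribes.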
{{</endOfMeThinkingOutAloud>}} :) What do you think?
> pure DocValues support for FieldValueFeature
> --------------------------------------------
>
> Key: SOLR-12697
> URL: https://issues.apache.org/jira/browse/SOLR-12697
> Project: Solr
> Issue Type: Sub-task
> Components: contrib - LTR
> Reporter: Stanislav Livotov
> Priority: Major
> Attachments: SOLR-12697.patch, SOLR-12697.patch, SOLR-12697.patch,
> SOLR-12697.patch, SOLR-12697.patch
>
> Time Spent: 8h
> Remaining Estimate: 0h
>
> [~slivotov] wrote in SOLR-12688:
> bq. ... FieldValueFeature doesn't support pure DocValues fields (Stored
> false). Please also note that for fields which are both stored and DocValues
> it is working not optimal because it is extracting just one field from the
> stored document. DocValues are obviously faster for such usecases. ...
> (Please see SOLR-12688 description for overall context and analysis results.)