[ 
https://issues.apache.org/jira/browse/SOLR-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360002#comment-15360002
 ] 

ASF GitHub Bot commented on SOLR-8858:
--------------------------------------

Github user maedhroz commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/47#discussion_r69373171
  
    --- Diff: solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java ---
    @@ -910,8 +910,12 @@ protected void createMainQuery(ResponseBuilder rb) {
             additionalAdded = addFL(additionalFL, "score", additionalAdded);
           }
         } else {
    -      // reset so that only unique key is requested in shard requests
    -      sreq.params.set(CommonParams.FL, rb.req.getSchema().getUniqueKeyField().getName());
    +      if (rb.req.getSearcher().enableLazyFieldLoading) {
    +        // reset so that only unique key is requested in shard requests
    +        sreq.params.set(CommonParams.FL, rb.req.getSchema().getUniqueKeyField().getName());
    +      } else {
    +        sreq.params.set(CommonParams.FL, "*");
    --- End diff --
    
    In the current master (without my patch), the query-stage shard request for 
the join in `DistribJoinFromCollectionTest` pulls the document from 
`SolrIndexSearcher#doc()` with only `id` in the specified `fields`. It does not 
use lazy field loading, so it uses a `DocumentStoredFieldVisitor` with no 
`fields` specified, loads the whole document, and then puts it in the 
`documentCache`. If we used lazy field loading, the cached document would still 
have some representation of all the fields, albeit lazy ones.
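
    To make "some representation of all the fields, albeit lazy ones" concrete, here is a minimal sketch of the idea. It is a toy model, not Solr or Lucene code: `LazyDocSketch`, `STORED`, `load()` and `read()` are hypothetical names, and each lazy field is modelled as a plain `Supplier` that reads from a simulated store on access.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Toy model only: hypothetical names, not Solr's or Lucene's API.
class LazyDocSketch {
    // Stand-in for the stored fields of one document on disk.
    static final Map<String, String> STORED =
        Map.of("id", "1", "title", "Solr", "body", "text");

    // Counts simulated reads of stored values.
    static int storeReads = 0;

    static String read(String name) {
        storeReads++;
        return STORED.get(name);
    }

    // Build the document that would be cached: requested fields are read
    // eagerly, every other field gets a placeholder that reads on access.
    static Map<String, Supplier<String>> load(Set<String> requested) {
        Map<String, Supplier<String>> doc = new LinkedHashMap<>();
        for (String name : STORED.keySet()) {
            if (requested.contains(name)) {
                String value = read(name);
                doc.put(name, () -> value);
            } else {
                doc.put(name, () -> read(name));
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> cached = load(Set.of("id"));
        // Every stored field has an entry, even though only "id" was requested.
        System.out.println(cached.keySet().containsAll(STORED.keySet()));
        // Only the eager field has been read at this point.
        System.out.println(storeReads);
        // A non-requested field is resolved when it is first asked for.
        System.out.println(cached.get("title").get());
    }
}
```

    In this model a later request for `title` can still be served from the cached entry, which is the property the non-lazy path only gets by loading the whole document up front.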
    
    With only the `SolrIndexSearcher` piece of my patch in place, the 
`TestSubQueryTransformer` failures are easy to avoid, and I was able to fix 
them by simply reading the JavaDoc. (See the 
[comment](https://github.com/apache/lucene-solr/pull/47/files/4f9e67c63ce5130795df647ef5e86ae970601cb6#r69015716)
 below.) `DistribJoinFromCollectionTest` and `TestCloudDeleteByQuery` fail 
because, once `doc()` actually respects the `fields` list during the main 
query phase, it caches a document that *only contains those fields*. When the 
actual field retrieval stage of the query hits the shard, `doc()` returns that 
cached document, which doesn't have all the fields in `fl`. (I'm not sure that 
either `DistribJoinFromCollectionTest` or `TestCloudDeleteByQuery` is doing 
anything wrong, and they actually *pass* if they enable lazy field loading.)
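
    The failure mode described above can be reproduced in a few lines. This is a simplified model under stated assumptions, not the real `SolrIndexSearcher`: `PartialCacheSketch`, its `respectFields` flag and the in-memory `STORED` map are all hypothetical, and the cache is a plain `HashMap` keyed by doc id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Solr's actual SolrIndexSearcher code.
class PartialCacheSketch {
    // Stand-in for the stored fields of one document.
    static final Map<String, String> STORED =
        Map.of("id", "1", "title", "Solr", "body", "text");

    final Map<Integer, Map<String, String>> documentCache = new HashMap<>();
    // false models master without lazy loading (field list ignored);
    // true models a doc() that honours the field list.
    final boolean respectFields;

    PartialCacheSketch(boolean respectFields) {
        this.respectFields = respectFields;
    }

    // Simplified doc(): serve from the cache if present, else load and cache.
    Map<String, String> doc(int docId, Set<String> fields) {
        Map<String, String> cached = documentCache.get(docId);
        if (cached != null) {
            return cached;
        }
        Map<String, String> loaded = new TreeMap<>();
        for (Map.Entry<String, String> e : STORED.entrySet()) {
            if (!respectFields || fields.contains(e.getKey())) {
                loaded.put(e.getKey(), e.getValue());
            }
        }
        documentCache.put(docId, loaded);
        return loaded;
    }

    public static void main(String[] args) {
        for (boolean respect : new boolean[] {false, true}) {
            PartialCacheSketch searcher = new PartialCacheSketch(respect);
            // Query stage: only the unique key is requested.
            searcher.doc(0, Set.of("id"));
            // Retrieval stage: more fields are requested for the same doc.
            Map<String, String> d = searcher.doc(0, Set.of("id", "title"));
            System.out.println("respectFields=" + respect + " -> " + d.keySet());
        }
    }
}
```

    With `respectFields=false` the second call sees every field; with `respectFields=true` it gets back the key-only entry cached by the first call, so `title` is missing.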
    
    The reason I raised this issue in the first place is that I have a custom 
`StoredFieldsVisitor` that relies on `DocumentStoredFieldVisitor` providing the 
fields requested by the query. Unfortunately, I think the `QueryComponent` bit 
of this PR isn't actually compatible with that, so I think that change will 
need to be reverted no matter what. The only other ways I can imagine fixing 
this are:
    
    a.) Always cache an entire document, regardless of what we return from 
`doc()`. (Seems like it adds overhead.)
    b.) Skip caching under certain conditions, like if the `fields` list only 
contains the unique key (or key and score). (Seems very reliant on 
`QueryComponent` still.)
    c.) Always use lazy loading. (Seems invasive, but most of the examples I 
see use it anyway.)
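
    For comparison, options (a) and (b) could look roughly like this. It is a sketch only: `CachePolicySketch`, the `Policy` enum and `filter()` are hypothetical names, the store is an in-memory map, and "key only" is checked as an exact match on the field set.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Solr code.
class CachePolicySketch {
    enum Policy { CACHE_WHOLE_DOC, SKIP_CACHE_FOR_KEY_ONLY }

    static final Map<String, String> STORED =
        Map.of("id", "1", "title", "Solr", "body", "text");
    static final String UNIQUE_KEY = "id";

    final Map<Integer, Map<String, String>> documentCache = new HashMap<>();
    final Policy policy;

    CachePolicySketch(Policy policy) {
        this.policy = policy;
    }

    // Keep only the requested fields that the given document actually has.
    static Map<String, String> filter(Map<String, String> doc, Set<String> fields) {
        Map<String, String> out = new TreeMap<>();
        for (String f : fields) {
            if (doc.containsKey(f)) {
                out.put(f, doc.get(f));
            }
        }
        return out;
    }

    Map<String, String> doc(int docId, Set<String> fields) {
        Map<String, String> cached = documentCache.get(docId);
        if (cached != null) {
            return filter(cached, fields);
        }
        if (policy == Policy.CACHE_WHOLE_DOC) {
            // Option (a): cache everything, filter only what is returned.
            Map<String, String> whole = new TreeMap<>(STORED);
            documentCache.put(docId, whole);
            return filter(whole, fields);
        }
        // Option (b): load only the requested fields, and do not cache
        // lookups that ask for nothing but the unique key.
        Map<String, String> partial = filter(STORED, fields);
        if (!fields.equals(Set.of(UNIQUE_KEY))) {
            documentCache.put(docId, partial);
        }
        return partial;
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            CachePolicySketch s = new CachePolicySketch(p);
            s.doc(0, Set.of("id"));                       // query stage
            Map<String, String> d = s.doc(0, Set.of("id", "title")); // retrieval stage
            System.out.println(p + " -> " + d.keySet()
                + ", cached fields: " + s.documentCache.get(0).keySet());
        }
    }
}
```

    Both policies return `title` in the two-stage pattern. Note that in this sketch option (b) still caches a partial document for any request that is not key-only, so a later, broader request for the same document could again come up short.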
    
    I don't love any of these options, but I'd be interested to get more 
informed opinions.


> SolrIndexSearcher#doc() Completely Ignores Field Filters Unless Lazy Field 
> Loading is Enabled
> ---------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8858
>                 URL: https://issues.apache.org/jira/browse/SOLR-8858
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.6, 4.10, 5.5
>            Reporter: Caleb Rackliffe
>              Labels: easyfix
>
> If {{enableLazyFieldLoading=false}}, a perfectly valid fields filter will be 
> ignored, and we'll create a {{DocumentStoredFieldVisitor}} without it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

