[ 
https://issues.apache.org/jira/browse/SOLR-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-7128:
----------------------------------------
    Attachment: SOLR-7128.patch

This patch fixes the bug and modifies the 
DistributedQueryComponentOptimizationTest to use the 
TrackingShardHandlerFactory introduced in SOLR-7147.

This test now asserts that every distrib.singlePass query:
# Makes exactly 'numSlices' number of shard requests
# Makes no GET_FIELDS requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests all fields that are present in 'fl' param

It also asserts that every regular two phase distributed search:
# Makes at most 2 * 'numSlices' number of shard requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests no fields other than id and score in GET_TOP_IDS request
# Requests exactly the fields that are present in 'fl' param in GET_FIELDS 
request and no others

and also asserts that:
# Each query which requests id or score or both behaves exactly like a single 
pass query
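
As a rough illustration of the two phase assertions listed above, here is a minimal, self-contained sketch. It is not the code in the attached patch and it does not use the TrackingShardHandlerFactory API; RecordedShardRequest, TwoPhaseChecks and checkTwoPhase are hypothetical names invented for this example. It also assumes that GET_FIELDS is expected to ask for the 'fl' fields plus the unique key, which is one reading of the list above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical record of one shard request; a stand-in, not a Solr class. */
final class RecordedShardRequest {
    final String purpose;     // "GET_TOP_IDS" or "GET_FIELDS", as named in this issue
    final Set<String> fields; // the parsed 'fl' that was sent to the shard

    RecordedShardRequest(String purpose, String... fields) {
        this.purpose = purpose;
        this.fields = new LinkedHashSet<>(Arrays.asList(fields));
    }
}

/** Hypothetical checker mirroring the two phase assertions described above. */
final class TwoPhaseChecks {
    private TwoPhaseChecks() {}

    static void checkTwoPhase(List<RecordedShardRequest> requests, int numSlices,
                              String uniqueKey, boolean needsScore, Set<String> fl) {
        // At most two requests (GET_TOP_IDS, GET_FIELDS) per slice.
        require(requests.size() <= 2 * numSlices, "more than 2 * numSlices shard requests");
        for (RecordedShardRequest r : requests) {
            require(r.fields.contains(uniqueKey), r.purpose + " must request the unique key");
            if ("GET_TOP_IDS".equals(r.purpose)) {
                // Only the unique key and score are allowed in the first phase.
                Set<String> extra = new LinkedHashSet<>(r.fields);
                extra.remove(uniqueKey);
                extra.remove("score");
                require(extra.isEmpty(), "GET_TOP_IDS requested extra fields: " + extra);
                require(!needsScore || r.fields.contains("score"),
                        "GET_TOP_IDS must request score");
            } else if ("GET_FIELDS".equals(r.purpose)) {
                // Assumption: expected fields are 'fl' plus the unique key.
                Set<String> expected = new LinkedHashSet<>(fl);
                expected.add(uniqueKey);
                require(r.fields.equals(expected),
                        "GET_FIELDS requested " + r.fields + ", expected " + expected);
            }
        }
    }

    private static void require(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}
```

With fl=id,title on a single slice, a GET_TOP_IDS request carrying only id followed by a GET_FIELDS request carrying id and title passes this check, while a GET_TOP_IDS request that also carries title (the behaviour reported in this issue) fails it.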

> Two phase distributed search is fetching extra fields in GET_TOP_IDS phase
> --------------------------------------------------------------------------
>
>                 Key: SOLR-7128
>                 URL: https://issues.apache.org/jira/browse/SOLR-7128
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.10.2, 4.10.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7128.patch, SOLR-7128.patch
>
>
> [~pqueixalos] reported this to me privately so I am creating this issue on 
> his behalf.
> {quote}
> We found an issue in versions 4.10.+ (4.10.2 and 4.10.3 for sure).
> When processing a two phase distributed query with an explicit fl parameter, 
> both phases are executed, but the GET_TOP_IDS request also retrieves the 
> matching documents' fields, even though a GET_FIELDS shard request is 
> executed right after.
> /solr/someCollectionCore?collection=someOtherCollection&q=*:*&debug=true&fl=id,title
> => id is retrieved during the GET_TOP_IDS phase; that's ok, it's our 
> uniqueKeyField
> => title is also retrieved during the GET_TOP_IDS phase; that's not ok.
> {quote}
> I'm able to reproduce this. This is a pretty bad performance bug that was 
> introduced in SOLR-5768 or its subsequent related issues. I plan to fix this 
> bug and add substantial tests to assert such things.


