Colin Bartolome created SOLR-4445:
-------------------------------------
Summary: Join queries should execute their "from" queries on all shards
Key: SOLR-4445
URL: https://issues.apache.org/jira/browse/SOLR-4445
Project: Solr
Issue Type: Improvement
Components: query parsers, SolrCloud
Affects Versions: 4.1
Reporter: Colin Bartolome
When running join queries on a collection with multiple shards, the "from" side
of the query is executed only on the shard that serves the request, instead of
on all shards. The matching documents are then passed to the "to" side of the
query. As a result, the overall result set is only a subset of what the same
join query would return on a collection with a single shard.
That is, assuming documents are distributed evenly across shards, a four-shard
collection will, on average, return about 25% of the results a single-shard
collection would.
The code should execute the "from" side of the query on all available shards
before passing those matching documents to the "to" side of the query.
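The difference can be illustrated with a toy in-memory model. This is only a sketch and not Solr code: the modulo routing in shard_of, the field names (id, parent_id, type), and the helpers local_join and global_join are all hypothetical, chosen to mimic a query of the form {!join from=parent_id to=id}type:child. Parent ids are arranged so that 2 of the 8 children share a shard with their parent.

```python
def shard_of(doc_id, num_shards):
    # Hypothetical routing: integer ids placed by modulo (an assumption,
    # not how SolrCloud actually hashes documents).
    return doc_id % num_shards

def build_shards(docs, num_shards):
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_of(doc["id"], num_shards)].append(doc)
    return shards

def from_values(docs, from_field, from_pred):
    # Run the "from" query over docs and collect the join keys.
    return {d[from_field] for d in docs if from_field in d and from_pred(d)}

def local_join(shards, from_field, to_field, from_pred):
    # Behaviour described in this issue: each shard only sees the
    # "from" matches that live on that same shard.
    hits = []
    for shard in shards:
        keys = from_values(shard, from_field, from_pred)
        hits.extend(d["id"] for d in shard if d.get(to_field) in keys)
    return sorted(hits)

def global_join(shards, from_field, to_field, from_pred):
    # Requested behaviour: gather "from" matches from all shards first,
    # then apply them to the "to" side on every shard.
    keys = set()
    for shard in shards:
        keys |= from_values(shard, from_field, from_pred)
    return sorted(d["id"] for shard in shards for d in shard
                  if d.get(to_field) in keys)

# Eight parents (ids 0..7) and eight children (ids 8..15).
parent_ids = [0, 2, 1, 3, 5, 4, 7, 6]
docs = [{"id": p, "type": "parent"} for p in range(8)]
docs += [{"id": 8 + i, "type": "child", "parent_id": p}
         for i, p in enumerate(parent_ids)]

def is_child(d):
    return d["type"] == "child"

four = build_shards(docs, 4)
one = build_shards(docs, 1)

local_four = local_join(four, "parent_id", "id", is_child)
global_four = global_join(four, "parent_id", "id", is_child)
local_one = local_join(one, "parent_id", "id", is_child)

print(local_four)   # parents found when each shard joins in isolation
print(global_four)  # parents found when "from" matches are gathered first
```

In this arrangement the per-shard join finds 2 of the 8 parents on four shards, while the gathered join finds all 8, the same as the single-shard result.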
Note: [LUCENE-3759|https://issues.apache.org/jira/browse/LUCENE-3759] proposes
an upgrade to {{JoinUtil}} to support joining where the documents matched by
the "from" side of the query exist on multiple shards. Solr does not use that
class for joining (nor does anything else?), so this would have to be
implemented separately.