[
https://issues.apache.org/jira/browse/SOLR-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010338#comment-16010338
]
Dawid Weiss edited comment on SOLR-10678 at 5/15/17 11:21 AM:
--------------------------------------------------------------
I looked at it; here's a summary of my findings.
1. Clustering isn't currently run twice in distributed mode, so functionally
this is a non-issue. A distributed request goes through {{modifyRequest}},
where {{ClusteringComponent}} removes itself from subsequent shard requests
(by stripping the {{clustering}} parameter):
{code}
public void modifyRequest(ResponseBuilder rb, SearchComponent who,
    ShardRequest sreq) {
  SolrParams params = rb.req.getParams();
  if (!params.getBool(COMPONENT_NAME, false) ||
      !params.getBool(ClusteringParams.USE_SEARCH_RESULTS, false)) {
    return;
  }
  sreq.params.remove(COMPONENT_NAME);
{code}
The same parameter is then checked in {{process}}:
{code}
public void process(ResponseBuilder rb) throws IOException {
  SolrParams params = rb.req.getParams();
  if (!params.getBool(COMPONENT_NAME, false)) {
    return;
  }
{code}
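The interplay between the two excerpts above can be modelled in isolation. This
is only a minimal sketch, not Solr code: {{FlagStrippingSketch}},
{{shardParams}} and {{passesProcessGuard}} are hypothetical names, a plain
{{Map}} stands in for {{SolrParams}}, and the string {{clustering.results}} is
an assumed value for {{ClusteringParams.USE_SEARCH_RESULTS}}.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not Solr code: plain maps stand in for SolrParams.
final class FlagStrippingSketch {
    static final String COMPONENT_NAME = "clustering";
    // Assumed value; stands in for ClusteringParams.USE_SEARCH_RESULTS.
    static final String USE_SEARCH_RESULTS = "clustering.results";

    // Missing keys read as false, like getBool(name, false) in the excerpts.
    static boolean getBool(Map<String, String> params, String key) {
        return Boolean.parseBoolean(params.get(key));
    }

    // Mirrors modifyRequest: returns the parameters a shard would receive.
    static Map<String, String> shardParams(Map<String, String> original) {
        Map<String, String> shard = new HashMap<>(original);
        if (getBool(original, COMPONENT_NAME) && getBool(original, USE_SEARCH_RESULTS)) {
            shard.remove(COMPONENT_NAME);
        }
        return shard;
    }

    // Mirrors the guard at the top of process.
    static boolean passesProcessGuard(Map<String, String> params) {
        return getBool(params, COMPONENT_NAME);
    }

    public static void main(String[] args) {
        Map<String, String> top = new HashMap<>();
        top.put(COMPONENT_NAME, "true");
        top.put(USE_SEARCH_RESULTS, "true");
        System.out.println("original request passes guard: " + passesProcessGuard(top));
        System.out.println("shard request passes guard: "
            + passesProcessGuard(shardParams(top)));
    }
}
```

With both parameters enabled, the original request passes the guard while the
derived shard request does not, which is why {{process}} returns early on the
shards.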
2. What confused me a *lot* was why {{process}} is invoked during the
distributed test (and why the test was executed so darn many times). It turned
out to be because of the default {{ShardsRepeatRule}} and this gem inside
{{BaseDistributedSearchTestCase}}:
{code}
protected QueryResponse query(boolean setDistribParams, SolrParams p)
    throws Exception {
  final ModifiableSolrParams params = new ModifiableSolrParams(p);
  // TODO: look into why passing true causes fails
  params.set("distrib", "false");
  final QueryResponse controlRsp = controlClient.query(params);
  validateControlData(controlRsp);
  params.remove("distrib");
{code}
So the distributed test is running a forced-non-distributed request first,
followed by the distributed request, hence the confusing logs.
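The request order can be illustrated with a small stand-alone sketch. It is not
the test framework's code: {{ControlThenDistributedSketch}} and
{{RecordingClient}} are hypothetical names, maps stand in for the Solr
parameter classes, and the second call is taken from the description above
rather than from the excerpt, which stops at {{params.remove("distrib")}}.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the query order described above; not the Solr test framework.
final class ControlThenDistributedSketch {

    // Hypothetical stand-in for a Solr client: records every request it receives.
    static final class RecordingClient {
        final List<Map<String, String>> requests = new ArrayList<>();

        void query(Map<String, String> params) {
            // Copy, because the caller keeps mutating the same map afterwards.
            requests.add(new HashMap<>(params));
        }
    }

    // Control request first (forced non-distributed), then the distributed one.
    static void query(Map<String, String> p, RecordingClient control,
                      RecordingClient distributed) {
        Map<String, String> params = new HashMap<>(p);
        params.put("distrib", "false");
        control.query(params);
        params.remove("distrib");
        distributed.query(params);
    }

    public static void main(String[] args) {
        RecordingClient control = new RecordingClient();
        RecordingClient distributed = new RecordingClient();
        Map<String, String> p = new HashMap<>();
        p.put("clustering", "true");
        query(p, control, distributed);
        System.out.println("control request: " + control.requests.get(0));
        System.out.println("distributed request: " + distributed.requests.get(0));
    }
}
```

Every test query therefore produces two requests, and only the first carries
{{distrib=false}}; that control request is handled as an ordinary
non-distributed one, which would account for the {{process}} calls in the logs.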
> Clustering can be executed multiple times in distributed mode
> -------------------------------------------------------------
>
> Key: SOLR-10678
> URL: https://issues.apache.org/jira/browse/SOLR-10678
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
>
> As reported on SO:
> http://stackoverflow.com/questions/43877284/how-does-solr-clustering-component-work/43937064#43937064