Vamsee Yarlagadda created SOLR-6299:
---------------------------------------

             Summary: Facet count on facet queries returns different results if 
#shards > 1
                 Key: SOLR-6299
                 URL: https://issues.apache.org/jira/browse/SOLR-6299
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
    Affects Versions: 5.0
            Reporter: Vamsee Yarlagadda


I am trying to run some facet counts on facet queries and looks like i am 
getting different counts if i use >1 shards in the SolrCloud cluster.

Here is the upstream unit test:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L173

Setup:
* Ingested 5 solr docs.
{code}
{
  "responseHeader": {
    "status": 0,
    "QTime": 22,
    "params": {
      "indent": "true",
      "q": "*:*",
      "_": "1406346687337",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 5,
    "start": 0,
    "maxScore": 1,
    "docs": [
      {
        "id": 2004,
        "range_facet_l": [
          2004
        ],
        "hotel_s1": "b",
        "airport_s1": "ams",
        "duration_i1": 5,
        "_version_": 1474661321774465000,
        "timestamp": "2014-07-26T03:50:27.975Z",
        "multiDefault": [
          "muLti-Default"
        ],
        "intDefault": 42
      },
      {
        "id": 2000,
        "range_facet_l": [
          2000
        ],
        "hotel_s1": "a",
        "airport_s1": "ams",
        "duration_i1": 5,
        "_version_": 1474661323604230100,
        "timestamp": "2014-07-26T03:50:29.734Z",
        "multiDefault": [
          "muLti-Default"
        ],
        "intDefault": 42
      },
      {
        "id": 2003,
        "range_facet_l": [
          2003
        ],
        "hotel_s1": "b",
        "airport_s1": "ams",
        "duration_i1": 5,
        "_version_": 1474661326312702000,
        "timestamp": "2014-07-26T03:50:32.317Z",
        "multiDefault": [
          "muLti-Default"
        ],
        "intDefault": 42
      },
      {
        "id": 2001,
        "range_facet_l": [
          2001
        ],
        "hotel_s1": "a",
        "airport_s1": "dus",
        "duration_i1": 10,
        "_version_": 1474661326389248000,
        "timestamp": "2014-07-26T03:50:32.375Z",
        "multiDefault": [
          "muLti-Default"
        ],
        "intDefault": 42
      },
      {
        "id": 2002,
        "range_facet_l": [
          2002
        ],
        "hotel_s1": "b",
        "airport_s1": "ams",
        "duration_i1": 10,
        "_version_": 1474661326464745500,
        "timestamp": "2014-07-26T03:50:32.446Z",
        "multiDefault": [
          "muLti-Default"
        ],
        "intDefault": 42
      }
    ]
  }
}
{code}

Here is the query being run:
{code}
Test code:
    assertQ(
        req(
            "q", "*:*",
            "fq", "id:[2000 TO 2004]",
            "group", "true",
            "group.facet", "true",
            "group.field", "hotel_s1",
            "facet", "true",
            "facet.limit", facetLimit,
            "facet.query", "airport_s1:ams"
        ),
        "//lst[@name='facet_queries']/int[@name='airport_s1:ams'][.='2']"
    );

$ curl  
"http://localhost:8983/solr/collection1/select?facet=true&facet.query=airport_s1%3Aams&q=*%3A*&facet.limit=-100&group.field=hotel_s1&group=true&group.facet=true&fq=id%3A%5B2000+TO+2004%5D&indent=true&wt=xml";
 
{code}

Now, if i issue a query statement - On *1* shard system (Works as expected)
{code}
$ curl  
"http://localhost:8983/solr/collection1/select?facet=true&facet.query=airport_s1%3Aams&q=*%3A*&facet.limit=-100&group.field=hotel_s1&group=true&group.facet=true&fq=id%3A%5B2000+TO+2004%5D&indent=true&wt=xml";
 

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">17</int>
  <lst name="params">
    <str name="facet">true</str>
    <str name="indent">true</str>
    <str name="facet.query">airport_s1:ams</str>
    <str name="q">*:*</str>
    <str name="facet.limit">-100</str>
    <str name="group.field">hotel_s1</str>
    <str name="group">true</str>
    <str name="wt">xml</str>
    <str name="fq">id:[2000 TO 2004]</str>
    <str name="group.facet">true</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="hotel_s1">
    <int name="matches">5</int>
    <arr name="groups">
      <lst>
        <str name="groupValue">a</str>
        <result name="doclist" numFound="2" start="0">
          <doc>
            <int name="id">2001</int>
            <arr name="range_facet_l">
              <long>2001</long>
            </arr>
            <str name="hotel_s1">a</str>
            <str name="airport_s1">dus</str>
            <int name="duration_i1">10</int>
            <long name="_version_">1474989437819551744</long>
            <date name="timestamp">2014-07-29T18:45:43.819Z</date>
            <arr name="multiDefault">
              <str>muLti-Default</str>
            </arr>
            <int name="intDefault">42</int></doc>
        </result>
      </lst>
      <lst>
        <str name="groupValue">b</str>
        <result name="doclist" numFound="3" start="0">
          <doc>
            <int name="id">2003</int>
            <arr name="range_facet_l">
              <long>2003</long>
            </arr>
            <str name="hotel_s1">b</str>
            <str name="airport_s1">ams</str>
            <int name="duration_i1">5</int>
            <long name="_version_">1474989439611568128</long>
            <date name="timestamp">2014-07-29T18:45:45.528Z</date>
            <arr name="multiDefault">
              <str>muLti-Default</str>
            </arr>
            <int name="intDefault">42</int></doc>
        </result>
      </lst>
    </arr>
  </lst>
</lst>
<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="airport_s1:ams">2</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>
</response>
{code}

Now, if i run the same query on 2 shard system, i see facet count as *3* 
instead of *2*.

Solr result on 2 shard cluster:
{code}
[systest@search-testing-c5-1 search]$ curl  
"http://localhost:8983/solr/collection1/select?facet=true&facet.query=airport_s1%3Aams&q=*%3A*&facet.limit=-100&group.field=hotel_s1&group=true&group.facet=true&fq=id%3A%5B2000+TO+2004%5D&indent=true&wt=xml";
 
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">69</int>
  <lst name="params">
    <str name="facet">true</str>
    <str name="indent">true</str>
    <str name="facet.query">airport_s1:ams</str>
    <str name="q">*:*</str>
    <str name="facet.limit">-100</str>
    <str name="group.field">hotel_s1</str>
    <str name="group">true</str>
    <str name="wt">xml</str>
    <str name="fq">id:[2000 TO 2004]</str>
    <str name="group.facet">true</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="hotel_s1">
    <int name="matches">5</int>
    <arr name="groups">
      <lst>
        <str name="groupValue">b</str>
        <result name="doclist" numFound="3" start="0" maxScore="1.0">
          <doc>
            <int name="id">2002</int>
            <arr name="range_facet_l">
              <long>2002</long>
            </arr>
            <str name="hotel_s1">b</str>
            <str name="airport_s1">ams</str>
            <int name="duration_i1">10</int>
            <long name="_version_">1474661326464745472</long>
            <date name="timestamp">2014-07-26T03:50:32.446Z</date>
            <arr name="multiDefault">
              <str>muLti-Default</str>
            </arr>
            <int name="intDefault">42</int></doc>
        </result>
      </lst>
      <lst>
        <str name="groupValue">a</str>
        <result name="doclist" numFound="2" start="0" maxScore="1.0">
          <doc>
            <int name="id">2001</int>
            <arr name="range_facet_l">
              <long>2001</long>
            </arr>
            <str name="hotel_s1">a</str>
            <str name="airport_s1">dus</str>
            <int name="duration_i1">10</int>
            <long name="_version_">1474661326389248000</long>
            <date name="timestamp">2014-07-26T03:50:32.375Z</date>
            <arr name="multiDefault">
              <str>muLti-Default</str>
            </arr>
            <int name="intDefault">42</int></doc>
        </result>
      </lst>
    </arr>
  </lst>
</lst>
<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="airport_s1:ams">3</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>
</response>
{code} 

In order to replicate this, we can simply run the above test on >1 shard system 
and the solr response will be different.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to