Vasiliy Bout created SOLR-8540:
----------------------------------

             Summary: Multi select facets give incorrect results
                 Key: SOLR-8540
                 URL: https://issues.apache.org/jira/browse/SOLR-8540
             Project: Solr
          Issue Type: Bug
    Affects Versions: 5.3.1, 5.3
            Reporter: Vasiliy Bout


We have a single core and use faceting to search documents. When we started to 
use multi select faceting, we noticed that the results do not match the real 
data in the core.

For example, we make this simple query:
{noformat}
q=*:*
rows=0
facet=true
facet.limit=5
facet.field=file_type
{noformat}

The corresponding URL is 
{{http://localhost:8983/.../select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}

We get the following results:
{noformat}
{
  "responseHeader": {
    "status": 0,
    "QTime": 42
  },
  "response": {
    "numFound": 1240067,
    "start": 0,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "file_type": [
        "5",
        1073053,
        "3",
        51078,
        "7",
        41956,
        "10",
        16121,
        "12",
        12585
      ]
    },
    "facet_dates": {},
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  }
}
{noformat}

When we add a filter by {{file_type}}:
{noformat}
q=*:*
fq=file_type:3
rows=0
facet=true
facet.limit=5
facet.field=file_type
{noformat}

The corresponding URL is 
{{http://localhost:8983/.../select?q=*%3A*&fq=file_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}

Then we get a nonzero count only for the filtered value:
{noformat}
{
  "responseHeader": {
    "status": 0,
    "QTime": 5
  },
  "response": {
    "numFound": 51078,
    "start": 0,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "file_type": [
        "3",
        51078,
        "1",
        0,
        "4",
        0,
        "5",
        0,
        "7",
        0
      ]
    },
    "facet_dates": {},
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  }
}
{noformat}

But we want to have multi select faceting by file_type, so we exclude the 
filter from faceting:
{noformat}
q=*:*
fq={!tag=ft}file_type:3
rows=0
facet=true
facet.limit=5
facet.field={!ex=ft}file_type
{noformat}

The corresponding URL is 
{{http://localhost:8983/.../select?q=*%3A*&fq=%7B!tag%3Dft%7Dfile_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field={!ex=ft}file_type}}

But the results contain incorrect counts for all available file_type values. 
All counts are greater than they were before we added the filter:
{noformat}
{
  "responseHeader": {
    "status": 0,
    "QTime": 38
  },
  "response": {
    "numFound": 51078,
    "start": 0,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "file_type": [
        "5",
        1073146,
        "3",
        66705,
        "7",
        42202,
        "10",
        16903,
        "12",
        12710
      ]
    },
    "facet_dates": {},
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  }
}
{noformat}

We expect multi select facet counts to be exactly the same as without filters. 
Before we added the filter by {{file_type}}, there were only 51078 documents 
with {{file_type=3}}. But when we add the filter and exclude it from the 
faceting, Solr reports 66705 documents with {{file_type=3}}, while 
{{numFound}} is still 51078.
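
The expectation above can be checked mechanically. The following is a minimal Python sketch, not part of Solr: the helper names {{facet_pairs_to_dict}} and {{facet_mismatches}} are hypothetical, and the input lists are the {{file_type}} arrays copied from the first (unfiltered) and third (tagged and excluded) responses shown above. It only compares the two flat facet arrays and does not contact a Solr instance.

```python
def facet_pairs_to_dict(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def facet_mismatches(baseline_flat, excluded_flat):
    """Return {value: (baseline_count, excluded_count)} where the counts differ.

    baseline_flat: facet counts from the query without any fq.
    excluded_flat: facet counts from the query with fq={!tag=...} and
                   facet.field={!ex=...}; these are expected to be identical.
    """
    baseline = facet_pairs_to_dict(baseline_flat)
    excluded = facet_pairs_to_dict(excluded_flat)
    # Keep the baseline order, then append values seen only in the second list.
    values = list(baseline) + [v for v in excluded if v not in baseline]
    return {
        v: (baseline.get(v), excluded.get(v))
        for v in values
        if baseline.get(v) != excluded.get(v)
    }


# Arrays copied from the responses in this report.
unfiltered = ["5", 1073053, "3", 51078, "7", 41956, "10", 16121, "12", 12585]
tag_excluded = ["5", 1073146, "3", 66705, "7", 42202, "10", 16903, "12", 12710]

mismatches = facet_mismatches(unfiltered, tag_excluded)
for value, (before, after) in mismatches.items():
    print("file_type=%s: %d without filter, %d with excluded filter"
          % (value, before, after))
```

With correct multi select faceting we would expect {{mismatches}} to be empty. With the responses above, every one of the five values differs.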


