Vasiliy Bout created SOLR-8540:
----------------------------------
Summary: Multi select facets give incorrect results
Key: SOLR-8540
URL: https://issues.apache.org/jira/browse/SOLR-8540
Project: Solr
Issue Type: Bug
Affects Versions: 5.3.1, 5.3
Reporter: Vasiliy Bout
We have a single core and use faceting to search documents. When we started
using multi-select faceting, we noticed that the results do not match the
actual data in the core.
For example, we make this simple query:
{noformat}
q=*:*
rows=0
facet=true
facet.limit=5
facet.field=file_type
{noformat}
The corresponding URL is
{{http://localhost:8983/.../select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}
We get the following results:
{noformat}
{
"responseHeader": {
"status": 0,
"QTime": 42
},
"response": {
"numFound": 1240067,
"start": 0,
"docs": []
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"file_type": [
"5",
1073053,
"3",
51078,
"7",
41956,
"10",
16121,
"12",
12585
]
},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {},
"facet_heatmaps": {}
}
}
{noformat}
When we add a filter by {{file_type}}:
{noformat}
q=*:*
fq=file_type:3
rows=0
facet=true
facet.limit=5
facet.field=file_type
{noformat}
The corresponding URL is
{{http://localhost:8983/.../select?q=*%3A*&fq=file_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}
then we get a nonzero count only for the filtered value:
{noformat}
{
"responseHeader": {
"status": 0,
"QTime": 5
},
"response": {
"numFound": 51078,
"start": 0,
"docs": []
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"file_type": [
"3",
51078,
"1",
0,
"4",
0,
"5",
0,
"7",
0
]
},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {},
"facet_heatmaps": {}
}
}
{noformat}
But we want multi-select faceting on {{file_type}}, so we tag the filter and
exclude it from faceting:
{noformat}
q=*:*
fq={!tag=ft}file_type:3
rows=0
facet=true
facet.limit=5
facet.field={!ex=ft}file_type
{noformat}
The corresponding URL is
{{http://localhost:8983/.../select?q=*%3A*&fq=%7B!tag%3Dft%7Dfile_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field={!ex=ft}file_type}}
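For anyone who wants to reproduce the request from a script, here is a minimal sketch that builds the same parameters and lets the standard library do the URL encoding. The helper name {{multi_select_params}} and its arguments are hypothetical; only the parameter names and values come from the query above. The host and core path are left out because they are elided in the URL.

```python
from urllib.parse import urlencode, parse_qs


def multi_select_params(field, value, tag="ft", limit=5):
    """Build Solr parameters that filter on `field` but exclude that
    filter (by tag) when computing the facet counts for the same field.

    Hypothetical helper for illustration; not part of Solr or any client.
    """
    return [
        ("q", "*:*"),
        # Tag the filter query so that it can be referenced later.
        ("fq", f"{{!tag={tag}}}{field}:{value}"),
        ("rows", "0"),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.limit", str(limit)),
        # Exclude the tagged filter when faceting on this field.
        ("facet.field", f"{{!ex={tag}}}{field}"),
    ]


# urlencode percent-encodes the braces and other reserved characters.
query_string = urlencode(multi_select_params("file_type", "3"))
print(query_string)
```

The printed string can be appended after {{select?}} in the URL. Note that this encodes the local params in both {{fq}} and {{facet.field}}.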
However, the results contain incorrect counts for all available {{file_type}}
values. All counts are greater than they were before we added the filter:
{noformat}
{
"responseHeader": {
"status": 0,
"QTime": 38
},
"response": {
"numFound": 51078,
"start": 0,
"docs": []
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"file_type": [
"5",
1073146,
"3",
66705,
"7",
42202,
"10",
16903,
"12",
12710
]
},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {},
"facet_heatmaps": {}
}
}
{noformat}
We expect the multi-select facet counts to be exactly the same as the counts
without any filter. Before we added the filter on {{file_type}}, there were
only 51078 documents with {{file_type=3}}. But when we add the filter and
exclude it from faceting, Solr reports 66705 documents with {{file_type=3}},
while {{numFound}} is still 51078.
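To make the discrepancy explicit, the two responses can be compared value by value. This is only a sketch: the helper {{facet_pairs}} is hypothetical, and the numbers are copied from the unfiltered response and the tagged/excluded response shown above.

```python
def facet_pairs(flat):
    """Convert Solr's flat [value, count, value, count, ...] list to a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


# Counts from the query without any filter.
unfiltered = facet_pairs(
    ["5", 1073053, "3", 51078, "7", 41956, "10", 16121, "12", 12585]
)
# Counts from the query with the tagged filter excluded from faceting.
excluded = facet_pairs(
    ["5", 1073146, "3", 66705, "7", 42202, "10", 16903, "12", 12710]
)

# Difference per facet value; we expect this dict to be empty.
mismatches = {
    value: excluded[value] - unfiltered[value]
    for value in unfiltered
    if value in excluded and excluded[value] != unfiltered[value]
}

for value, delta in mismatches.items():
    print(f"file_type={value}: {unfiltered[value]} -> {excluded[value]} (+{delta})")
```

Every one of the five values differs, and {{file_type=3}} is off by 15627 documents.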