[
https://issues.apache.org/jira/browse/SOLR-8540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098036#comment-15098036
]
Vasiliy Bout commented on SOLR-8540:
------------------------------------
Hi, we noticed that these erroneous values (which are greater than the real
count) are exactly the same as the values shown in the Solr Admin UI on the
"Schema Browser" tab. If you select your core in the dropdown, go to the
"Schema Browser" tab, select the document field ({{file_type}} in our case),
and press the "Load Term Info" button, you will see exactly the same values
that are returned by multi select facets.
And you are right that optimizing the core updates these counts to be correct.
But using docValues does not help. We use EmbeddedSolrCore to run some JUnit
tests, and one of the tests fails because of this problem; changing the
schema to use docValues does not help. Is there any way to programmatically
update these counts, maybe by dropping the caches? Or is there some trick that
is necessary to make docValues fix this problem?
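
Since optimizing the core is reported above to correct the counts, one
possible workaround for a test setup is to trigger the optimize
programmatically before asserting on facet counts. The following is only a
minimal sketch, assuming the core is reachable over HTTP. The core URL and
the helper names are hypothetical (the path is elided in this report), while
{{optimize}} and {{waitSearcher}} are standard update handler request
parameters. It has not been verified against the versions affected here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(core_url):
    """Build the update-handler URL that asks Solr to optimize a core.

    core_url is a hypothetical base such as
    "http://localhost:8983/solr/mycore"; adjust it to the real core.
    """
    query = urlencode({"optimize": "true", "waitSearcher": "true"})
    return core_url.rstrip("/") + "/update?" + query


def optimize(core_url, timeout=60):
    """Send the optimize request and return the HTTP status code."""
    with urlopen(optimize_url(core_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Only prints the URL; call optimize(...) against a running Solr to send it.
    print(optimize_url("http://localhost:8983/solr/mycore"))
```

With an embedded server in a JUnit test, the equivalent step would be the
SolrJ client's optimize call rather than an HTTP request.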
> Multi select facets give incorrect results
> ------------------------------------------
>
> Key: SOLR-8540
> URL: https://issues.apache.org/jira/browse/SOLR-8540
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.3, 5.3.1
> Reporter: Vasiliy Bout
>
> We have a single core and use faceting to search documents. When we started
> to use multi select faceting, we noticed that the results do not match the
> real data in the core.
> For example, we make this simple query:
> {noformat}
> q=*:*
> rows=0
> facet=true
> facet.limit=5
> facet.field=file_type
> {noformat}
> Corresponding URL is
> {{http://localhost:8983/.../select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}
> We get the following results:
> {noformat}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 42
>   },
>   "response": {
>     "numFound": 1240067,
>     "start": 0,
>     "docs": []
>   },
>   "facet_counts": {
>     "facet_queries": {},
>     "facet_fields": {
>       "file_type": [
>         "5",
>         1073053,
>         "3",
>         51078,
>         "7",
>         41956,
>         "10",
>         16121,
>         "12",
>         12585
>       ]
>     },
>     "facet_dates": {},
>     "facet_ranges": {},
>     "facet_intervals": {},
>     "facet_heatmaps": {}
>   }
> }
> {noformat}
> When we add a filter by {{file_type}}:
> {noformat}
> q=*:*
> fq=file_type:3
> rows=0
> facet=true
> facet.limit=5
> facet.field=file_type
> {noformat}
> Corresponding URL is
> {{http://localhost:8983/.../select?q=*%3A*&fq=file_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field=file_type}}
> Then we get a nonzero count only for the filtered value:
> {noformat}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 5
>   },
>   "response": {
>     "numFound": 51078,
>     "start": 0,
>     "docs": []
>   },
>   "facet_counts": {
>     "facet_queries": {},
>     "facet_fields": {
>       "file_type": [
>         "3",
>         51078,
>         "1",
>         0,
>         "4",
>         0,
>         "5",
>         0,
>         "7",
>         0
>       ]
>     },
>     "facet_dates": {},
>     "facet_ranges": {},
>     "facet_intervals": {},
>     "facet_heatmaps": {}
>   }
> }
> {noformat}
> But we want to have multi select faceting by file_type, so we exclude the
> filter from faceting:
> {noformat}
> q=*:*
> fq={!tag=ft}file_type:3
> rows=0
> facet=true
> facet.limit=5
> facet.field={!ex=ft}file_type
> {noformat}
> Corresponding URL is
> {{http://localhost:8983/.../select?q=*%3A*&fq=%7B!tag%3Dft%7Dfile_type%3A3&rows=0&wt=json&indent=true&facet=true&facet.limit=5&facet.field={!ex=ft}file_type}}
> But the results contain incorrect counts for all available file_type values.
> All counts are greater than they were before we added the filter:
> {noformat}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 38
>   },
>   "response": {
>     "numFound": 51078,
>     "start": 0,
>     "docs": []
>   },
>   "facet_counts": {
>     "facet_queries": {},
>     "facet_fields": {
>       "file_type": [
>         "5",
>         1073146,
>         "3",
>         66705,
>         "7",
>         42202,
>         "10",
>         16903,
>         "12",
>         12710
>       ]
>     },
>     "facet_dates": {},
>     "facet_ranges": {},
>     "facet_intervals": {},
>     "facet_heatmaps": {}
>   }
> }
> {noformat}
> We expect the multi select facet counts to be exactly the same as without
> filters. Before we added the filter by {{file_type}}, we saw that there are
> only 51078 documents with {{file_type=3}}. But when we add the filter and
> exclude it from the faceting, Solr reports 66705 documents with
> {{file_type=3}} while {{numFound}} is still 51078.
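
The mismatch described above can be checked mechanically, which may be
useful as a regression test. The sketch below builds the tagged filter and
excluded facet parameters from the report and compares the facet count of
the selected value with {{numFound}}. The helper names are illustrative
assumptions, and the response literal is copied from the report rather than
fetched from a live Solr instance.

```python
from urllib.parse import urlencode


def multi_select_params(field, value, tag="ft", limit=5):
    """Request parameters for a tagged filter excluded from its own facet."""
    return [
        ("q", "*:*"),
        ("fq", "{!tag=%s}%s:%s" % (tag, field, value)),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.limit", str(limit)),
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]


def facet_pairs(flat):
    """Convert Solr's flat [term, count, term, count, ...] list to a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def selected_count_vs_num_found(response, field, value):
    """Return (facet count of the selected value, numFound)."""
    counts = facet_pairs(response["facet_counts"]["facet_fields"][field])
    return counts[value], response["response"]["numFound"]


# Response values copied from the third result in the report.
reported = {
    "response": {"numFound": 51078, "start": 0, "docs": []},
    "facet_counts": {
        "facet_fields": {
            "file_type": ["5", 1073146, "3", 66705, "7", 42202,
                          "10", 16903, "12", 12710]
        }
    },
}

if __name__ == "__main__":
    # urlencode percent-encodes the local params, including {!ex=ft}.
    print(urlencode(multi_select_params("file_type", "3")))
    facet_count, num_found = selected_count_vs_num_found(
        reported, "file_type", "3")
    if facet_count != num_found:
        print("file_type=3: facet count %d, numFound %d, difference %d"
              % (facet_count, num_found, facet_count - num_found))
```

With a single filter on {{file_type}}, the excluded facet count for the
selected value would be expected to equal {{numFound}}; in the reported
response it is larger by 15627.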