Peter Davie created SOLR-13838:
----------------------------------

             Summary: igain query parser generating invalid output
                 Key: SOLR-13838
                 URL: https://issues.apache.org/jira/browse/SOLR-13838
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: query parsers
    Affects Versions: 8.2
         Environment: The issue is a generic Java defect and therefore will be independent of the operating system or software platform.
            Reporter: Peter Davie
             Fix For: 8.3
         Attachments: IGainTermsQParserPlugin.java.patch

The output of the "features()" stream source contains terms with NaN in the score_f field:

    "docs": [
      {
        "featureSet_s": "business",
        "score_f": "NaN",
        "term_s": "1,011.15",
        "idf_d": "-Infinity",
        "index_i": 1,
        "id": "business_1"
      },
      {
        "featureSet_s": "business",
        "score_f": "NaN",
        "term_s": "10.3m",
        "idf_d": "-Infinity",
        "index_i": 2,
        "id": "business_2"
      },
      {
        "featureSet_s": "business",
        "score_f": "NaN",
        "term_s": "01",
        "idf_d": "-Infinity",
        "index_i": 3,
        "id": "business_3"
      },...

Looking into {{org/apache/solr/search/IGainTermsQParserPlugin.java}}, it seems that when a term does not occur in any of the positive or negative documents, the document frequency (docFreq = xc + nc) is 0, so the subsequent calculations divide by zero and produce NaN.

Attached is a patch which skips terms for which docFreq is 0 in the finish() method of IGainTermsQParserPlugin. This resolves the NaN scores in the features() output.
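
The failure mode and the guard described above can be illustrated with a minimal, self-contained sketch. This is not the attached patch and not the plugin's actual code: the class IGainSketch, its method names, and the sample counts are hypothetical, and only the names xc, nc, and docFreq come from the report. The sketch assumes the score is a conventional information-gain value built from binary entropies, which may differ in detail from what IGainTermsQParserPlugin computes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the code of IGainTermsQParserPlugin.
class IGainSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Binary entropy in bits. A NaN input falls through both range checks
    // and yields NaN, which is how the invalid score propagates.
    static double binaryEntropy(double p) {
        if (p <= 0.0 || p >= 1.0) {
            return 0.0;
        }
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    // xc: positive documents containing the term
    // nc: negative documents containing the term
    static double informationGain(int xc, int nc, int numPositive, int numDocs) {
        int docFreq = xc + nc;
        double classEntropy = binaryEntropy((double) numPositive / numDocs);
        double pTerm = (double) docFreq / numDocs;
        // When docFreq == 0 this is the floating-point division 0.0 / 0, i.e. NaN.
        double entropyWithTerm = binaryEntropy((double) xc / docFreq);
        int docsWithoutTerm = numDocs - docFreq;
        double entropyWithoutTerm = docsWithoutTerm == 0
                ? 0.0
                : binaryEntropy((double) (numPositive - xc) / docsWithoutTerm);
        return classEntropy - (pTerm * entropyWithTerm + (1 - pTerm) * entropyWithoutTerm);
    }

    // Scores every term, skipping those with docFreq == 0 (the guard the report proposes).
    static Map<String, Double> scoreTerms(Map<String, int[]> counts, int numPositive, int numDocs) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> entry : counts.entrySet()) {
            int xc = entry.getValue()[0];
            int nc = entry.getValue()[1];
            if (xc + nc == 0) {
                continue; // term occurs in no positive or negative document
            }
            scores.put(entry.getKey(), informationGain(xc, nc, numPositive, numDocs));
        }
        return scores;
    }

    public static void main(String[] args) {
        // Unguarded call with docFreq == 0 produces NaN.
        System.out.println(informationGain(0, 0, 5, 10));

        Map<String, int[]> counts = new LinkedHashMap<>();
        counts.put("01", new int[] {0, 0});     // made-up counts: never seen
        counts.put("profit", new int[] {4, 1}); // made-up counts
        // Only "profit" survives the guard.
        System.out.println(scoreTerms(counts, 5, 10).keySet());
    }
}
```

Skipping the term, rather than assigning it a score of 0, keeps it out of the ranked feature list entirely, which matches the behaviour the report describes for the patch.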