[ https://issues.apache.org/jira/browse/SOLR-12958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mosh updated SOLR-12958:
------------------------
    Affects Version/s: master (8.0)
                       7.5

> Statistical Phrase Identifier should return phrases in single field
> -------------------------------------------------------------------
>
>                 Key: SOLR-12958
>                 URL: https://issues.apache.org/jira/browse/SOLR-12958
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 7.5, master (8.0)
>            Reporter: mosh
>            Priority: Major
>              Labels: phrase, phrasequery
>         Attachments: SOLR-12958.patch
>
>
> It has come to my attention that the phrase identifier introduced in
> SOLR-9418 does not return phrases that are found in only one of the fields
> specified by phrases.fields.
> This has proved troublesome for our use case.
> The offending line seems to be
> {code:java}
> final List<Phrase> validScoringPhrasesSorted = contextData.allPhrases.stream()
>   .filter(p -> 0.0D < p.getTotalScore())
>   .sorted(Comparator.comparing((p -> p.getTotalScore()), Collections.reverseOrder()))
>   .collect(Collectors.toList());{code}
> Since fields where the phrase is not present return -1.0, and fields that
> contain the phrase return a score in the range 0.0 <= score <= 1.0, the
> total score turns out negative, and the phrase gets filtered out.
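
The arithmetic behind the report can be sketched in isolation. This is a minimal illustration, not Solr's code: it assumes the total score is a plain sum of the per-field scores (the report implies this but does not show it), and the class name, method names, and field names are all hypothetical.

```java
import java.util.Map;

public class TotalScoreSketch {
    // Assumption: a phrase's total score is the sum of its per-field scores.
    static double totalScore(Map<String, Double> fieldScores) {
        return fieldScores.values().stream().mapToDouble(Double::doubleValue).sum();
    }

    // Same predicate as the quoted filter: keep only phrases with a positive total.
    static boolean passesTotalFilter(Map<String, Double> fieldScores) {
        return 0.0D < totalScore(fieldScores);
    }

    public static void main(String[] args) {
        // Phrase present in "title" only; "body" reports -1.0 for "not present".
        Map<String, Double> singleField = Map.of("title", 0.5, "body", -1.0);
        System.out.println(totalScore(singleField));        // -0.5
        System.out.println(passesTotalFilter(singleField)); // false
    }
}
```

Under that assumption, a phrase that scores well in one field is still dropped as soon as another field contributes -1.0.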
> I separated the filter into 2 distinct cases:
> # Filter out single word phrases (*phrases.singleWordPhrases* is set to
> false)
> # Include single word phrases (*phrases.singleWordPhrases* is set to true)
> This can be seen in this change to the component's logic:
> {code:java}
> if (!rb.req.getParams().getBool(PHRASE_MATCH_SINGLE_WORD, false)) {
>   // filter out single word phrases, which return a constant score of 0.0
>   phraseStream = contextData.allPhrases.stream()
>     .filter(p -> p.fieldScores.values().stream().anyMatch(fieldScore -> fieldScore > 0.0D));
> } else {
>   // include single word phrases, which return a constant score of 0.0
>   phraseStream = contextData.allPhrases.stream()
>     .filter(p -> p.fieldScores.values().stream().anyMatch(fieldScore -> fieldScore >= 0.0D));
> }{code}
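
The two cases above can be tried outside Solr with a small stand-alone sketch. The `Phrase` class, `keep` method, and sample data below are hypothetical stand-ins, not the component's real types; only the per-field `anyMatch` predicate is taken from the patch description.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PerFieldFilterSketch {
    // Hypothetical minimal phrase: text plus one score per field (-1.0 = absent from that field).
    static final class Phrase {
        final String text;
        final Map<String, Double> fieldScores;

        Phrase(String text, Map<String, Double> fieldScores) {
            this.text = text;
            this.fieldScores = fieldScores;
        }
    }

    // Keep a phrase if at least one field scores above the threshold:
    // strictly positive by default, or >= 0.0 when single word phrases are included.
    static List<String> keep(List<Phrase> phrases, boolean includeSingleWord) {
        return phrases.stream()
            .filter(p -> p.fieldScores.values().stream()
                .anyMatch(s -> includeSingleWord ? s >= 0.0D : s > 0.0D))
            .map(p -> p.text)
            .collect(Collectors.toList());
    }

    static List<Phrase> demoPhrases() {
        return List.of(
            new Phrase("solr cloud", Map.of("title", 0.5, "body", -1.0)),   // found in one field only
            new Phrase("solr", Map.of("title", 0.0, "body", 0.0)),          // single word, constant 0.0
            new Phrase("no match", Map.of("title", -1.0, "body", -1.0)));   // absent everywhere
    }

    public static void main(String[] args) {
        System.out.println(keep(demoPhrases(), false)); // [solr cloud]
        System.out.println(keep(demoPhrases(), true));  // [solr cloud, solr]
    }
}
```

Because the predicate looks at each field separately, a -1.0 in one field no longer cancels a positive score in another.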