[ 
https://issues.apache.org/jira/browse/LUCENE-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe reopened LUCENE-7284:
--------------------------------

Reopening to backport to 5.6 and 5.5.2.

> UnsupportedOperationException wrt SpanNearQuery with Gap (Needed for Synonym 
> Query Expansion)
> ---------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7284
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7284
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>            Reporter: Daniel Bigham
>            Assignee: Alan Woodward
>            Priority: Minor
>             Fix For: 6.1, 6.0.1
>
>         Attachments: LUCENE-7284.patch
>
>
> I am trying to support synonyms on the query side by doing 
> query expansion.
> For example, the query "open webpage" can be expanded if the following 
> things are synonyms:
> "open" | "go to"
> With both the stop word filter and the stemming filter applied, this 
> becomes the following:
> {code}
> spanNear(
>          [
>                  spanOr([Title:open, Title:go]),
>                  Title:webpag
>          ],
>          0,
>          true
> )
> {code}
> Notice that "go to" became just "go", because apparently "to" is removed 
> by the stop word filter.
> Interestingly, if you turn "go to webpage" into a phrase, you get "go ? 
> webpage", but if you turn "go to" into a phrase, you just get "go", 
> because apparently a trailing stop word in a PhraseQuery gets dropped. 
> (PhraseQuery currently has no way to represent that gap: it encodes gaps 
> implicitly through the positions of the phrase tokens, so with no second 
> token there is nothing to indicate that a gap is there.)
> The above query then fails to match "go to webpage", because "go to 
> webpage" in the index tokenizes as "go _ webpage", while the query, 
> having lost its gap, tries to match only "go webpage".
> To try and work around that, I represent "go to" not as a phrase, but as 
> a SpanNearQuery, like this:
> {code}
> spanNear(
>          [
>                  spanOr(
>                          [
>                                  Title:open,
>                                  spanNear([Title:go, SpanGap(:1)], 0, true)
>                          ]
>                  ),
>                  Title:webpag
>          ],
>          0,
>          true
> )
> {code}
> However, when I run that query, I get the following:
> {code}
> A Java exception occurred: java.lang.UnsupportedOperationException
>      at 
> org.apache.lucene.search.spans.SpanNearQuery$GapSpans.positionsCost(SpanNearQuery.java:398)
>      at 
> org.apache.lucene.search.spans.ConjunctionSpans.asTwoPhaseIterator(ConjunctionSpans.java:96)
>      at 
> org.apache.lucene.search.spans.NearSpansOrdered.asTwoPhaseIterator(NearSpansOrdered.java:45)
>      at 
> org.apache.lucene.search.spans.ScoringWrapperSpans.asTwoPhaseIterator(ScoringWrapperSpans.java:88)
>      at 
> org.apache.lucene.search.ConjunctionDISI.addSpans(ConjunctionDISI.java:104)
>      at 
> org.apache.lucene.search.ConjunctionDISI.intersectSpans(ConjunctionDISI.java:82)
>      at 
> org.apache.lucene.search.spans.ConjunctionSpans.<init>(ConjunctionSpans.java:41)
>      at 
> org.apache.lucene.search.spans.NearSpansOrdered.<init>(NearSpansOrdered.java:54)
>      at 
> org.apache.lucene.search.spans.SpanNearQuery$SpanNearWeight.getSpans(SpanNearQuery.java:232)
>      at 
> org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:134)
>      at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:38)
>      at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
> {code}
> ... and when I look up that GapSpans class in SpanNearQuery.java, I see:
> {code}
> @Override
> public float positionsCost() {
>    throw new UnsupportedOperationException();
> }
> {code}
> I asked this question on the mailing list on May 14 and was directed to 
> submit a bug here.
> This issue is of relatively high priority for us, since this is the 
> most promising technique we have for supporting synonyms on top of Lucene 
> (the SynonymFilter has serious issues with multi-word synonyms).
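
The position gap described in the report can be illustrated without Lucene. The following is a toy model in plain Java, not Lucene's implementation: the class `PositionGapSketch`, its stop-word set, and its methods are hypothetical names introduced here, and stemming is skipped for brevity. A `null` entry stands for a position left empty by a removed stop word (index side) or for an explicit one-position gap (query side).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of position-preserving stop-word removal and ordered, slop-0 matching.
// All names here are hypothetical; this does not use or mirror the Lucene API.
final class PositionGapSketch {

    // Assumed stop-word list, chosen only for this illustration.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("to", "the", "a"));

    // Splits on whitespace and replaces stop words with null, so the
    // remaining terms keep their original positions ("go _ webpage").
    static String[] analyze(String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (STOP_WORDS.contains(tokens[i])) {
                tokens[i] = null;
            }
        }
        return tokens;
    }

    // Returns true if the pattern occurs at consecutive positions in doc.
    // A null pattern entry is a gap: it matches whatever sits at that position.
    static boolean matchesOrdered(String[] doc, String[] pattern) {
        for (int start = 0; start + pattern.length <= doc.length; start++) {
            boolean ok = true;
            for (int i = 0; i < pattern.length && ok; i++) {
                ok = pattern[i] == null || pattern[i].equals(doc[start + i]);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = analyze("go to webpage"); // positions: go, (empty), webpage

        // Query side lost the trailing stop word: "go" directly followed by "webpage".
        System.out.println(matchesOrdered(doc, new String[] {"go", "webpage"}));       // false

        // Query side keeps a one-position gap after "go".
        System.out.println(matchesOrdered(doc, new String[] {"go", null, "webpage"})); // true
    }
}
```

In this model, dropping the trailing stop word from "go to" places "webpage" one position too early on the query side, which is why the first pattern fails; restoring a one-position gap lets the pattern line up with the indexed positions again.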
