[jira] [Commented] (LUCENE-6796) Some terms incorrectly highlighted in complex SpanQuery

2015-11-26 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1502#comment-1502
 ] 

Erick Erickson commented on LUCENE-6796:


Are you willing to help put together a patch? Perhaps David has some pointers 
on how to go about it.

> Some terms incorrectly highlighted in complex SpanQuery
> ---
>
> Key: LUCENE-6796
> URL: https://issues.apache.org/jira/browse/LUCENE-6796
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/highlighter
> Affects Versions: 5.3
> Reporter: Tim Allison
> Assignee: David Smiley
> Priority: Trivial
> Attachments: LUCENE-6796-testcase.patch
>
>
> [~modassar] initially raised this on LUCENE-5205.  I'm opening this as a 
> separate issue.
> If a SpanNear is within a SpanOr, the child terms within the SpanNear query 
> appear to be highlighted, in some special cases, even if there is no match on 
> that SpanNear query.  Specifically, this happens with the query 
> {{"(b [c z]) d\"~2"}} (in the format of the parser in LUCENE-5205), which is 
> equivalent to: find "b" or the phrase "c z" within two words of "d", in 
> either direction.
> This affects trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6796) Some terms incorrectly highlighted in complex SpanQuery

2015-11-26 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029251#comment-15029251
 ] 

David Smiley commented on LUCENE-6796:
--

Sorry; I said "stay tuned" too soon.  If you want to push things forward, I 
suggest taking a look at what Luwak does:
https://github.com/flaxsearch/luwak/blob/master/luwak/src/main/java/uk/co/flax/luwak/matchers/HighlightingMatcher.java




[jira] [Commented] (LUCENE-6796) Some terms incorrectly highlighted in complex SpanQuery

2015-11-25 Thread Modassar Ather (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028171#comment-15028171
 ] 

Modassar Ather commented on LUCENE-6796:


Hi David,

Please let me know your plan for fixing this issue.

Thanks,
Modassar




[jira] [Commented] (LUCENE-6796) Some terms incorrectly highlighted in complex SpanQuery

2015-09-10 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739894#comment-14739894
 ] 

Tim Allison commented on LUCENE-6796:
-

Great. There's no rush on this from my perspective. Thank you.




[jira] [Commented] (LUCENE-6796) Some terms incorrectly highlighted in complex SpanQuery

2015-09-10 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739479#comment-14739479
 ] 

David Smiley commented on LUCENE-6796:
--

This is a fundamental limitation in how WeightedSpanTermExtractor (WSTE) maps 
positions of SpanQueries to terms.  It takes any SpanQuery tree and considers 
_all_ of its terms to be valid anywhere within an entire position span match.  
The API doesn't expose which underlying SpanTermQuery instances were found in 
the position range.  There's even more to it than that, since a term might be 
active for a given span position range but not necessarily at every position.  
For example, imagine a SpanQuery representing this, roughly: {{"foo bar" NEAR20 
"foo baz"}}.  WSTE would highlight _all_ occurrences of foo, bar, and baz 
within a matching span of the requisite length, _even those standing alone and 
not next to each other as in the phrases shown_.
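
To make the difference concrete, here is a rough, self-contained sketch of the 
two behaviours for the {{"foo bar" NEAR20 "foo baz"}} example.  It is a toy 
model only: plain token arrays stand in for the index, the matching span is 
assumed to cover the whole sample text, and the class and method names 
(SpanHighlightSketch, coarse, precise, phraseStarts) are made up for 
illustration.  None of this is Lucene or WSTE code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, no Lucene API involved.
class SpanHighlightSketch {

    // Coarse strategy: inside the span [start, end), mark every token that is
    // one of the query terms, whether or not it took part in a phrase match.
    static List<Integer> coarse(String[] tokens, int start, int end, Set<String> terms) {
        List<Integer> hits = new ArrayList<>();
        for (int i = start; i < end; i++) {
            if (terms.contains(tokens[i])) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Start positions at which the phrase occurs as consecutive tokens.
    static List<Integer> phraseStarts(String[] tokens, String[] phrase) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + phrase.length <= tokens.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(tokens, i, i + phrase.length), phrase)) {
                starts.add(i);
            }
        }
        return starts;
    }

    // Precise strategy: mark only positions that belong to an actual phrase
    // occurrence lying entirely inside the span [start, end).
    static List<Integer> precise(String[] tokens, int start, int end, String[][] phrases) {
        TreeSet<Integer> marked = new TreeSet<>();
        for (String[] phrase : phrases) {
            for (int s : phraseStarts(tokens, phrase)) {
                if (s >= start && s + phrase.length <= end) {
                    for (int i = s; i < s + phrase.length; i++) {
                        marked.add(i);
                    }
                }
            }
        }
        return new ArrayList<>(marked);
    }

    public static void main(String[] args) {
        // Positions: 0 foo, 1 bar, 2 x, 3 foo, 4 y, 5 baz, 6 z, 7 foo, 8 baz
        String[] tokens = "foo bar x foo y baz z foo baz".split(" ");
        String[][] phrases = { {"foo", "bar"}, {"foo", "baz"} };
        Set<String> terms = new HashSet<>(Arrays.asList("foo", "bar", "baz"));
        int start = 0;
        int end = tokens.length; // assumed span: the whole sample text
        System.out.println("coarse:  " + coarse(tokens, start, end, terms));
        System.out.println("precise: " + precise(tokens, start, end, phrases));
    }
}
```

In this sample the coarse strategy also marks the stand-alone "foo" at 
position 3 and "baz" at position 5, which is the kind of over-highlighting 
described above; the precise strategy marks only positions 0, 1, 7 and 8.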

I was thinking of this problem a year ago.  I believe Lucene trunk may finally 
have the API needed with the new SpanCollector API thanks to [~romseygeek] -- 
we've conversed on the implications of this on highlighting.  I anticipate 
leveraging this somewhat soon (month or two?); stay tuned.
