[ 
https://issues.apache.org/jira/browse/LUCENE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Schoenmakers updated LUCENE-5697:
----------------------------------------

    Description: 
In DocFetcher, which uses Lucene v3.5.0, we stumbled on a bug. The lead of 
DocFetcher has investigated and found that the problem seems to be in Lucene. 
I do not know whether this bug has been fixed in a later Lucene version.

Issue: 
We use "proximity search": a search for multiple words in a directory with 
about 300 PDF files.
E.g. a search for "wordA wordB wordC"~50, i.e. three words within a distance 
of 50 words of each other. The documents returned are correct, but the 
highlighted text in the document is often missing. 

If the words are in the SAME order as in the search AND on the SAME page, then 
the highlighting works correctly. But if the order of the words differs from 
the search (like "wordA wordC wordB") OR the words are not on the same page, 
then that text is not highlighted. 

As we often use the proximity search on multiple words, this severely degrades 
usability.
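
To make the expected behaviour concrete, here is a minimal sketch in plain Java of an order-independent proximity check. It is not Lucene code and does not reproduce Lucene's slop calculation. The class name ProximitySketch, the method smallestWindow, and the simple "all terms within N token positions" rule are assumptions made for illustration only. It shows the span that one would expect to be highlighted when the words appear in a different order than in the query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ProximitySketch {

    /**
     * Hypothetical helper, not part of Lucene.
     * Returns {start, end} token positions of the smallest window that
     * contains every query term at least once, in ANY order, or null if
     * no such window spans at most maxDistance positions.
     * Terms are expected in lower case; tokens are lower-cased here.
     */
    public static int[] smallestWindow(String[] tokens, Set<String> terms, int maxDistance) {
        Map<String, Integer> lastSeen = new HashMap<>();
        int[] best = null;
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i].toLowerCase();
            if (!terms.contains(token)) {
                continue;
            }
            lastSeen.put(token, i);
            if (lastSeen.size() < terms.size()) {
                continue; // not all terms seen yet
            }
            // Smallest window ending at i starts at the oldest "last seen" position.
            int start = Collections.min(lastSeen.values());
            int width = i - start;
            if (width <= maxDistance && (best == null || width < best[1] - best[0])) {
                best = new int[] {start, i};
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Words appear in the order A, C, B, unlike the query order A, B, C.
        String[] tokens = "x wordA y wordC z wordB w".split(" ");
        Set<String> terms = new HashSet<>(Arrays.asList("worda", "wordb", "wordc"));
        int[] window = smallestWindow(tokens, terms, 50);
        System.out.println(window == null ? "no match" : window[0] + ".." + window[1]);
    }
}
```

In this simplified model the order of the words does not matter, which matches what we expect from the highlighter: a span that made the document match should also be highlighted.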


> Preview issue
> -------------
>
>                 Key: LUCENE-5697
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5697
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>         Environment: DocFetcher 1.1.11 on Win 7(64) pro
>            Reporter: Martin Schoenmakers



