[
https://issues.apache.org/jira/browse/SOLR-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
eric casteleijn updated SOLR-5759:
----------------------------------
Description:
When using the highlighter, and increasing the fragsize from 100 (the default)
to 200, sometimes the search term is no longer entirely contained by the
returned fragment, even though it was in the smaller snippet.
For instance:
http://host/solr/index/select?q=("Tony+Yet"+AND+exact_text:"Tony+Yet")&wt=json&indent=true&hl=true&hl.fl=title,summary,extracted_text&hl.simple.pre=<em>&hl.simple.post=</em>&hl.fragsize=100
results in the fragment:
"7618861":{
"extracted_text":[" enterprise forward.\n\n<em>Tony</em> <em>Yet</em>,
one of the centre's organisers, explains: \"I think what Hong Kong needs"]},
whereas:
http://host/solr/index/select?q=("Tony+Yet"+AND+exact_text:"Tony+Yet")&wt=json&indent=true&hl=true&hl.fl=title,summary,extracted_text&hl.simple.pre=<em>&hl.simple.post=</em>&hl.fragsize=200
results in:
"7618861":{
"extracted_text":[" interested in social issues, as well as mentorship
for upcoming enterprises.\n\nAs in the UK, it is also creating the community of
people, skills and ideas that is needed to push social enterprise
forward.\n\n<em>Tony</em>"]},
Both reference roughly the same position from the same field, but I can't for
the life of me imagine why the larger fragment would shift to the left so far
as to drop half of the search term.
If desirable, I can upload the entire json results for both requests.
Let me know if there is any other information I can supply, or checks I can
perform.
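
To make the comparison reproducible, here is a minimal Python sketch that builds the two request URLs above and checks whether a returned fragment still highlights every word of the phrase. The helper names (`build_url`, `highlighted_terms`, `contains_phrase`) are hypothetical and not part of Solr; `http://host/solr/index/select` is the placeholder host from the report, and no request is actually sent.

```python
import re
from urllib.parse import urlencode

# Placeholder endpoint taken from the report; replace with a real host.
BASE_URL = "http://host/solr/index/select"


def build_url(fragsize):
    """Build the select URL from the report for a given hl.fragsize."""
    params = {
        "q": '("Tony Yet" AND exact_text:"Tony Yet")',
        "wt": "json",
        "indent": "true",
        "hl": "true",
        "hl.fl": "title,summary,extracted_text",
        "hl.simple.pre": "<em>",
        "hl.simple.post": "</em>",
        "hl.fragsize": fragsize,
    }
    return BASE_URL + "?" + urlencode(params)


def highlighted_terms(fragment, pre="<em>", post="</em>"):
    """Return the text of every highlighted span in a fragment."""
    pattern = re.escape(pre) + r"(.*?)" + re.escape(post)
    return re.findall(pattern, fragment)


def contains_phrase(fragment, phrase):
    """True if every word of the phrase appears as a highlighted span."""
    found = highlighted_terms(fragment)
    return all(word in found for word in phrase.split())


# Fragments copied from the two responses in this report.
FRAGMENT_100 = (
    " enterprise forward.\n\n<em>Tony</em> <em>Yet</em>, "
    "one of the centre's organisers, explains: \"I think what Hong Kong needs"
)
FRAGMENT_200 = (
    " interested in social issues, as well as mentorship "
    "for upcoming enterprises.\n\nAs in the UK, it is also creating the "
    "community of people, skills and ideas that is needed to push social "
    "enterprise forward.\n\n<em>Tony</em>"
)

if __name__ == "__main__":
    for size, fragment in ((100, FRAGMENT_100), (200, FRAGMENT_200)):
        print(build_url(size))
        print("whole phrase highlighted:", contains_phrase(fragment, "Tony Yet"))
```

This only checks the symptom on the returned fragments; it does not say anything about why the highlighter cuts the larger fragment where it does.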
> increasing hl.fragsize loses part of the search term
> ----------------------------------------------------
>
> Key: SOLR-5759
> URL: https://issues.apache.org/jira/browse/SOLR-5759
> Project: Solr
> Issue Type: Bug
> Components: highlighter
> Affects Versions: 4.4
> Environment: Ubuntu 12.04
> Reporter: eric casteleijn
>