Re: Issue with highlighter
Works perfectly for me. Let's see:

- your solrconfig file, particularly the "select" handler.
- the field definition you use for the content field. Be sure to include the associated fieldType.
- the results of debug=on attached to the query.
- What version of Solr?

Best,
Erick

On Sat, Jun 17, 2017 at 7:14 PM, Ali Husain <alihus...@outlook.com> wrote:
> Damien, I tried that too before I sent the email. Nothing :/
>
> http://localhost:8983/solr/testHighlight/select?hl.q=something&hl.fl=*&hl=on&indent=on&q=something&wt=json
>
> This is a bug, right?
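The items requested above can all be fetched over HTTP. The following is a minimal sketch, assuming the local core URL used elsewhere in this thread and Solr's Schema API paths for fields and field types; the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed from the thread: a local Solr with a core named "testHighlight".
SOLR_CORE_URL = "http://localhost:8983/solr/testHighlight"


def debug_select_url(term: str) -> str:
    """Build a /select URL that highlights all fields and returns debug output."""
    params = {
        "q": term,
        "hl": "on",
        "hl.fl": "*",
        "debug": "true",
        "wt": "json",
    }
    return f"{SOLR_CORE_URL}/select?{urlencode(params)}"


def schema_urls(field: str, field_type: str) -> list[str]:
    """Schema API URLs for a field definition and its associated fieldType."""
    return [
        f"{SOLR_CORE_URL}/schema/fields/{field}",
        f"{SOLR_CORE_URL}/schema/fieldtypes/{field_type}",
    ]


if __name__ == "__main__":
    print(debug_select_url("something"))
    for url in schema_urls("content", "text_en"):
        print(url)
```

Opening the printed URLs in a browser (or with curl) should give the query with debug output, the content field definition, and the text_en fieldType in a form that can be pasted into a reply.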
Re: Issue with highlighter
Damien, I tried that too before I sent the email. Nothing :/

http://localhost:8983/solr/testHighlight/select?hl.q=something&hl.fl=*&hl=on&indent=on&q=something&wt=json

This is a bug, right?

From: Damien Kamerman <dami...@gmail.com>
Sent: Friday, June 16, 2017 12:11:57 AM
To: solr-user@lucene.apache.org
Subject: Re: Issue with highlighter

Ali, does adding a 'hl.q' param help? q=something&hl.q=something&...
Re: Issue with highlighter
Ali, does adding a 'hl.q' param help? q=something&hl.q=something&...

On 16 June 2017 at 06:21, Ali Husain <alihus...@outlook.com> wrote:
> Thanks for the replies. Let me try and explain this a little better.
> [...]
Re: Issue with highlighter
Thanks for the replies. Let me try and explain this a little better.

I haven't modified anything in solrconfig. All I did was get a fresh instance of Solr 6.4.1 and create a core testHighlight. I then created a content field of type text_en via the Solr Admin UI. id was already there, and that is of type string.

I then use the UI, once again to check the hl checkbox, hl.fl is set to * because I want any and every match.

I push the following content into this new Solr instance:

id:91101
content:'I am adding something to the core field and we will try and find it. We want to make sure the highlighter works!
This is short so fragsize and max characters shouldn\'t be an issue.'

As you can see, very few characters, fragsize, maxAnalyzedChars, all that should not be an issue.

I then send this query:

http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=something&wt=json

My results:

"response":{"numFound":1,"start":0,"docs":[
    {"id":"91101",
     "content":"I am adding something to the core field and we will try and find it. We want to make sure the highlighter works! This is short so fragsize and max characters shouldn't be an issue.",
     "_version_":1570302668841156608}]
},
"highlighting":{
    "91101":{}}

I change q to be core instead of something.

http://localhost:8983/solr/testHighlight/select?hl.fl=*&hl=on&indent=on&q=core&wt=json

{
    "id":"91101",
    "content":"I am adding something to the core field and we will try and find it. We want to make sure the highlighter works! This is short so fragsize and max characters shouldn't be an issue.",
    "_version_":1570302668841156608},
"highlighting":{
    "91101":{
        "content":["I am adding something to the core field and we will try and find it. We want to make sure"]}}

I've tried a bunch of queries. 'adding', 'something' both don't return any highlights. 'core' 'am' 'field' all work.

Am I doing a better job of explaining this? Quite puzzling why this would be happening. My guess is there is some file/config somewhere that is ignoring some words? It isn't stopwords.txt in my case though. If that isn't the case then it definitely seems like a bug to me.

Thanks,
Ali

From: David Smiley <david.w.smi...@gmail.com>
Sent: Thursday, June 15, 2017 12:33:39 AM
To: solr-user@lucene.apache.org
Subject: Re: Issue with highlighter

[...]
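The term-by-term check described above ('adding' and 'something' fail, 'core', 'am' and 'field' work) can be scripted. This is only a sketch: the select URL is the one from the thread, the function names are hypothetical, and check_term needs a running Solr instance, so only the pure helper is exercised here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed from the thread: local Solr, core "testHighlight".
SOLR_SELECT = "http://localhost:8983/solr/testHighlight/select"


def empty_highlight_ids(response: dict) -> list[str]:
    """Return the ids of matched documents whose highlighting entry is empty."""
    highlighting = response.get("highlighting", {})
    return [doc_id for doc_id, fields in highlighting.items() if not fields]


def check_term(term: str) -> list[str]:
    """Query Solr for one term and list documents that came back unhighlighted.

    Requires a running Solr instance; it is defined here but not called.
    """
    params = urlencode({"q": term, "hl": "on", "hl.fl": "*", "wt": "json"})
    with urlopen(f"{SOLR_SELECT}?{params}") as resp:
        return empty_highlight_ids(json.load(resp))


# The "something" response quoted in this message, reduced to its highlighting part.
something_response = {"highlighting": {"91101": {}}}
print(empty_highlight_ids(something_response))
```

Against a live core, calling check_term for each of the five words would show at a glance which terms match a document without producing any snippet.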
Re: Issue with highlighter
> Beware of NOT plus OR in a search. That will certainly produce no highlights. (eg test -results when default op is OR)

Seems like a bug to me; the default operator shouldn't matter in that case, I think, since there is only one clause that has no BooleanQuery.Occur operator and thus the OR/AND shouldn't matter. The end effect is "test" is effectively required and should definitely be highlighted.

Note to Ali: Phil's comment implies use of hl.method=unified, which is not the default.

On Wed, Jun 14, 2017 at 10:22 PM Phil Scadden <p.scad...@gns.cri.nz> wrote:
> Just had similar issue - works for some, not others.
> [...]

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
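One way to test whether the "NOT plus OR" behaviour depends on the highlighter implementation is to send the same query once per hl.method value. The sketch below assumes the core URL from elsewhere in this thread and highlights the content field; the method names are the hl.method values accepted since Solr 6.4, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed from the thread: local Solr, core "testHighlight".
SOLR_SELECT = "http://localhost:8983/solr/testHighlight/select"

# hl.method values; fastVector additionally needs term vectors on the field.
HL_METHODS = ["original", "fastVector", "unified"]


def urls_per_method(q: str, default_op: str = "OR") -> dict[str, str]:
    """Build one select URL per highlighter implementation for the same query."""
    urls = {}
    for method in HL_METHODS:
        params = {
            "q": q,
            "q.op": default_op,
            "hl": "on",
            "hl.fl": "content",
            "hl.method": method,
            "wt": "json",
        }
        urls[method] = f"{SOLR_SELECT}?{urlencode(params)}"
    return urls


if __name__ == "__main__":
    for method, url in urls_per_method("test -results").items():
        print(method, url)
```

If only one of the URLs returns an empty highlighting section, that narrows the problem to a single highlighter rather than to the query parser.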
RE: Issue with highlighter
Just had similar issue - works for some, not others. First thing to look at is hl.maxAnalyzedChars in the query. The default is quite small.

Since many of my documents are large PDF files, I opted to use storeOffsetsWithPositions="true" termVectors="true" on the field I was searching on. This certainly did increase my index size but not too bad and certainly fast.

https://cwiki.apache.org/confluence/display/solr/Highlighting

Beware of NOT plus OR in a search. That will certainly produce no highlights. (eg test -results when default op is OR)

-----Original Message-----
From: Ali Husain [mailto:alihus...@outlook.com]
Sent: Thursday, 15 June 2017 11:11 a.m.
To: solr-user@lucene.apache.org
Subject: Issue with highlighter

Hi,

I think I've found a bug with the highlighter. I search for the word "something" and I get an empty highlighting response for all the documents that are returned shown below.
[...]

Notice: This email and any attachments are confidential and may not be used, published or redistributed without the prior written consent of the Institute of Geological and Nuclear Sciences Limited (GNS Science). If received in error please destroy and immediately notify GNS Science. Do not copy or disclose the contents.
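The two settings mentioned above can be expressed as a Schema API command and a request parameter. This is a rough sketch rather than Phil's actual configuration: the field name and type are taken from Ali's test core, the 1000000 limit is an arbitrary illustrative value, and the function names are hypothetical. The command would be POSTed to the core's /schema endpoint, and existing documents need reindexing before the new field options take effect.

```python
import json


def replace_field_command(name: str, field_type: str) -> str:
    """JSON body for a Schema API replace-field command enabling offsets and term vectors."""
    payload = {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            "termVectors": True,
            "storeOffsetsWithPositions": True,
        }
    }
    return json.dumps(payload, indent=2)


def highlight_params(q: str, max_analyzed_chars: int) -> dict[str, str]:
    """Request parameters that raise hl.maxAnalyzedChars for large documents."""
    return {
        "q": q,
        "hl": "on",
        "hl.fl": "content",
        "hl.maxAnalyzedChars": str(max_analyzed_chars),
        "wt": "json",
    }


if __name__ == "__main__":
    print(replace_field_command("content", "text_en"))
    print(highlight_params("something", 1000000))
```

For a document as short as Ali's test case the character limit should not come into play, so this mainly matters for large inputs such as the PDF files described above.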
Re: Issue with highlighter
If the default operator is OR, then you're just matching on the "like" word and it's being properly highlighted. If you're saying that doc 286 (or whatever) has both "something" and "like" in the content and you expect to find them both, try increasing the number of snippets returned.

Otherwise we need to see the _complete_ query and response, preferably with debug=true. Plus your schema, plus a sample document and exactly what you think should be happening that isn't.

Best,
Erick

On Wed, Jun 14, 2017 at 4:11 PM, Ali Husain wrote:
> Hi,
>
> I think I've found a bug with the highlighter. I search for the word "something" and I get an empty highlighting response for all the documents that are returned shown below.
> [...]
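To see which query terms actually show up in the returned snippets, the highlighting section can be scanned term by term. The sketch below uses an excerpt of the "something like" response from the original message; the function name is hypothetical and the matching is plain word comparison with no stemming, so it only approximates what the analyzer does. The parameter dict at the end shows the hl.snippets and q.op request parameters with illustrative values.

```python
import re


def highlighted_terms(highlighting: dict, terms: list[str]) -> dict[str, list[str]]:
    """Map each document id to the query terms that occur in its snippets."""
    found = {}
    for doc_id, fields in highlighting.items():
        text = " ".join(s for snippets in fields.values() for s in snippets)
        words = set(re.findall(r"\w+", text.lower()))
        found[doc_id] = [t for t in terms if t.lower() in words]
    return found


# Excerpt of the "something like" highlighting section quoted in the thread.
sample = {
    "310": {},
    "309": {"content": ["1949 Convention, like those"]},
    "286": {"content": ["persons in these classes are treated like combatants, but in other respects"]},
    "336": {"content": [" be treated like engagement"]},
}

result = highlighted_terms(sample, ["something", "like"])
for doc_id, terms in result.items():
    print(doc_id, terms)

# Illustrative values: ask for more snippets per field and require both terms.
more_snippets_params = {"q": "something like", "q.op": "AND", "hl": "on", "hl.snippets": "3"}
```

In this excerpt only "like" appears in any snippet, which is consistent with documents matching on "like" alone under OR.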
Issue with highlighter
Hi,

I think I've found a bug with the highlighter. I search for the word "something" and I get an empty highlighting response for all the documents that are returned, shown below. The fields that I am searching over are text_en, and the highlighter works for a lot of queries. I have no stopwords.txt list that could be messing this up either.

"highlighting":{
    "310":{},
    "103":{},
    "406":{},
    "1189":{},
    "54":{},
    "292":{},
    "309":{}}}

Just changing the search term to "something like" I get back this:

"highlighting":{
    "310":{},
    "309":{
        "content":["1949 Convention, like those"]},
    "103":{},
    "406":{},
    "1189":{},
    "54":{},
    "292":{},
    "286":{
        "content":["persons in these classes are treated like combatants, but in other respects"]},
    "336":{
        "content":[" be treated like engagement"]}}}

So I know that I have it set up correctly, but I can't figure this out. I've searched through JIRA/Google and haven't been able to find a similar issue.

Any ideas?

Thanks,
Ali