question about query parsing
Hi list,

while searching with debug on I see strange query parsing:

  <str name="rawquerystring">identifier:ub.uni-bielefeld.de</str>
  <str name="querystring">identifier:ub.uni-bielefeld.de</str>
  <str name="parsedquery">+MultiPhraseQuery(identifier:"(ub.uni-bielefeld.de ub) uni bielefeld de")</str>
  <str name="parsedquery_toString">+identifier:"(ub.uni-bielefeld.de ub) uni bielefeld de"</str>

It is a PhraseQuery, but
- why is the string split apart?
- why is it grouped this way?

Default is edismax.

FIELD:

  <field name="identifier" type="text_url" indexed="true" stored="false" multiValued="true"/>

FIELDTYPE:

  <fieldType name="text_url" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Regards
Bernd
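The shape of the parsed query above can be reproduced with a rough sketch of the query-time analysis chain. This is plain Python, not Solr code, and it rests on assumptions: that the keyword tokenizer hands the whole input over as one token, that the word-delimiter filter splits it at non-alphanumeric characters, and that preserveOriginal=1 keeps the unsplit token at the same position as the first sub-word. The function names are made up for illustration.

```python
import re


def word_delimiter_tokens(text, preserve_original=True):
    """Rough stand-in for a word-delimiter filter followed by lowercasing.

    Returns (position, token) pairs. It only mimics the position
    bookkeeping for simple inputs split on non-alphanumeric characters.
    """
    parts = [p.lower() for p in re.split(r"[^A-Za-z0-9]+", text) if p]
    tokens = []
    if preserve_original and len(parts) > 1:
        # Assumption: the original token shares position 0 with the first part.
        tokens.append((0, text.lower()))
    for pos, part in enumerate(parts):
        tokens.append((pos, part))
    return tokens


def phrase_shape(tokens):
    """Group tokens by position; several tokens at one position become (a b)."""
    by_pos = {}
    for pos, tok in tokens:
        by_pos.setdefault(pos, []).append(tok)
    groups = []
    for pos in sorted(by_pos):
        alts = by_pos[pos]
        groups.append("(" + " ".join(alts) + ")" if len(alts) > 1 else alts[0])
    return " ".join(groups)


tokens = word_delimiter_tokens("ub.uni-bielefeld.de")
print(tokens)
print(phrase_shape(tokens))
```

Under these assumptions the single query term turns into four positions, and a term that analyzes to several positions is searched as a phrase. The parenthesized group is the pair of alternatives sitting at the first position, which would explain both the split and the grouping.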
Re: Question about query
Hey,

thanks for your reply. Unfortunately this is not an option in my case, so I have cancelled the feature that needed it.

Kind regards
Armando

Erick Erickson wrote:
> One thing I've seen suggested is to add the number of values to a separate field, say topic_count. Then, in your situation above, you could append AND topic_count=1. This can extend to work if you wanted any number of matches (and only that number). For instance, topic=5 AND topic=10 AND topic=20 AND topic_count=3 would give you article 4.
>
> Don't know if this works in your particular situation.
>
> Erick
>
> On Mon, Mar 22, 2010 at 10:32 AM, Armando Ota armando...@siol.net wrote:
> > Hi,
> >
> > I need a little help with a query for my problem (if it can be solved).
> >
> > I have a field in a document called topic. This field contains some values: 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values, like 1, 10, 50, etc. (for one doc).
> >
> > So now to the problem: I would like to get the documents that have 0 as the topic value, plus the documents that have only, for example, 1 as the topic value.
> >
> > Articles, for example:
> >
> > article 1 topics: 1, 5, 10, 20, 24
> > article 2 topics: 0
> > article 3 topics: 1
> > article 4 topics: 5, 10, 20
> > article 5 topics: 1, 13, 19
> >
> > So I need a search query that returns only articles 2 and 3, not the other articles that have 1 as a topic value.
> >
> > Can that be done? Any help appreciated.
> >
> > Kind regards
> > Armando
Re: Question about query
Well, here is what I figured out:

(mm=150%, qf=topic, q=1 0) == q=topic:0 OR topic:1

On 3/22/10, Armando Ota armando...@siol.net wrote:
> Hi,
>
> I need a little help with a query for my problem (if it can be solved).
>
> I have a field in a document called topic. This field contains some values: 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values, like 1, 10, 50, etc. (for one doc).
>
> So now to the problem: I would like to get the documents that have 0 as the topic value, plus the documents that have only, for example, 1 as the topic value.
>
> Articles, for example:
>
> article 1 topics: 1, 5, 10, 20, 24
> article 2 topics: 0
> article 3 topics: 1
> article 4 topics: 5, 10, 20
> article 5 topics: 1, 13, 19
>
> So I need a search query that returns only articles 2 and 3, not the other articles that have 1 as a topic value.
>
> Can that be done? Any help appreciated.
>
> Kind regards
> Armando

--
Elsadek
Software Engineer - J2EE / WEB / ESB MULE
Re: Question about query
Hey,

thank you for your reply, but it's not working. I still get the other articles.

Kind regards
Armando

Abdelhamid ABID wrote:
> Well, here is what I figured out:
>
> (mm=150%, qf=topic, q=1 0) == q=topic:0 OR topic:1
>
> On 3/22/10, Armando Ota armando...@siol.net wrote:
> > Hi,
> >
> > I need a little help with a query for my problem (if it can be solved).
> >
> > I have a field in a document called topic. This field contains some values: 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values, like 1, 10, 50, etc. (for one doc).
> >
> > So now to the problem: I would like to get the documents that have 0 as the topic value, plus the documents that have only, for example, 1 as the topic value.
> >
> > Articles, for example:
> >
> > article 1 topics: 1, 5, 10, 20, 24
> > article 2 topics: 0
> > article 3 topics: 1
> > article 4 topics: 5, 10, 20
> > article 5 topics: 1, 13, 19
> >
> > So I need a search query that returns only articles 2 and 3, not the other articles that have 1 as a topic value.
> >
> > Can that be done? Any help appreciated.
> >
> > Kind regards
> > Armando
Re: Question about query
One thing I've seen suggested is to add the number of values to a separate field, say topic_count. Then, in your situation above, you could append AND topic_count=1. This can extend to work if you wanted any number of matches (and only that number). For instance, topic=5 AND topic=10 AND topic=20 AND topic_count=3 would give you article 4.

Don't know if this works in your particular situation.

Erick

On Mon, Mar 22, 2010 at 10:32 AM, Armando Ota armando...@siol.net wrote:
> Hi,
>
> I need a little help with a query for my problem (if it can be solved).
>
> I have a field in a document called topic. This field contains some values: 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values, like 1, 10, 50, etc. (for one doc).
>
> So now to the problem: I would like to get the documents that have 0 as the topic value, plus the documents that have only, for example, 1 as the topic value.
>
> Articles, for example:
>
> article 1 topics: 1, 5, 10, 20, 24
> article 2 topics: 0
> article 3 topics: 1
> article 4 topics: 5, 10, 20
> article 5 topics: 1, 13, 19
>
> So I need a search query that returns only articles 2 and 3, not the other articles that have 1 as a topic value.
>
> Can that be done? Any help appreciated.
>
> Kind regards
> Armando
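The topic_count idea can be tried out on the five example articles without a running Solr. The sketch below is plain Python that only emulates the boolean logic; the field name topic_count comes from the reply above, while the helper names to_doc and has_exactly are made up for illustration.

```python
# The example articles from the question: id -> topic values.
articles = {
    1: [1, 5, 10, 20, 24],
    2: [0],
    3: [1],
    4: [5, 10, 20],
    5: [1, 13, 19],
}


def to_doc(article_id, topics):
    """Build the document as it might be indexed, with the extra count field."""
    return {"id": article_id, "topic": topics, "topic_count": len(topics)}


docs = [to_doc(article_id, topics) for article_id, topics in articles.items()]


def has_exactly(doc, wanted):
    """Emulates: topic=a AND topic=b ... AND topic_count=<number of wanted values>."""
    return all(t in doc["topic"] for t in wanted) and doc["topic_count"] == len(wanted)


# Emulates: topic=0 OR (topic=1 AND topic_count=1)
only_0_or_only_1 = [d["id"] for d in docs if 0 in d["topic"] or has_exactly(d, [1])]
print(only_0_or_only_1)  # [2, 3]

# Emulates: topic=5 AND topic=10 AND topic=20 AND topic_count=3
exactly_5_10_20 = [d["id"] for d in docs if has_exactly(d, [5, 10, 20])]
print(exactly_5_10_20)  # [4]
```

The count has to be computed by whatever code builds the documents at index time, which is why it only helps when reindexing is possible.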
Re: Question about query syntax
: If I query for 'ferrar*' on my index, I will get 'ferrari' and 'red ferrari'
: as a result. And that's fine. But if I try to query for 'red ferrar*', I
: have to put it between double quotes as I want to guarantee that it will be used
: as only one term, but the '*' is being ignored, as I don't get any result.
: What should be the appropriate query for it?

when you add the double quotes you tell solr that the * should now be treated as a literal, and it's no longer a special character.

it is possible to have query structures like what you are interested in, but i don't think it's possible to express it using the Lucene syntax.

-Hoss
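To make the intended matching concrete, here is a small sketch of a phrase whose last word is a prefix. It is plain Python over whitespace-split text, not a Solr or Lucene API, and the function name is made up for illustration.

```python
def matches_phrase_prefix(text, phrase):
    """True if the words of `phrase` occur adjacently in `text`.

    A last word ending in '*' matches any token starting with that prefix.
    Assumes `phrase` contains at least one word.
    """
    tokens = text.lower().split()
    words = phrase.lower().split()
    *exact, last = words
    is_prefix = last.endswith("*")
    last = last.rstrip("*")
    for start in range(len(tokens) - len(words) + 1):
        window = tokens[start:start + len(words)]
        if window[:-1] != exact:
            continue  # the leading words must match exactly and in order
        tail = window[-1]
        if (is_prefix and tail.startswith(last)) or (not is_prefix and tail == last):
            return True
    return False


titles = ["ferrari", "red ferrari", "red fast ferrari"]
print([t for t in titles if matches_phrase_prefix(t, "ferrar*")])
print([t for t in titles if matches_phrase_prefix(t, "red ferrar*")])
```

The second call keeps only 'red ferrari', which is the behaviour the question asks for and which the quoted form of the query does not deliver.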
Question about query syntax
Hello,

If I query for 'ferrar*' on my index, I will get 'ferrari' and 'red ferrari' as a result. And that's fine. But if I try to query for 'red ferrar*', I have to put it between double quotes, as I want to guarantee that it will be treated as a single term. The '*' is then ignored, and I don't get any result. What would be the appropriate query for this?

FYI, I am querying one standard text field.

-
http://www.nabble.com/RPG-da-Ilha-f35514.html RPG da Ilha
--
View this message in context: http://www.nabble.com/Question-about-query-sintax-tp21455970p21455970.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Question about Query Phrase Slop (qs) in dismax
: From the solr wiki, it sounded like if qs is set to 5 for example, if the
: search term is 'child custody', only docs with 'child' 'custody' within 5
: words of one another would be returned in results. Is this correct? If so,

No. as explained on the wiki...

  "Amount of slop on phrase queries explicitly included in the user's query string"

note the "explicitly included" part ... if the query string doesn't contain any quotation marks, 'qs' isn't used at all. (as opposed to 'ps' which is "Amount of slop on phrase queries built for 'pf' fields")

in a query like this...

  q=child+custody&qs=5&qf=...

...the 'qs' is ignored. if you want to require that the input words all appear within a set slop of each other (in at least one 'qf' field) you need to quote the user's input...

  q="child+custody"&qs=5&qf=...

: in bad user experience as those docs are not so relevant. What more could i
: do to improve quality in the results?

use 'pf' with very high boosts (compared to the 'qf' boosts) so that phrase matching docs appear before non phrase matching docs.

-Hoss
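The difference between the two requests is only whether the user's input is wrapped in quotation marks before it is sent. A minimal sketch of building both parameter strings follows; the helper name dismax_params and the field name text used for qf are assumptions for illustration, and defType=dismax is one way of selecting the dismax parser.

```python
from urllib.parse import urlencode


def dismax_params(user_input, qs, qf, quote=False):
    """Build a dismax query string; quote=True turns the input into an explicit phrase."""
    q = f'"{user_input}"' if quote else user_input
    return urlencode({"defType": "dismax", "q": q, "qs": qs, "qf": qf})


unquoted = dismax_params("child custody", 5, "text")
quoted = dismax_params("child custody", 5, "text", quote=True)

print(unquoted)  # defType=dismax&q=child+custody&qs=5&qf=text
print(quoted)    # defType=dismax&q=%22child+custody%22&qs=5&qf=text
```

Only the second form contains an explicit phrase for qs to act on. Input that already contains quotation marks would need escaping before being wrapped, which this sketch does not attempt.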
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
: Subject: Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
:
:
: Please help someone...i've been waiting for an answer for the last couple of
: days no one seems to be helping out here. I did search the wiki & this

Please don't send messages like this.

This is a volunteer community -- no one (that I know of) is paid to read/reply to questions on the solr-user list. Many of us do our best to make sure that all user questions get addressed, but this is a fairly high volume list, and sometimes other things in life (work, health, relationships, family, etc...) make that take a little longer than we would like -- sometimes questions don't get answered for a few days, it's just the way it is, please be patient.

Sending multiple "please help, still no reply" type messages just adds noise to the list, and gives people who *do* want to help more to read, which means it takes that much longer to actually reply.

If you need an answer to a question in a hurry: read the archives and the docs, experiment, read the code (if you know java), or hire a consultant to help you figure it out.

In this specific case, debugQuery=true would have quickly shown you that your qs=5 value wasn't making its way into the parsedquery at all, which might have helped you understand what was happening.

-Hoss
Re: Question about Query Phrase Slop (qs) in dismax
Somebody please help clear this doubt. What more could I do with the dismax handler so that results in which 'word1', 'word2', 'word3', etc. from a search phrase are not within 5 words of one another do not come up in the results?

anuvenk wrote:
> From the solr wiki, it sounded like if qs is set to 5, for example, and the search term is 'child custody', only docs with 'child' and 'custody' within 5 words of one another would be returned in results. Is this correct? If so, it doesn't seem to be working for me. I see docs with 'child' and 'custody' more than 5 words apart (excluding stop words), which is resulting in a bad user experience as those docs are not so relevant. What more could I do to improve quality in the results?
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
Thanks for the response. Well, my current ps setting works great for most search terms. But take this typical example: north dakota 1031 exchange lawyers. We don't have any relevant docs in the index. Solr is returning an irrelevant doc just because it found 'lawyer', 'exchange', 'north dakota' somewhere. I thought that if there were a way to just not return any results when the terms are not within close proximity, it would be great.

Yonik Seeley wrote:
> On Sun, Nov 23, 2008 at 11:51 PM, anuvenk [EMAIL PROTECTED] wrote:
> > Please help someone... I've been waiting for an answer for the last couple of days; no one seems to be helping out here. I did search the wiki & this forum for an answer, but couldn't find one. I know that if ps is set to 5, words within 5 words of one another receive a boost in score. But is there a way to not return results that have the words in the search terms more than 5 words apart?
>
> Not with dismax. I'm not sure why it's a problem, given that with enough boost you should be able to ensure that all of the results with a slop less than 5 appear before other results.
>
> Anyway, if you want to restrict results to those with a slop of 5, use the standard query parser with an explicit sloppy phrase query:
> "north dakota 1031 exchange lawyers"~5
>
> -Yonik
>
> > Typical example: north dakota 1031 exchange lawyers
> > My first result is absolutely irrelevant. It returned a north dakota doc, but it had an occurrence of attorney somewhere & an occurrence of exchange (not related to 1031 exchange though). They were not within 5 words of one another. My guys have been hammering me regarding this relevancy issue. Please help someone.
> >
> > anuvenk wrote:
> > > From the solr wiki, it sounded like if qs is set to 5, for example, and the search term is 'child custody', only docs with 'child' and 'custody' within 5 words of one another would be returned in results. Is this correct? If so, it doesn't seem to be working for me. I see docs with 'child' and 'custody' more than 5 words apart (excluding stop words), which is resulting in a bad user experience as those docs are not so relevant. What more could I do to improve quality in the results?
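The explicit sloppy phrase suggested in the quoted reply is the quoted phrase followed by ~ and the slop value. A minimal sketch of assembling it on the client side is below; the helper name sloppy_phrase is made up for illustration, and defType=lucene is assumed here as the way to select the standard query parser.

```python
from urllib.parse import urlencode


def sloppy_phrase(terms, slop):
    """Join the terms into one quoted phrase and append the slop, e.g. "a b"~5."""
    phrase = " ".join(terms)
    return f'"{phrase}"~{slop}'


q = sloppy_phrase(["north", "dakota", "1031", "exchange", "lawyers"], 5)
print(q)  # "north dakota 1031 exchange lawyers"~5

# URL-encoded form, ready to append to a select request.
params = urlencode({"defType": "lucene", "q": q})
print(params)
```

Unlike the ps boost in dismax, this form is a hard requirement: documents where the terms do not fit within the slop are not returned at all.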
Re: Please Help !! Question about Query Phrase Slop (qs) in dismax
If you boost the phrase queries by enough, you could tell when you hit the less relevant documents by the score.

-Yonik

On Mon, Nov 24, 2008 at 12:07 AM, anuvenk [EMAIL PROTECTED] wrote:
> Thanks for the response. Well, my current ps setting works great for most search terms. But take this typical example: north dakota 1031 exchange lawyers. We don't have any relevant docs in the index. Solr is returning an irrelevant doc just because it found 'lawyer', 'exchange', 'north dakota' somewhere. I thought that if there were a way to just not return any results when the terms are not within close proximity, it would be great.
>
> Yonik Seeley wrote:
> > On Sun, Nov 23, 2008 at 11:51 PM, anuvenk [EMAIL PROTECTED] wrote:
> > > Please help someone... I've been waiting for an answer for the last couple of days; no one seems to be helping out here. I did search the wiki & this forum for an answer, but couldn't find one. I know that if ps is set to 5, words within 5 words of one another receive a boost in score. But is there a way to not return results that have the words in the search terms more than 5 words apart?
> >
> > Not with dismax. I'm not sure why it's a problem, given that with enough boost you should be able to ensure that all of the results with a slop less than 5 appear before other results.
> >
> > Anyway, if you want to restrict results to those with a slop of 5, use the standard query parser with an explicit sloppy phrase query:
> > "north dakota 1031 exchange lawyers"~5
> >
> > -Yonik
> >
> > > Typical example: north dakota 1031 exchange lawyers
> > > My first result is absolutely irrelevant. It returned a north dakota doc, but it had an occurrence of attorney somewhere & an occurrence of exchange (not related to 1031 exchange though). They were not within 5 words of one another. My guys have been hammering me regarding this relevancy issue. Please help someone.
> > >
> > > anuvenk wrote:
> > > > From the solr wiki, it sounded like if qs is set to 5, for example, and the search term is 'child custody', only docs with 'child' and 'custody' within 5 words of one another would be returned in results. Is this correct? If so, it doesn't seem to be working for me. I see docs with 'child' and 'custody' more than 5 words apart (excluding stop words), which is resulting in a bad user experience as those docs are not so relevant. What more could I do to improve quality in the results?
Question about Query Phrase Slop (qs) in dismax
From the solr wiki, it sounded like if qs is set to 5, for example, and the search term is 'child custody', only docs with 'child' and 'custody' within 5 words of one another would be returned in results. Is this correct? If so, it doesn't seem to be working for me. I see docs with 'child' and 'custody' more than 5 words apart (excluding stop words), which is resulting in a bad user experience as those docs are not so relevant. What more could I do to improve quality in the results?