Re: query problem
If at all possible, denormalize the data.

But you can also use Solr's Join capability here, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser

Best,
Erick

On Mon, Aug 8, 2016 at 8:47 AM, Pithon Philippe wrote:
> Hello,
> I have two document types:
> - tickets (type_s:"ticket", customerid_i:10)
> - customers (type_s:customer,customerid_i:10,name_s:"FISHER" )
>
> I want a query to find all tickets for the customer named FISHER.
> In the ticket document (type_s:"ticket"), I have the customer id but not the
> customer name...
>
> Any ideas?
>
> Thanks
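As a rough illustration of the join approach Erick points to, the request parameters could be assembled like this. It is a minimal sketch, not a tested production query: the field names come from the question, the helper name `ticket_join_params` is made up for this example, and the `fq` restriction to tickets is an assumption added because both document types carry `customerid_i`.

```python
from urllib.parse import urlencode

def ticket_join_params(customer_name: str) -> dict:
    """Build Solr request parameters that join customers to tickets.

    The inner query selects customer documents by name; the join then
    returns documents whose customerid_i matches one of those customers.
    """
    inner = f'type_s:customer AND name_s:"{customer_name}"'
    return {
        "q": "{!join from=customerid_i to=customerid_i}" + inner,
        # Assumption: keep only tickets, since the matching customer
        # document carries the same customerid_i value.
        "fq": "type_s:ticket",
    }

params = ticket_join_params("FISHER")
query_string = urlencode(params)  # ready to append to a /select URL
print(query_string)
```

Denormalizing, as suggested first, would avoid the join entirely by copying the customer name onto each ticket at index time.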
Re: query problem
Hi,

I suspect q=State:tamil nadu is parsed as State:tamil text:nadu. You can confirm this by adding debugQuery=on.

Either use quotes, q=State:"tamil nadu", or use the term query parser, q={!term f=State}tamil nadu

Ahmet

On Wednesday, March 5, 2014 8:29 PM, Kishan Parmar kishan@gmail.com wrote:
> hi there
>
> my schema file is this---
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="example" version="1.2">
>   <types>
>     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
>     <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
>     <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0" />
>     <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory" />
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
>         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="2" generateNumberParts="2" catenateWords="2" catenateNumbers="2" catenateAll="1" splitOnCaseChange="2" />
>         <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory" />
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
>         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="2" generateNumberParts="2" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="2" />
>         <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
>       </analyzer>
>     </fieldType>
>   </types>
>   <fields>
>     <field name="SrNo" type="int" indexed="true" stored="true" required="true" />
>     <field name="Name" type="string" indexed="true" stored="true" required="true" />
>     <field name="Scheme" type="string" indexed="true" stored="true" required="true" />
>     <field name="State" type="string" indexed="true" stored="true" required="true" />
>     <field name="text" type="text" indexed="true" stored="true" multiValued="true" />
>     <field name="_version_" type="string" indexed="true" stored="true" required="true" multiValued="false" />
>   </fields>
>   <copyField source="SrNo" dest="text" />
>   <copyField source="Name" dest="text" />
>   <copyField source="Scheme" dest="text" />
>   <copyField source="State" dest="text" />
>   <uniqueKey>SrNo</uniqueKey>
>   <defaultSearchField>text</defaultSearchField>
>   <solrQueryParser defaultOperator="AND" />
> </schema>
>
> and when I try the query State:tamil nadu in Solr 4.6.0 it gives 0 results.
> Is there any problem with whitespace in type="string"?
>
> Regards,
> Kishan Parmar
> Software Developer
> +91 95 100 77394
> Jay Shree Krishnaa !!
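The three ways of writing this query that come up in Ahmet's reply can be put side by side. This is only a sketch that builds the request strings; the helper name `state_queries` is invented for the example, and nothing here talks to a Solr server.

```python
from urllib.parse import urlencode

def state_queries(value: str) -> dict:
    """Return three ways of writing a query on the State field."""
    return {
        # Unquoted: the parser splits on whitespace, so only the first word
        # is tied to State; the second goes to the default search field.
        "unquoted": f"State:{value}",
        # Quoted: the whole value is handed to the State field together.
        "quoted": f'State:"{value}"',
        # Term parser: everything after the local params is one raw term.
        "term": "{!term f=State}" + value,
    }

queries = state_queries("tamil nadu")
for name, q in queries.items():
    # debugQuery=on asks Solr to report how it parsed the query.
    print(name, "->", urlencode({"q": q, "debugQuery": "on"}))
```

Comparing the parsed query reported under debugQuery=on for each variant is the quickest way to see which form reaches the State field intact.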
Re: query problem
Thanks, but still no change in output ---
q=State:"tamil nadu"
is parsed as
q: State:\"tamil nadu\"

Regards,
Kishan Parmar
Software Developer
+91 95 100 77394
Jay Shree Krishnaa !!

2014-03-06 0:17 GMT+05:30 Ahmet Arslan iori...@yahoo.com:
> [...]
Re: query problem
Hi Kishan,

Can you please give us an example document/query pair, i.e. a query together with a document that the query should retrieve? E.g. what document text should the query q=State:tamil nadu return?

Ahmet

On Wednesday, March 5, 2014 9:04 PM, Kishan Parmar kishan@gmail.com wrote:
> [...]
Re: query problem
Thanks. My documents are XML files; I am attaching one to this message. In my project I have to search on each field defined in schema.xml, and my output in Solr should look like this:

{
  responseHeader: {
    status: 0,
    QTime: 1,
    params: {
      indent: true,
      q: State:Delhi,
      _: 1394085162344,
      wt: json
    }
  },
  response: {
    numFound: 2,
    start: 0,
    docs: [
      {
        SrNo: 3,
        text: [ 3, RGNF-2013-2015-4841, AASHRITI GAUTAM LMANF, GEN, Delhi, 17/11/1992, sagittariusaashriti@gmail.com, B.A, RGNF, Social Sciences, University of Jammu, 78.84, First, , State University, 9282, M.A, Politics (Specialization in International Relations), 7/22/2013, Jawahar Lal Nehru, BHAGWATI CHARAN SHARMA, DA-10- DDA-FLATS- MUNIRKA NEW DELHI NEW DELHI 110067, , JAHANVI, Unverified, ],
        CandidateID: RGNF-2013-2015-4841,
        Name: AASHRITI GAUTAM LMANF,
        Category: GEN,
        State: Delhi,
        DOB: 17/11/1992,
        Email: sagittariusaashriti@gmail.com,
        UGExam: B.A,
        Scheme: RGNF,
        UGSubject: Social Sciences,
        UGCollageORUniversity: University of Jammu,
        UGInPersentage: 78.84,
        Rank: First,
        UGRankSubject: ,
        StatusOFUGInstituteORCollageORUniversity: State University,
        NoOfStudentsAppered: 9282,
        PGDegree: M.A,
        PGSubject: Politics (Specialization in International Relations),
        DateOfPGAdmission: 7/22/2013,
        PGCollageORUniversity: Jawahar Lal Nehru,
        FatherName: BHAGWATI CHARAN SHARMA,
        Address: DA-10- DDA-FLATS- MUNIRKA NEW DELHI NEW DELHI 110067,
        Contect: ,
        Updated_By: JAHANVI,
        FinalRemarks: Unverified,
        _version_: 1461765344581386240
      },
      {
        SrNo: 8,
        text: [ 8, URH-2013-2015-1888, ABHISHEK MISHRA, GEN, Delhi, 09/11/1992, a.sudham...@yahoo.com, B.SC CHEMISTRY, RGNF, Chemical Sciences, Queen marys college/madras university, 66.5, Second, chemisty, State University, 446, M.A., Politics (Specialization in International Relations), 7/22/2013, Jawahar Lal Nehru, PARTHA SARATHI BHATTACHARYA, DA-10- DDA-FLATS- MUNIRKA NEW DELHI NEW DELHI 110067, 7278635773, RAVI, Duplicate, ],
        CandidateID: URH-2013-2015-1888,
        Name: ABHISHEK MISHRA,
        Category: GEN,
        State: Delhi,
        DOB: 09/11/1992,
        Email: a.sudham...@yahoo.com,
        UGExam: B.SC CHEMISTRY,
        Scheme: RGNF,
        UGSubject: Chemical Sciences,
        UGCollageORUniversity: Queen marys college/madras university,
        UGInPersentage: 66.5,
        Rank: Second,
        UGRankSubject: chemisty,
        StatusOFUGInstituteORCollageORUniversity: State University,
        NoOfStudentsAppered: 446,
        PGDegree: M.A.,
        PGSubject: Politics (Specialization in International Relations),
        DateOfPGAdmission: 7/22/2013,
        PGCollageORUniversity: Jawahar Lal Nehru,
        FatherName: PARTHA SARATHI BHATTACHARYA,
        Address: DA-10- DDA-FLATS- MUNIRKA NEW DELHI NEW DELHI 110067,
        Contect: 7278635773,
        Updated_By: RAVI,
        FinalRemarks: Duplicate,
        _version_: 1461765344630669312
      }
    ]
  }
}

Regards,
Kishan Parmar
Software Developer
+91 95 100 77394
Jay Shree Krishnaa !!

On Thu, Mar 6, 2014 at 2:23 AM, Ahmet Arslan iori...@yahoo.com wrote:
> [...]
Re: query problem
On 6 March 2014 11:23, Kishan Parmar kishan@gmail.com wrote:
> Thanks,
> my documents are xml files i am attaching that document in this
> and in my project i have to search from each field defined in schema.xml
[...]

The type for State in your schema is string, which is a non-analysed field that stores the text verbatim, i.e., here it is preserving case. Try searching for State:"Tamil Nadu".

Regards,
Gora
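Gora's point about verbatim storage can be shown with a toy model. This is plain Python, not Solr code: it only mimics the rule that a non-analysed string field matches when the query term equals the indexed text exactly, and the indexed value "Tamil Nadu" is an assumption about what the documents contain.

```python
def str_field_matches(indexed_value: str, query_term: str) -> bool:
    """Toy model of a non-analysed string field: no lowercasing, no
    tokenizing, so the term must equal the indexed text exactly."""
    return indexed_value == query_term

indexed = "Tamil Nadu"  # assumed indexed value, capitalised like "Delhi" in the sample output

print(str_field_matches(indexed, "tamil nadu"))  # False: case differs
print(str_field_matches(indexed, "Tamil Nadu"))  # True: exact match
```

If case-insensitive matching is wanted, the usual route is a text-type field with a lowercase filter rather than a string field.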
Re: Query problem in Solr
@ Pravesh: It's 2 separate cores, not 2 indexes. Sorry for that.

@ Erick: Yes, I've seen this suggestion and it seems to be the only possible solution. I'll look into it.

Thanks for your answers guys!

Kurt

On Wed, Jun 1, 2011 at 4:24 PM, Erick Erickson erickerick...@gmail.com wrote:
> [...]
Re: Query problem in Solr
> We're using Solr to search on a Shop index and a Product index

Do you have 2 separate indexes (using distributed shard search)? I'm sure you actually have only a single index.

> Currently a Shop has a field `shop_keyword` which also contains the
> keywords of the products assigned to it.

You mean that, for a shop, you first concatenate all keywords of all its products and then save them in the shop_keyword field for the shop? In that case there is no way you can identify which keyword occurs in which product in your index.

You might need to change the index structure: when you post documents, post a single document for each product (with fields like title, price, shop-id, etc.) instead of a single document for each shop.

Hope I made myself clear.

--
View this message in context: http://lucene.472066.n3.nabble.com/Query-problem-in-Solr-tp3009812p3010072.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Query problem in Solr
If I read this correctly, one approach is to specify a position increment gap in a multiValued field, then search for phrases with a slop less than that increment gap, i.e. positionIncrementGap="100" in your definition, and search for "apple orange"~99

If this is gibberish, please post some examples and we'll try something else.

Best
Erick

On Wed, Jun 1, 2011 at 4:21 AM, Kurt Sultana kurtanat...@gmail.com wrote:
> Hi all,
>
> We're using Solr to search on a Shop index and a Product index. Currently a
> Shop has a field `shop_keyword` which also contains the keywords of the
> products assigned to it. The shop keywords are separated by a space.
> Consequently, if there is a product which has the keyword apple and another
> which has orange, a search for shops having `Apple AND Orange` would return
> the shop for these products.
>
> However, this is incorrect, since we want a search for shops having
> `Apple AND Orange` to return only shop(s) having a product with both apple
> and orange as keywords.
>
> We tried solving this problem by making shop keywords multi-valued and
> assigning the keywords of every product of the shop as a new value in shop
> keywords. However, as was confirmed in another post
> http://markmail.org/thread/xce4qyzs5367yplo#query:+page:1+mid:76eerw5yqev2aanu+state:results,
> Solr does not support "all words must match in the same value of a
> multi-valued field".
>
> (Hope I explained myself well)
>
> How can we go about this? Ideally, we shouldn't change our search
> infrastructure dramatically.
>
> Thanks!
>
> Krt_Malta
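The gap-and-slop idea Erick describes can be pictured with a small position model. This is a simplified sketch in plain Python, not Lucene's actual sloppy-phrase scoring: it assumes each product's keywords form one value of the multiValued field, that every value has far fewer than 100 tokens, and it ignores the extra cost Lucene charges for reversed word order. The function names are invented for this example.

```python
def token_positions(values, gap):
    """Assign a position to every token of a multiValued field.

    Tokens inside one value get consecutive positions; the first token of
    each following value is pushed `gap` extra positions further on.
    """
    positions = {}
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap
        for token in value.lower().split():
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions

def within_slop(positions, a, b, slop):
    """Rough stand-in for a sloppy phrase: some occurrence of `a` and some
    occurrence of `b` lie at most `slop` positions apart."""
    return any(abs(pa - pb) <= slop
               for pa in positions.get(a, [])
               for pb in positions.get(b, []))

GAP = 100
shop_one = ["apple orange", "banana"]  # one product carries both keywords
shop_two = ["apple", "orange"]         # keywords belong to different products

print(within_slop(token_positions(shop_one, GAP), "apple", "orange", GAP - 1))  # True
print(within_slop(token_positions(shop_two, GAP), "apple", "orange", GAP - 1))  # False
```

Because the slop (99) is smaller than the gap (100), two words can only satisfy the phrase when they sit inside the same value, which is the behaviour Kurt is after.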
Re: Query Problem
Hi Erick, you were right.

I'm looking at the source of the search result (instead of the Internet Explorer rendering :$) and I see this:

<str name="SectionName">Programas_Home     </str>

So I think the problem is in the SSIS process that retrieves data from the DB and sends it to Solr. The data type in the DB is VARCHAR(100)... but I'm sure that somewhere it is being mapped to CHAR(100), so its length is always 100.

Thank you very much, I will keep you informed.

Thanks

On Thu, Dec 16, 2010 at 9:38 PM, Erick Erickson erickerick...@gmail.com wrote:
> OK, it works perfectly for me on a 1.4.1 instance. I've looked over your
> files a couple of times and see nothing obvious (but you'll never find
> anyone better at overlooking the obvious than me!).
>
> Tokenizing and stemming are irrelevant in this case because your type is
> string, which is an untokenized type, so you don't need to go there.
>
> The way your query parses and analyzes backs this up, so you're getting to
> the right schema definition.
>
> Which may bring us to whether what's in the index is what you *think* is in
> there. I'm betting not. Either you changed the schema and didn't re-index
> (say changed indexed="false" to indexed="true"), you didn't commit the
> documents after indexing or other such-like, or changed the field type and
> didn't reindex.
>
> So go into /solr/admin. Click on schema browser, click on fields. Along
> the left you should see SectionName, click on that. That will show you the
> *indexed* terms, and you should see, exactly, Programas_Home in there, just
> like in your returned documents. Let us know if that's in fact what you do
> see. It's possible you're being misled by the difference between seeing
> the value in a returned document (the stored value) and what's searched on
> (the indexed token(s)).
>
> And I'm assuming that some asterisks in your mails were really there for
> bolding and you are NOT doing wildcard searches for, for instance,
> *SectionName:Programas_Home*.
>
> But we're at a point where my 1.4.1 instance produces the results you're
> expecting, at least as I understand them, so I don't think it's a problem
> with Solr, but some change you've made is producing results you don't
> expect but are correct. Like I said, look at the indexed terms. If you see
> Programas_Home in the admin console after following the steps above, then
> I don't know what to suggest.
>
> Best
> Erick
>
> On Thu, Dec 16, 2010 at 5:12 PM, Ezequiel Calderara ezech...@gmail.com wrote:
>> The jars are named like *1.4.1*. So I suppose it's version 1.4.1.
>>
>> Thanks!
>>
>> On Thu, Dec 16, 2010 at 6:54 PM, Erick Erickson erickerick...@gmail.com wrote:
>>> OK, what version of Solr are you using? I can take a quick check to see
>>> what behavior I get.
>>>
>>> Erick
>>>
>>> On Thu, Dec 16, 2010 at 4:44 PM, Ezequiel Calderara ezech...@gmail.com wrote:
>>>> [...]
Re: Query Problem
Right, I *love* problems like this... NOT

You might get some joy out of using TrimFilterFactory along with KeywordTokenizerFactory, something like this:

<fieldType name="trimField" class="solr.TextField" your options here>
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.TrimFilterFactory" />
  </analyzer>
</fieldType>

but it depends upon what your fields are padded with.

Best
Erick

On Fri, Dec 17, 2010 at 8:12 AM, Ezequiel Calderara ezech...@gmail.com wrote:
> [...]
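The effect of the keyword tokenizer plus trim filter that Erick suggests in this message can be mimicked in a few lines. This is a plain-Python model under stated assumptions, not Solr's implementation: the padding is assumed to be ordinary spaces out to 100 characters (as the CHAR(100) diagnosis suggests), and the two helper functions are invented stand-ins for the tokenizer and filter.

```python
def keyword_tokenize(value: str) -> list:
    """Keyword tokenizer model: the whole field value becomes one token."""
    return [value]

def trim_filter(tokens: list) -> list:
    """Trim filter model: strip leading and trailing whitespace per token."""
    return [t.strip() for t in tokens]

padded = "Programas_Home".ljust(100)  # assumed space padding to 100 characters

untrimmed = keyword_tokenize(padded)             # what a plain string field would index
trimmed = trim_filter(keyword_tokenize(padded))  # with the trim filter applied

print(len(untrimmed[0]), "Programas_Home" in untrimmed)  # 100 False
print(len(trimmed[0]), "Programas_Home" in trimmed)      # 14 True
```

As Erick notes, this only helps if the padding really is whitespace; other padding characters would need a different filter or a fix at the source.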
Re: Query Problem
Well... finally... it isn't a Solr problem. It isn't a Solr config problem. It's Microsoft's problem:
http://flyingtriangles.blogspot.com/2006/08/workaround-to-ssis-strings-are-not.html

Thank you very much Erick!! You really helped in solving this!

On Fri, Dec 17, 2010 at 10:52 AM, Erick Erickson erickerick...@gmail.com wrote:
> [...]
Re: Query Problem
Ezequiel:

Nice job of including relevant details, by the way. Unfortunately I'm puzzled too. Your SectionName is a string type, so it should be placed in the index as-is. Be a bit cautious about looking at returned results (as I see in one of your xml files), because the returned values are the verbatim, stored field, NOT what's tokenized, and the tokenized data is what's searched. That said, your SectionName should not be tokenized at all because it's a string type.

Take a look at the admin page, schema browser, and see what the values for SectionName look like (these will be the tokenized values). They should be exactly Programas_Home, complete with underscore, case changes, etc. Is that the case?

Another place that might help is the admin/analysis page. Check the debug boxes and input your steps and it'll show you what transformations are applied.

But a quick look leaves me completely baffled. Sorry I can't be more help.

Erick

On Thu, Dec 16, 2010 at 2:07 PM, Ezequiel Calderara ezech...@gmail.com wrote:
> Hi all, I have the following problem.
> I have this set of data (View data (Pastebin) http://pastebin.com/jKbUhjVS )
>
> If I do a search for *SectionName:Programas_Home* I have no results:
> Returned Data (PasteBin) http://pastebin.com/wnPdHqBm
> If I do a search for *Programas_Home* I have only 1 result:
> Result Returned (Pastebin) http://pastebin.com/fMZkLvYK
> If I do a search for SectionName:Programa* I have 1 result:
> Result Returned (Pastebin) http://pastebin.com/kLLnVp4b
>
> This is my *schema* http://pastebin.com/PQM8uap4 (Pastebin) and this is my
> *solrconfig* http://%3c/?xml version=1.0 encoding=UTF-8 ?(PasteBin)
>
> I don't understand why searching for SectionName:Programas_Home isn't
> returning any results at all...
>
> Can someone shed some light on this?
> --
> __
> Ezequiel.
>
> Http://www.ironicnet.com
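Two of the observations in the question (no hits for the exact SectionName term, one hit for the SectionName prefix) are what one would expect if the indexed term carries trailing padding, which is what the thread eventually diagnosed. The sketch below is a plain-Python model of that behaviour only; the padded value and the helper names are assumptions for illustration, not data taken from the pastebins.

```python
# Assumed indexed term: the real text padded with spaces to 100 characters.
indexed_terms = ["Programas_Home".ljust(100)]

def term_query(terms, text):
    """Exact-term lookup, as on an untokenized string field."""
    return [t for t in terms if t == text]

def prefix_query(terms, prefix):
    """Prefix lookup, as for a trailing-wildcard query such as Programa*."""
    return [t for t in terms if t.startswith(prefix)]

print(len(term_query(indexed_terms, "Programas_Home")))  # 0: padding breaks equality
print(len(prefix_query(indexed_terms, "Programa")))      # 1: padding sits after the prefix
```

Looking at the indexed terms in the schema browser, as Erick recommends, is how the padded value would show up in practice.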
Re: Query Problem
I'll check the tokenizer to see if that's the problem. The results of the Analysis page for SectionName:Programas_Home:

Query Analyzer
org.apache.solr.schema.FieldType$DefaultAnalyzer {}
term position: 1
term text: Programas_Home
term type: word
source start,end: 0,14
payload:

So it's not having problems with that... Also, in the debug output you can see that the parsed query is correct... So I don't know where to look. I know nothing about stemming or tokenizing, but I will look into whether that has anything to do with it. If anyone can help me out, please do :D

--
Ezequiel.
Http://www.ironicnet.com
Re: Query Problem
OK, what version of Solr are you using? I can take a quick check to see what behavior I get.

Erick
Re: Query Problem
The jars are named like *1.4.1*, so I suppose it's version 1.4.1.

Thanks!

--
Ezequiel.
Http://www.ironicnet.com
Re: Query Problem
OK, it works perfectly for me on a 1.4.1 instance. I've looked over your files a couple of times and see nothing obvious (but you'll never find anyone better at overlooking the obvious than me!).

Tokenizing and stemming are irrelevant in this case because your type is string, which is an untokenized type, so you don't need to go there. The way your query parses and analyzes backs this up, so you're getting to the right schema definition.

Which may bring us to whether what's in the index is what you *think* is in there. I'm betting not. Either you changed the schema and didn't re-index (say, changed indexed=false to indexed=true), you didn't commit the documents after indexing or other such-like, or you changed the field type and didn't reindex.

So go into /solr/admin. Click on schema browser, then click on fields. Along the left you should see SectionName; click on that. That will show you the *indexed* terms, and you should see, exactly, Programas_Home in there, just like in your returned documents. Let us know if that's in fact what you do see. It's possible you're being misled by the difference between seeing the value in a returned document (the stored value) and what's searched on (the indexed token(s)).

And I'm assuming that some asterisks in your mails were really there for bolding and you are NOT doing wildcard searches for, for instance, *SectionName:Programas_Home*.

But we're at a point where my 1.4.1 instance produces the results you're expecting, at least as I understand them, so I don't think it's a problem with Solr; rather, some change you've made is producing results that you don't expect but that are correct.

Like I said, look at the indexed terms. If you see Programas_Home in the admin console after following the steps above, then I don't know what to suggest.

Best
Erick
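The indexed terms that the schema browser displays can typically also be fetched over HTTP through Solr's Luke request handler. The sketch below only builds the request URL; it assumes a Solr instance at http://localhost:8983/solr with the admin handlers registered at their usual /admin/luke path, which may differ in a given solrconfig. The helper name `luke_terms_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Solr instance; adjust host, port and path as needed.
SOLR_BASE = "http://localhost:8983/solr"

def luke_terms_url(field, num_terms=20, base=SOLR_BASE):
    """Build a Luke handler URL that lists the top indexed terms for one field."""
    params = {"fl": field, "numTerms": num_terms}
    return f"{base}/admin/luke?{urlencode(params)}"

url = luke_terms_url("SectionName")
print(url)
```

Opening the printed URL in a browser (after committing the indexed documents) should list the terms for SectionName. If Programas_Home is missing there, the index does not contain what the stored values suggest, and a full reindex is the likely fix.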
Re: Query problem related to * symbol
On Sat, Oct 25, 2008 at 2:00 PM, Aleksey Gogolev [EMAIL PROTECTED] wrote:

> I made this query:
> http://localhost:8983/solr/select/?q=suggestion:ipod+nano+80*

Note that in Lucene syntax, this query is equivalent to

suggestion:ipod default_field:nano default_field:80*

For debugging, add debugQuery=true to your request to see what the parsed query looks like.

-Yonik
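One common way to keep every term in the named field is to group the terms in parentheses after the field name. The sketch below is a minimal illustration under that assumption: it builds such a query and a request URL with debugQuery=true, reusing the select URL from the quoted message. The helper names are hypothetical, and how the grouped terms are combined still depends on the configured default operator.

```python
from urllib.parse import urlencode

# Taken from the quoted message; adjust to your own Solr instance.
SOLR_SELECT = "http://localhost:8983/solr/select/"

def fielded_query(field, terms):
    """Group terms in parentheses so each one is searched in the named field."""
    return f"{field}:({' '.join(terms)})"

def select_url(query, debug=True):
    """Build a select URL, optionally asking Solr to report the parsed query."""
    params = {"q": query}
    if debug:
        params["debugQuery"] = "true"
    return SOLR_SELECT + "?" + urlencode(params)

q = fielded_query("suggestion", ["ipod", "nano", "80*"])
print(q)
print(select_url(q))
```

Comparing the parsed query in the debug section of the response for the grouped and ungrouped forms shows whether nano and 80* still fall through to the default field.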