Re: autocomplete_edge type split words
In fact, I've removed autoGeneratePhraseQueries=true, and it doesn't change anything. The behaviour is the same with or without it (i.e. the output of the request with debugQuery=on is the same).

Thanks for your comments.

Best,
Elisabeth
Re: autocomplete_edge type split words
2013/9/28 Erick Erickson erickerick...@gmail.com

You've probably been doing this all along, but adding debug=query will show the parsed query.

I really question, though, your apparent combination of autoGeneratePhraseQueries with what looks like an ngram field. I'm not at all sure how those would interact...

Best,
Erick
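As a small illustration of the debug=query suggestion, the sketch below builds a select URL that carries that parameter. The host, port, core name (collection1) and the wt=json choice are assumptions made for the example; none of them come from this thread.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port and core name to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def debug_query_url(base_url, query):
    """Return a select URL that asks Solr to include the parsed query."""
    params = {"q": query, "debug": "query", "wt": "json"}
    return f"{base_url}?{urlencode(params)}"


url = debug_query_url(SOLR_SELECT_URL, "rue de la")
print(url)
```

Comparing the parsed query in the debug section of the response for the quoted and the unquoted form of the same input is usually the quickest way to see where the words get split.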
Re: autocomplete_edge type split words
On Fri, Sep 27, 2013 at 7:37 AM, elisabeth benoit elisaelisael...@gmail.com wrote:

Thanks for your answer.

So I guess if someone wants to search on two fields, one with a phrase query and one with a normal query (split into words), one has to find a way to send the query twice: once with quotes and once without...

Best regards,
Elisabeth
Re: autocomplete_edge type split words
2013/9/27 Erick Erickson erickerick...@gmail.com

Have you looked at autoGeneratePhraseQueries? That might help.

If that doesn't work, you can always add an OR clause such as OR "original query" and optionally boost it high. But I'd start with the autoGenerate bits.

Best,
Erick
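A minimal sketch of the "OR clause with a boost" idea, assuming the phrase clause targets a field called autocomplete_edge_field (a hypothetical name) and reusing the boost of 50 that appears elsewhere in this thread. It only builds the query string; how the clause scores depends on the request handler in use.

```python
def with_boosted_phrase(user_query, phrase_field, boost=50):
    """Append an OR clause holding the whole input as a boosted quoted phrase."""
    # Escape backslashes and double quotes so the phrase stays one quoted unit.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    # The first clause is the raw input, which the query parser splits into words.
    return f'({user_query}) OR {phrase_field}:"{escaped}"^{boost}'


q = with_boosted_phrase("rue de la", "autocomplete_edge_field")
print(q)
```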
Re: autocomplete_edge type split words
On Fri, Sep 27, 2013 at 10:12 AM, elisabeth benoit elisaelisael...@gmail.com wrote:

Yes! What I've done is set autoGeneratePhraseQueries to true for my field, then give it a boost (bq=myAutompleteEdgeNGramField=my query with spaces^50). This only worked with autoGeneratePhraseQueries=true, for a reason I didn't understand, since when I did q= myAutompleteEdgeNGramField=my query with spaces, I didn't need autoGeneratePhraseQueries set to true.

Another thing: when I tried q=myAutocompleteNGramField:(my query with spaces) OR myAutompleteEdgeNGramField=my query with spaces (with a request handler using edismax and the default operator set to AND), the request on myAutocompleteNGramField would OR the grams, so I had to add explicit ANDs (myAutocompleteNGramField:(my AND query AND with AND spaces)), which was pretty ugly.

I don't always understand exactly what is going on. If you have a pointer to some text I could read to get more insight into this, please let me know.

Thanks again,
Best regards,
Elisabeth
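If the explicit AND form has to be used, it can at least be generated rather than typed by hand. The helper below is a hypothetical sketch that reproduces the clause quoted in this message; it does no escaping of special characters, so it assumes plain words as input.

```python
def all_words_clause(field, text):
    """Build field:(w1 AND w2 AND ...) from whitespace-separated words."""
    words = text.split()
    return f"{field}:(" + " AND ".join(words) + ")"


clause = all_words_clause("myAutocompleteNGramField", "my query with spaces")
print(clause)
```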
Re: autocomplete_edge type split words
2013/9/27 Erick Erickson erickerick...@gmail.com

This is a classic issue where there's confusion between the query parser and field analysis. Early in the process, the query parser has to take the input and break it up. That's how, for instance, a query like

text:term1 term2

gets parsed as

text:term1 defaultfield:term2

This happens long before the terms get to the analysis chain for the field. So your only options are to either quote the string or escape the spaces.

Best,
Erick
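The two options named above, quoting the string or escaping the spaces, can be sketched on the client side as follows. The field name autocomplete_edge_field is hypothetical, and the second helper escapes spaces only; other special characters of the query syntax are left alone.

```python
def as_phrase(field, text):
    """Quote the whole input so the parser passes it to the field as one unit."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def with_escaped_spaces(field, text):
    """Backslash-escape each space so the parser does not split on it."""
    return f"{field}:" + text.replace(" ", "\\ ")


print(as_phrase("autocomplete_edge_field", "rue de la"))
print(with_escaped_spaces("autocomplete_edge_field", "rue de la"))
```

Either string can be sent as the q parameter; in both cases the field's query analyzer should receive "rue de la" as a single piece of text instead of three separate words.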
autocomplete_edge type split words
On Wed, Sep 25, 2013 at 9:24 AM, elisabeth benoit elisaelisael...@gmail.com wrote:

Hello,

I am using Solr 4.2.1 and I have an autocomplete_edge type defined in schema.xml:

<fieldType name="autocomplete_edge" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="" replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>

When I have a request with more than one word, for instance rue de la, my request doesn't match my autocomplete_edge field unless I use quotes around the query. In other words, q=rue de la doesn't work and q="rue de la" works.

I've checked the request with debugQuery=on, and I can see that in the first case the query is split into words. I don't understand why, since my field type uses KeywordTokenizerFactory.

Does anyone have a clue on how I can query my field without using quotes?

Thanks,
Elisabeth
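To make the symptom easier to reason about, here is a rough Python imitation of the two analyzers described above. It is a sketch under several assumptions: accent folding is approximated with unicodedata instead of the mapping file, the whitespace pattern is taken to be replaced by an empty string, and the indexed value "Rue de la Paix" is an invented example. It is not Solr code.

```python
import re
import unicodedata

MIN_GRAM = 1
MAX_GRAM = 30
# Assumption: runs of whitespace are replaced by an empty string.
WHITESPACE_REPLACEMENT = ""


def fold_accents(text):
    """Rough stand-in for the accent-mapping char filter."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text):
    """Shared steps: keep the input as one token, lowercase it, replace whitespace."""
    token = fold_accents(text).lower()
    return re.sub(r"\s+", WHITESPACE_REPLACEMENT, token)


def index_terms(value):
    """Index side: edge n-grams, i.e. prefixes of the single normalized token."""
    token = normalize(value)
    longest = min(len(token), MAX_GRAM)
    return {token[:n] for n in range(MIN_GRAM, longest + 1)}


def query_term(text):
    """Query side: the normalized token, truncated to MAX_GRAM characters."""
    return normalize(text)[:MAX_GRAM]


indexed = index_terms("Rue de la Paix")

# Quoted or escaped input reaches the analyzer as one piece of text.
whole = query_term("rue de la")
print(whole, whole in indexed)

# Unquoted input is split by the query parser first, so each word is analyzed alone.
for word in "rue de la".split():
    term = query_term(word)
    print(term, term in indexed)
```

Under these assumptions the whole string becomes a prefix of the indexed token and matches, while the separately analyzed words de and la are not prefixes of it, which is consistent with the behaviour reported in this message.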