Re: autocomplete_edge type split words

2013-09-30 Thread elisabeth benoit
In fact, I've removed autoGeneratePhraseQueries=true, and it doesn't
change anything. The behaviour is the same with or without it (i.e. the
parsed query shown with debugQuery=on is the same).

Thanks for your comments.

Best,
Elisabeth





Re: autocomplete_edge type split words

2013-09-28 Thread Erick Erickson
You've probably been doing this all along, but adding
debug=query will show the parsed query.

I do question, though, your apparent combination of
autoGeneratePhraseQueries with what looks like an ngram field.
I'm not at all sure how those would interact...

Best,
Erick




Re: autocomplete_edge type split words

2013-09-27 Thread elisabeth benoit
Thanks for your answer.

So I guess if someone wants to search on two fields, one with a phrase query
and one with a normal query (split into words), one has to find a way to
send the query twice: once with quotes and once without...

Best regards,
Elisabeth





Re: autocomplete_edge type split words

2013-09-27 Thread Erick Erickson
Have you looked at autoGeneratePhraseQueries? That might help.

If that doesn't work, you can always add an OR clause like
  OR "original query"
and optionally boost it high. But I'd start with the autoGenerate bits.
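
A minimal sketch of the OR-clause idea, assuming the client builds the query string itself. The field name `my_edge_field` and the boost of 50 are placeholders chosen for illustration, and the helper name is hypothetical.

```python
def with_boosted_phrase(user_input, phrase_field, boost=50):
    """Append an OR clause that matches the whole input as one quoted phrase.

    Assumption: user_input is passed through unescaped in the first clause,
    so query-syntax characters typed by the user would still be interpreted.
    """
    # Escape backslashes and double quotes so the phrase stays a single quoted unit.
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    return f'({user_input}) OR {phrase_field}:"{escaped}"^{boost}'


q = with_boosted_phrase("rue de la", "my_edge_field")
print(q)
```

The first clause is still split into words by the parser; only the added clause reaches the field's analysis chain as a single string.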

Best,
Erick





Re: autocomplete_edge type split words

2013-09-27 Thread elisabeth benoit
Yes!

What I've done is set autoGeneratePhraseQueries to true for my field, then
give it a boost (bq=myAutompleteEdgeNGramField:"my query with spaces"^50).
This only worked with autoGeneratePhraseQueries=true, for a reason I didn't
understand, since when I did

q=myAutompleteEdgeNGramField:"my query with spaces"

I didn't need autoGeneratePhraseQueries set to true.

Another thing: when I tried

q=myAutocompleteNGramField:(my query with spaces) OR myAutompleteEdgeNGramField:"my query with spaces"

(with an edismax request handler and the default operator set to AND), the
request on myAutocompleteNGramField would OR the grams, so I had to put in
explicit ANDs (myAutocompleteNGramField:(my AND query AND with AND spaces)),
which was pretty ugly.
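
One way to keep the ugliness out of hand-written requests is to generate the clauses in client code. The sketch below is only an illustration: the field names are the ones used in this message, the helper names are made up, and the parameters are limited to defType, q, bq and debugQuery.

```python
from urllib.parse import urlencode


def and_terms(field, user_input):
    """Join the whitespace-separated words with explicit ANDs inside one field group."""
    return f"{field}:(" + " AND ".join(user_input.split()) + ")"


def phrase_clause(field, user_input, boost=None):
    """Quote the whole input so the query parser hands it to the field as one string."""
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    clause = f'{field}:"{escaped}"'
    return clause if boost is None else f"{clause}^{boost}"


def build_params(user_input):
    """Request parameters for an edismax handler (a sketch, not a tuned configuration)."""
    return {
        "defType": "edismax",
        "q": and_terms("myAutocompleteNGramField", user_input),
        "bq": phrase_clause("myAutompleteEdgeNGramField", user_input, boost=50),
        "debugQuery": "on",
    }


params = build_params("my query with spaces")
query_string = urlencode(params)  # percent-encodes quotes, spaces, colons and '^'
print(query_string)
```

The resulting string can be appended to the select URL of the core; the host and handler path are left out here because they are not given in the thread.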

I don't always understand exactly what is going on. If you have a pointer
to some text I could read to get more insight into this, please let me
know.

Thanks again,
Best regards,
Elisabeth







Re: autocomplete_edge type split words

2013-09-26 Thread Erick Erickson
This is a classic issue where there's confusion between
the query parser and field analysis.

Early in the process the query parser has to take the input
and break it up. That's how, for instance, a query like
text:term1 term2
gets parsed as
text:term1 defaultfield:term2
This happens long before the terms get to the analysis chain
for the field.

So your only options are to either quote the string or
escape the spaces.
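
To make the ordering concrete, here is a toy imitation of that first parsing pass. It is not the Lucene parser, only a rough sketch that honours quotes and backslash-escaped characters and splits everything else on whitespace; the function names are made up for this example.

```python
import re

# Optional "field:" prefix, then either a quoted phrase or a run of
# escaped / non-space characters.
_CHUNK = re.compile(r'(?:(\w+):)?("(?:[^"\\]|\\.)*"|(?:\\.|[^\s"])+)')


def toy_parse(query, default_field="defaultfield"):
    """Return (field, text) clauses the way a whitespace-splitting parser would."""
    return [(field or default_field, text) for field, text in _CHUNK.findall(query)]


def quote_phrase(text):
    """Option 1: wrap the input in double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def escape_spaces(text):
    """Option 2: put a backslash in front of every whitespace character."""
    return re.sub(r"(\s)", r"\\\1", text)


print(toy_parse("text:term1 term2"))
print(toy_parse("rue de la"))
print(toy_parse(quote_phrase("rue de la")))
print(toy_parse(escape_spaces("rue de la")))
```

Only the quoted and escaped forms reach the field as a single clause, which is the precondition for KeywordTokenizerFactory to see the whole string.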

Best,
Erick



autocomplete_edge type split words

2013-09-25 Thread elisabeth benoit
Hello,

I am using Solr 4.2.1 and I have an autocomplete_edge type defined in
schema.xml:


<fieldType name="autocomplete_edge" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement=" " replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>

When I have a request with more than one word, for instance "rue de la", my
request doesn't match my autocomplete_edge field unless I use quotes
around the query. In other words, q=rue de la doesn't work and q="rue de la"
works.

I've checked the request with debugQuery=on, and I can see that in the first
case the query is split into words. I don't understand why, since my field
type uses KeywordTokenizerFactory.
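
The effect can be reproduced outside Solr with a rough simulation of the two analysis chains. This is a sketch under assumptions: the accent folding uses Unicode decomposition as a stand-in for mapping-ISOLatin1Accent.txt, the indexed value "Rue de la Paix" is invented for the example, and every clause is assumed to be required.

```python
import re
import unicodedata

MAX_GRAM = 30  # maxGramSize, and the 30-character cut in the query analyzer


def fold_accents(text):
    """Approximation of the accent-mapping char filter."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text):
    """Keyword tokenizer (one token), lowercase, collapse whitespace runs."""
    return re.sub(r"\s+", " ", fold_accents(text).lower())


def index_terms(value):
    """Index side: front edge n-grams of the single normalized token."""
    token = normalize(value)
    return {token[:n] for n in range(1, min(len(token), MAX_GRAM) + 1)}


def query_term(chunk):
    """Query side: same normalization, truncated to 30 characters, no n-grams."""
    return normalize(chunk)[:MAX_GRAM]


indexed = index_terms("Rue de la Paix")


def matches_all(chunks):
    """True when every chunk handed over by the query parser hits an indexed gram."""
    return all(query_term(chunk) in indexed for chunk in chunks)


print(matches_all(["rue de la"]))        # quoted: one chunk reaches the field
print(matches_all(["rue", "de", "la"]))  # unquoted: split before analysis
```

Every indexed gram starts at the beginning of the value, so the separate chunks "de" and "la" can never match on their own; only the unsplit string "rue de la" is a prefix of the indexed token.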

Does anyone have a clue on how I can request my field without using quotes?

Thanks,
Elisabeth