Re: matching starts with only

2014-08-13 Thread zameer
On Solr 3.6, the query black\ cat* (as you suggested in your post) returns
no results. If I query black*\ cat* instead, I get these results:
black forest cat
black cat
black color cat

But I need only results of this type, i.e.:
black cat
black cat is beautiful
black cat and dog 

Note: I am using solr3.6
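
To illustrate the difference between the two queries, here is a minimal Python
sketch. It assumes each field value is indexed as one lowercased term (as
KeywordTokenizer plus LowerCaseFilter would produce) and models the * wildcard
with Python's fnmatch. The sample values and the helper name wildcard_hits are
made up for illustration. It only models the intended matching semantics and
does not explain why black\ cat* returned nothing on Solr 3.6.

```python
from fnmatch import fnmatchcase

# Hypothetical indexed values; each one is treated as a single lowercased term.
TERMS = [
    "black cat",
    "black cat is beautiful",
    "black cat and dog",
    "black forest cat",
    "black color cat",
]

def wildcard_hits(pattern, terms=TERMS):
    """Return the terms that match a shell-style wildcard pattern as a whole."""
    return [t for t in terms if fnmatchcase(t, pattern)]

# One trailing wildcard: only values that begin with "black cat".
starts_with_hits = wildcard_hits("black cat*")

# A wildcard after "black" as well: anything may sit between "black" and " cat".
loose_hits = wildcard_hits("black* cat*")
```

In this model the extra wildcard after black is what lets "black forest cat"
and "black color cat" through.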


--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430p4152662.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: matching starts with only

2014-08-13 Thread Erick Erickson
I'd recommend that you spend some time with the
admin/analysis page.

KeywordTokenizer doesn't break up the input at _all_. So
the text "this is a black cat" will never match a query that
starts with "black". The string type is even more restrictive: it not only
doesn't tokenize, it can't be lowercased either.

You haven't articulated the use case you're really trying to support.
Is it a requirement that you always match from the left? I.e., if the
text is "this is a black cat", you don't want to match on "black cat" but
require "this is a black cat"? If so, try EdgeNGramTokenizer.

Best,
Erick
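
A rough sketch of the edge n-gram idea mentioned above, in plain Python and not
Solr code. It assumes the whole field value is kept as one string and that only
the index side is n-grammed, with the query lowercased but otherwise left
intact. The function names and the min_gram/max_gram defaults are hypothetical.

```python
def edge_ngrams(value, min_gram=1, max_gram=25):
    """Return the front-anchored prefixes of the whole lowercased value."""
    text = value.lower()
    longest = min(len(text), max_gram)
    return {text[:n] for n in range(min_gram, longest + 1)}

def starts_with(doc, query):
    # The query is lowercased but NOT n-grammed; it must equal one indexed prefix.
    return query.lower() in edge_ngrams(doc)

hit = starts_with("black cat is beautiful", "black cat")
miss = starts_with("this is a black cat", "black cat")
```

Because every indexed term is a prefix of the full value, a plain term query
can only match from the left. In this model a query longer than max_gram never
matches.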



Re: matching starts with only

2014-08-06 Thread zameer
Searching for black* alone works, but searching for black cat*,
(black cat)* or (black cat*)* returns nothing.

<fieldType name="text_general_long" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field indexed="true" name="my_name" stored="true"
       type="text_general_long"/>

Thank you in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430p4151379.html


Re: matching starts with only

2014-08-06 Thread Erick Erickson
Right, this is a quirk of phrase queries. For wildcards to work in phrase
queries you need SOLR-1604 (ComplexPhraseQueryParser).

Or you need to escape your spaces, i.e.
black\ cat*
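
A rough illustration of why the escape matters, under the assumption that the
query parser splits its input on unescaped whitespace before building clauses.
split_clauses is a hypothetical helper written for this example, not part of
Solr or Lucene.

```python
import re

def split_clauses(query):
    """Split a query on whitespace that is not preceded by a backslash."""
    parts = re.split(r"(?<!\\)\s+", query.strip())
    # Drop the escaping backslash so each clause shows the literal text searched.
    return [re.sub(r"\\(.)", r"\1", p) for p in parts]

unescaped = split_clauses("black cat*")
escaped = split_clauses(r"black\ cat*")
```

Without the escape the input becomes two separate clauses, black and cat*, and
neither can equal a whole-value term such as "black cat is beautiful". With the
escape it stays one prefix clause, black cat*.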

Best,
Erick



Re: matching starts with only

2013-10-10 Thread adm1n
I've changed the field to the string type, the default one defined in
schema.xml, and I got what I needed.


thanks for your time.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430p4094637.html


Re: matching starts with only

2013-10-10 Thread Erick Erickson
Be aware that the string type is not analyzed in any way,
so your searches are case sensitive. There's a lowercase
type in the example schema.xml that combines
KeywordTokenizer with LowerCaseFilter for case-insensitive
searches that you might find useful.

Besides regex, this might be a good place for wildcards, e.g. just
black*.

Best,
Erick


matching starts with only

2013-10-09 Thread adm1n
My index contains documents that are either a single word or a short sentence
of up to 4-5 words. I need to return only the documents that start
with the searched pattern.
In regex terms it would be ^my_query.

For example, for the docs:

black
beautiful black cat
cat
cat is black
black cat

and for the query: black

only black and black cat should be returned.
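
The requirement above can be written down as a small Python check. This is only
a statement of the desired behaviour, not of how Solr evaluates a query. The
function name starts_with_only is made up for this example, and ignoring case
is an assumption.

```python
DOCS = ["black", "beautiful black cat", "cat", "cat is black", "black cat"]

def starts_with_only(docs, query):
    """Keep documents whose whole text begins with the query, ignoring case."""
    q = query.lower()
    return [d for d in docs if d.lower().startswith(q)]

wanted = starts_with_only(DOCS, "black")
```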

The text field I'm using is as follows:
<fieldType name="text_general_aa" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="4"
            maxGramSize="15" side="front"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="4"
            maxGramSize="15" side="front"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Solr version is 4.2

thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430.html


Re: matching starts with only

2013-10-09 Thread Shawn Heisey

The presence of either the whitespace tokenizer or the NGram filter makes
this impossible, because they both break the indexed value into smaller
pieces.  Together, they *really* break things up.  Matching is done on a
per-term basis, and these two components in your analysis chain ensure
that "black" will be a term for all of those input documents, whether it
appears at the beginning, middle, or end.
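
The effect described above can be approximated in a few lines of Python. This
is a simplified model, not Solr's analysis code: it assumes the filter emits
front-edge n-grams of each whitespace-separated token, as the side=front
attribute in the posted config suggests. If the filter emits every n-gram
instead, the outcome for "black" is the same, since the front grams are a
subset. The analyze function is hypothetical.

```python
def analyze(value, min_gram=4, max_gram=15):
    """Whitespace-split, then emit front n-grams of each lowercased token."""
    terms = set()
    for token in value.split():
        token = token.lower()
        for n in range(min_gram, min(len(token), max_gram) + 1):
            terms.add(token[:n])
    return terms

DOCS = ["black", "beautiful black cat", "cat", "cat is black", "black cat"]

# A document matches when the query term is one of its indexed terms,
# regardless of where the original word sat in the text.
matches = [d for d in DOCS if "black" in analyze(d)]
```

Every document containing the word ends up with the term "black", so position
information about "starts with" is lost. Note also that "cat" yields no terms
at all in this model because it is shorter than minGramSize=4.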


If you set up a copyField to a new field whose fieldType uses the
Keyword tokenizer (which treats the entire string as a single token) and
the lowercase filter, you would be able to use the regex support in Solr
4.x and have this as your query string:


newfield:/^black/

Thanks,
Shawn



Re: matching starts with only

2013-10-09 Thread adm1n
Shawn Heisey-4:

Thanks for the quick response.

Why does this field have to be a copyField? Couldn't it be a single field, for
example:

<fieldType name="text_general_long" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="my_name" type="text_general_long" stored="true"
       multiValued="false" required="false"/>



thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430p4094447.html


Re: matching starts with only

2013-10-09 Thread Shawn Heisey

On 10/9/2013 2:16 PM, adm1n wrote:
> Why this field have to be copyField? Couldn't it be a single field, for

I always assume that people already are using the existing field and 
type for other purposes.  Offering advice without making that assumption 
will usually result in people making a change and then complaining that 
something else no longer works.


If you don't need what you already have for something else, then you 
could change the type on the existing field with no problem.


Thanks,
Shawn



Re: matching starts with only

2013-10-09 Thread adm1n
Searching by "starts with" is a new feature I have to add, and the data I
have to index for it is new as well, so it's OK to create a new field.

But I added the following field type:

<fieldType name="text_general_long" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and this field:

<field name="my_name" type="text_general_long" stored="true"
       multiValued="false" required="false"/>

After indexing, searching by my_name:/^black/ returns no results,
while searching by my_name:black returns only the "black" document.

What am I missing?
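
One possible explanation, stated here as an assumption and not confirmed
anywhere in this thread: Lucene's regular-expression queries match against the
entire term and do not treat ^ as a start anchor, so /^black/ would look for a
term that is literally "^black". The Python sketch below imitates that
behaviour with re.fullmatch. The helper whole_term_regex_hits is hypothetical,
and the sample terms are the documents from the original post.

```python
import re

TERMS = ["black", "beautiful black cat", "cat", "cat is black", "black cat"]

def whole_term_regex_hits(pattern, terms=TERMS):
    # Assumptions: the pattern must match the entire term, and "^" / "$" are
    # ordinary characters rather than anchors. Character classes such as
    # [^a] are not handled by this simplification.
    literal = pattern.replace("^", r"\^").replace("$", r"\$")
    return [t for t in terms if re.fullmatch(literal, t)]

anchored = whole_term_regex_hits("^black")
prefix = whole_term_regex_hits("black.*")
```

Under that assumption, my_name:/black.*/ would be the query to try, since the
match is already pinned to the start of the single whole-value term.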

thanks. 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/matching-starts-with-only-tp4094430p4094453.html