Re: Pattern matching in Solr

2009-08-27 Thread bhaskar chandrasekar
 
Hi,
 
In the schema.xml file, I am not able to find splitOnCaseChange=1.
I am not looking for a case-sensitive search.
Could you let me know which file you are referring to?
I am looking for exact-match search only.

Also, for scenario 2, which page on the Solr wiki covers
KeywordTokenizerFactory and EdgeNGramFilterFactory?
 
Regards
Bhaskar

--- On Wed, 8/26/09, Avlesh Singh avl...@gmail.com wrote:


From: Avlesh Singh avl...@gmail.com
Subject: Re: Pattern matching in Solr
To: solr-user@lucene.apache.org
Date: Wednesday, August 26, 2009, 11:31 AM


You could have used your previous thread itself (
http://www.lucidimagination.com/search/document/31c1ebcedd4442b/exact_pattern_search_in_solr),
Bhaskar.

In your scenario one, you need an exact token match, right? The results you
are getting are expected if your field type is text. Look for the
WordDelimiterFilterFactory in the definition of the text field type
inside schema.xml. You'll find an attribute splitOnCaseChange="1".
Because of this, ChandarBhaskar is split into two tokens, Chandar and
Bhaskar, hence the matches. If this behaviour is not desired, you can turn
it off by setting the attribute to 0.
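
As a rough sketch of what to look for: the text field type in a stock
schema.xml contains a filter line along the lines below. The tokenizer and the
other attribute values shown here are illustrative assumptions and may differ
in your copy; the relevant part is splitOnCaseChange.

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splitOnCaseChange="1" breaks ChandarBhaskar into Chandar + Bhaskar,
         which is why a search for Bhaskar matches it.
         Change the value to "0" to keep such words as one token. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

As far as I recall, the filter splits on case changes by default, so setting
the value to 0 explicitly is safer than deleting the attribute. Documents that
are already indexed keep their old tokens until they are reindexed.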

For your scenario two, you may want to look at the KeywordTokenizerFactory
and EdgeNGramFilterFactory on the Solr wiki.
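
A minimal sketch of such a field type might look like this. The type name
text_prefix and the gram sizes are made-up values for illustration only.
KeywordTokenizerFactory keeps the whole value as a single token, and
EdgeNGramFilterFactory then indexes its leading substrings, so a query for
bhaskar would match Bhaskar, BhaskarC and Bhaskarabc.

```xml
<fieldType name="text_prefix" class="solr.TextField">
  <!-- Index time: whole value as one token, lowercased, then expanded
       into leading substrings (bh, bha, bhas, ...). -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <!-- Query time: no n-grams, so the query term is compared as typed
       (lowercased) against the indexed substrings. -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that edge n-grams start from one end of the token only. Matching
ChandarBhaskar as well (the full %BHASKAR% case) would probably need
NGramFilterFactory instead, which takes the same minGramSize and maxGramSize
attributes and indexes substrings from every position.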

Generally, for all such use cases people create multiple fields in their
schema storing the same data analyzed in different ways.
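
For example, an exact-match copy of a name field could be set up roughly as
follows. The field and type names (name, name_exact, text_exact) are
hypothetical; only the factory classes and the copyField element are standard
Solr.

```xml
<!-- Goes with the other field types: the whole value is one token,
     lowercased, so matching is exact but case-insensitive. -->
<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Goes with the other fields: same data, analyzed two ways. -->
<field name="name"       type="text"       indexed="true" stored="true"/>
<field name="name_exact" type="text_exact" indexed="true" stored="false"/>

<!-- Copies the raw value of name into name_exact at index time. -->
<copyField source="name" dest="name_exact"/>
```

With something like this in place, a query on name_exact:bhaskar should return
only the document whose name is exactly Bhaskar, while the name field keeps
its looser matching.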

Cheers
Avlesh


Re: Pattern matching in Solr

2009-08-27 Thread Avlesh Singh

 In the schema.xml file, I am not able to find splitOnCaseChange=1.

Unless you have modified the stock definition of the text field type in
your core's schema.xml, you should be able to find this attribute set on the
WordDelimiterFilterFactory. Read more here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-1c9b83870ca7890cd73b193cefed83c283339089

 Also, for scenario 2, which page on the Solr wiki covers
 KeywordTokenizerFactory and EdgeNGramFilterFactory?

Google for these two.

Cheers
Avlesh


Re: Pattern matching in Solr

2009-08-27 Thread bhaskar chandrasekar
Hi,
 
In the schema.xml file, I am not able to find splitOnCaseChange=1.
I am not looking for a case-sensitive search.
Could you let me know which file you are referring to?
I am looking for exact-match search only.

Also, for scenario 2, which page on the Solr wiki covers
KeywordTokenizerFactory and EdgeNGramFilterFactory?
 
Regards
Bhaskar




Pattern matching in Solr

2009-08-26 Thread bhaskar chandrasekar
Hi,
 
Can anyone help me with the scenarios below?

I am using Carrot with Solr. Carrot is only for front-end display purposes.

Scenario 1:

Assume that I give an input string, for example Google.
The issue is this: assuming I give BHASKAR as the input string,
it should give me search results pertaining to BHASKAR only.
 Select * from MASTER where name = 'Bhaskar';
 Example: it should not display search results such as ChandarBhaskar or
 BhaskarC.
 It should display Bhaskar only.

Scenario 2:
 Select * from MASTER where name like '%BHASKAR%';
 It should display records containing the word BHASKAR.
 Ex: Bhaskar
     ChandarBhaskar
     BhaskarC
     Bhaskarabc

How do I achieve Scenario 1 in Solr?


 
Regards
Bhaskar