Re: Problem with caps and star symbol

2011-06-01 Thread Saumitra Chowdhury
Thanks for the pointer. I really was tripping over that issue. But now I need a
bit more help.
As far as I have noticed, for a value like *role_delete*,
WordDelimiterFilterFactory indexes two words, *role* and *delete*, and a search
for either term, *role* or *delete*, will include that document.

Now, for a value like *role_delete*, I want to index all four terms:
[ role_delete, roledelete, role, delete ]. That is, both the original word and
the words produced by WordDelimiterFilterFactory should be indexed.

Is that possible? Can some additional filter combined with
WordDelimiterFilterFactory do that, or can any other filter perform such an
operation?

On Tue, May 31, 2011 at 8:07 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> I think you're tripping over the issue that wildcards aren't analyzed, they
> don't go through your analysis chain. So the casing matters. Try lowercasing
> the input and I believe you'll see more like what you expect...
>
> Best
> Erick
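
Erick's point can be illustrated with a minimal Python sketch. This is a toy model, not Solr code: the names `analyze`, `term_query` and `prefix_query` are hypothetical, and the toy chain (split on `_`, lowercase, add the joined form) is only an assumption chosen so that it behaves like the results reported in this thread. The one thing it tries to show is that a plain term passes through analysis before lookup, while a wildcard term is compared as typed.

```python
def analyze(text):
    """Toy analysis chain: split on '_', lowercase, add the joined form."""
    parts = [p.lower() for p in text.split("_") if p]
    joined = "".join(parts)
    return parts if len(parts) < 2 else parts + [joined]


# Terms the toy index holds for the stored value ROLE_DELETE:
# role, delete, roledelete (all lowercase).
INDEXED_TERMS = set(analyze("ROLE_DELETE"))


def term_query(raw):
    """A plain term is analyzed first, so its casing does not matter."""
    return any(term in INDEXED_TERMS for term in analyze(raw))


def prefix_query(raw):
    """A wildcard term skips analysis: the prefix is compared exactly as typed."""
    prefix = raw.rstrip("*")
    return any(term.startswith(prefix) for term in INDEXED_TERMS)


for query in ("role*", "Role*", "ROLE_DELETE*"):
    print(query, prefix_query(query))
```

In this toy model only `role*` finds the document, because the indexed terms are lowercase and neither `Role` nor `ROLE_DELETE` is a prefix of any of them. Lowercasing the wildcard input on the client side, as Erick suggests, sidesteps the mismatch.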


Re: Problem with caps and star symbol

2011-06-01 Thread Saumitra Chowdhury
It's working the way I was looking for. Thanks, Mr. Erick.

On Wed, Jun 1, 2011 at 8:29 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Take a look here:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
>
> I think you want generateWordParts=1, catenateWords=1 and preserveOriginal=1,
> but check it out with the admin/analysis page.
>
> Oh, and your index-time and query-time patterns for WDFF will probably
> be different, see the example schema.
>
> Best
> Erick
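
As a rough picture of what those three options are meant to produce, here is a small Python sketch. It is a simplified model and not the Solr implementation: the function `word_delimiter` and its keyword arguments are hypothetical stand-ins for the generateWordParts, catenateWords and preserveOriginal attributes, and it splits only on non-alphanumeric characters (the real filter also has rules for case changes and letter/digit transitions, which are ignored here). Lowercasing by a later filter is not modeled either.

```python
import re


def word_delimiter(token, generate_word_parts=True, catenate_words=True,
                   preserve_original=True):
    """Toy model of three WordDelimiterFilterFactory options."""
    # Split only on runs of non-alphanumeric characters (a simplification).
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    out = []
    if preserve_original:
        out.append(token)            # keep the token exactly as it came in
    if generate_word_parts:
        out.extend(parts)            # emit each sub-word on its own
    if catenate_words and len(parts) > 1:
        out.append("".join(parts))   # emit the sub-words joined together
    # Drop duplicates while keeping order (e.g. a token with no delimiter).
    unique = []
    for t in out:
        if t not in unique:
            unique.append(t)
    return unique


print(word_delimiter("role_delete"))
# ['role_delete', 'role', 'delete', 'roledelete']
```

With all three switches on, the toy returns the four terms asked for in the question. The admin/analysis page remains the place to confirm what the real filter chain emits.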


Re: Problem with caps and star symbol

2011-05-30 Thread Saumitra Chowdhury
I am sending some XML to help explain the scenario.

Indexed term = ROLE_DELETE
Search Term = roledelete

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name : roledelete</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="displayName">Global Role for Deletion</str>
      <str name="id">role:9223372036854775802</str>
      <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="name">ROLE_DELETE</str>
    </doc>
  </result>
</response>


Indexed term = ROLE_DELETE
Search Term = role

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">5</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name : role</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="displayName">Global Role for Deletion</str>
      <str name="id">role:9223372036854775802</str>
      <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="name">ROLE_DELETE</str>
    </doc>
  </result>
</response>



Indexed term = ROLE_DELETE
Search Term = role*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name : role*</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="displayName">Global Role for Deletion</str>
      <str name="id">role:9223372036854775802</str>
      <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
      <str name="name">ROLE_DELETE</str>
    </doc>
  </result>
</response>



Indexed term = ROLE_DELETE
Search Term = Role*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name : Role*</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>



Indexed term = ROLE_DELETE
Search Term = ROLE_DELETE*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">name : ROLE_DELETE*</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

I am also adding the analysis page as HTML.



On Mon, May 30, 2011 at 7:19 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> I'd start by looking at the analysis page from the Solr admin page. That
> will give you an idea of the transformations the various steps carry out,
> it's invaluable!
>
> Best
> Erick
> On May 26, 2011 12:53 AM, Saumitra Chowdhury
> <saumi...@smartitengineering.com> wrote:
> > Hi all,
> > In my schema.xml I am using WordDelimiterFilterFactory,
> > LowerCaseFilterFactory and StopFilterFactory for the index analyzer, plus
> > an extra SynonymFilterFactory for the query analyzer. I am indexing a field
> > named 'name'. Now, if an all-caps value like NAME_BILL is indexed, I am
> > able to get it as a search result with the terms name_bill, NAME_BILL,
> > namebill, namebill*, nameb* ... But for terms like the following,
> > NAME_BILL*, name_bill*, namebill*, NAME*, the result does not show this
> > document. Can anyone please explain why this is happening? In fact, the
> > star * gives no results in many cases, especially when it is used after
> > the full value of a field.
> >
> > A portion of my schema is given below.
 
> > <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >   </analyzer>
> > </fieldType>
> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
> >             generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >             catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
> >             generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >             catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >             ignoreCase="true" expand="true"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >   </analyzer>
> > </fieldType>
> > <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
> >             generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >             catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class