Re: Spellchecking Escaped Queries

2011-04-04 Thread Colin Vipurs
Thanks Chris, 

The same field is used for indexing and spellchecking, and it is configured
like this:


<fieldType name="title" stored="true" indexed="true" multiValued="false"
           class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^([^!]+)\!([^!]+)$"
            replacement="$1i$2"
            replace="all"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="0"
            catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>


I use the pattern replace filter to swap instances of ! within a
word for i.  I know this part is working correctly because searches
return the expected results.
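
As a rough illustration of what that filter's regex does to individual tokens, here is a minimal Python sketch. It only emulates the pattern and replacement from the config above using Python's re module; it is not Solr code, and the helper name replace_bang is made up for this example. One thing it suggests is that the pattern, as written, only matches a token containing exactly one ! with at least one character on each side.

```python
import re

# Same regex as the PatternReplaceFilterFactory config above.
PATTERN = re.compile(r"^([^!]+)\!([^!]+)$")


def replace_bang(token: str) -> str:
    """Emulate replacement=$1i$2 on a single token (hypothetical helper)."""
    # \g<1> and \g<2> are Python's spelling of Java's $1 and $2.
    return PATTERN.sub(r"\g<1>i\g<2>", token)


if __name__ == "__main__":
    # Tokens with zero or several bangs, or a leading bang, are left alone.
    for token in ["p!nk", "r!nk", "pink", "p!n!k", "!nk"]:
        print(token, "->", replace_bang(token))
```

Tokens that do not match the pattern pass through unchanged, which is also how a pattern replace filter would be expected to treat them.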

The spellcheck is initialized like this:


<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">title</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">searchfield</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>


This is attached as a component to my search handler and spellchecking
is done inline with the queries.
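
For readers who want to reproduce an inline spellcheck request against a setup like this, the following Python sketch builds the request URL. The host, port, handler path and the helper name spellcheck_url are assumptions for illustration, as is the backslash-escaped form of the query; spellcheck, spellcheck.collate and spellcheck.build are standard SpellCheckComponent request parameters.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port and handler path are assumptions.
BASE_URL = "http://localhost:8983/solr/select"


def spellcheck_url(query: str, build: bool = False) -> str:
    """Build a search URL that asks for spelling suggestions inline."""
    params = {
        "q": query,
        "spellcheck": "true",
        "spellcheck.collate": "true",
    }
    if build:
        # buildOnCommit is false in the config above, so the spellcheck
        # index has to be (re)built explicitly at some point.
        params["spellcheck.build"] = "true"
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed escaped form of the query term p!nk.
    print(spellcheck_url("p\\!nk"))
    print(spellcheck_url("pink", build=True))
```

Passing build=True once after indexing should be enough to populate the spellcheck index; ordinary queries can then leave it off.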

Thanks,

Colin



> : I'm having an issue performing a spellcheck on some information and
> : search of the archive isn't helping.
>
> For this type of question, there's not much feedback anyone can offer w/o
> knowing exactly what analyzers you have configured for the various
> fieldtypes (both the field you index/search and the fieldtype used for
> spellchecking).
>
> it's also fairly critical to know how you have the spellcheck component
> configured.
>
> off the cuff: i'd guess that maybe WordDelimiterFilter is being used in a
> wonky way given your usecase -- but like i said: would need to see the
> configs to make a guess.
>
>
> -Hoss
 


-- 


Colin Vipurs
Server Team Lead

Shazam Entertainment Ltd   
26-28 Hammersmith Grove, London W6 7HA
m:   +44 (0)  000 000   t: +44 (0) 20 8742 6820
w:www.shazam.com

Please consider the environment before printing this document

This e-mail and its contents are strictly private and confidential. It
must not be disclosed, distributed or copied without our prior consent.
If you have received this transmission in error, please notify Shazam
Entertainment immediately on: +44 (0) 020 8742 6820 and then delete it
from your system. Please note that the information contained herein
shall additionally constitute Confidential Information for the purposes
of any NDA between the recipient/s and Shazam Entertainment. Shazam
Entertainment Limited is incorporated in England and Wales under company
number 3998831 and its registered office is at 26-28 Hammersmith Grove,
London W6 7HA. 






__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
__

Re: Spellchecking Escaped Queries

2011-04-04 Thread Colin Vipurs
Apologies for the duplicate post.  I'm having problems with Evolution.



Re: Spellchecking Escaped Queries

2011-04-02 Thread Chris Hostetter

: I'm having an issue performing a spellcheck on some information and
: search of the archive isn't helping.

For this type of question, there's not much feedback anyone can offer w/o 
knowing exactly what analyzers you have configured for the various 
fieldtypes (both the field you index/search and the fieldtype used for 
spellchecking).

it's also fairly critical to know how you have the spellcheck component 
configured.

off the cuff: i'd guess that maybe WordDelimiterFilter is being used in a 
wonky way given your usecase -- but like i said: would need to see the 
configs to make a guess.


-Hoss


Spellchecking Escaped Queries

2011-03-21 Thread Colin Vipurs
I'm having an issue performing a spellcheck on some information, and a
search of the archive isn't helping.

I'm indexing the word p!nk (yes, that's a bang in there), and have a
replacement filter set up so that the ! becomes i.  Looking at the
analyzer, the right thing is happening, with both the indexer and query
mapping to pink.  When I switch on spelling suggestions I get a
suggestion of p!pink, which just seems odd.

When I make a request for something like rink, I get the correct
suggestion of pink, but asking for r!nk, I get a suggestion of
r!pink.  It seems like the spellcheck component isn't quite doing the
right thing somewhere.
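
One way a suggestion like p!pink could arise is sketched below in Python. This is a guess at the mechanism, not a confirmed diagnosis: it assumes the spellchecker pulls terms out of the raw query string with a word-character regex before any field analysis runs (which, as far as I understand it, is roughly how a regex-based query converter such as Solr's SpellingQueryConverter behaves), so p!nk is seen as the two terms p and nk. The suggestion table and the function name collate are invented for illustration.

```python
import re

# Assumed term extraction: runs of word characters in the raw query.
WORD = re.compile(r"\w+")

# Hypothetical suggestions a spellchecker might return per extracted term.
SUGGESTIONS = {"nk": "pink", "rink": "pink"}


def collate(query: str, suggestions=SUGGESTIONS) -> str:
    """Splice per-term suggestions back into the original query string."""
    out = []
    last = 0
    for match in WORD.finditer(query):
        # Keep whatever sat between terms (here, the bang) untouched.
        out.append(query[last:match.start()])
        term = match.group()
        out.append(suggestions.get(term, term))
        last = match.end()
    out.append(query[last:])
    return "".join(out)


if __name__ == "__main__":
    for q in ["rink", "p!nk", "r!nk"]:
        print(q, "->", collate(q))
```

Under these assumptions rink collates to pink, while p!nk and r!nk collate to p!pink and r!pink, which matches the behaviour described above. If this is what is happening, the analysis chain on the field would not be the culprit, since the split would occur before the analyzer sees the term.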

I'm running Solr 1.4.1 with the
https://issues.apache.org/jira/browse/SOLR-1553 patch applied for the
edismax query parser.

Thanks,

Colin