[Solr Wiki] Update of "SchemaDesign" by Lance Norskog

Apache Wiki Mon, 02 Feb 2009 02:22:21 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.


The following page has been changed by Lance Norskog:
http://wiki.apache.org/solr/SchemaDesign

------------------------------------------------------------------------------
  }}}
  There may be performance differences between this technique and the Lucene 
sorting algorithm.
  
- '''Alternative Text Search Field types'''
+ '''Alternative Text Search Field types'''[[BR]]
  The "text" field type in the example schema.xml provides basic text search 
for English text. But it has a surprise: the text given to this field is not 
indexed as-is, so searching for the raw text may not work. The analyzer stems 
words and drops stopwords before indexing. If you store "To Be Or Not To Be" 
in a "text" field, every word in it is dropped as a stopword, so none of these 
words will find this document, nor will the phrase in quotes.
  
  '''Phrase search'''[[BR]]
  If you want phrase searches to work as well as searches for individual 
words, you need two fields. Both should be processed similarly, but the phrase 
search field should not use "stemming" or "stopwords". Usually you can populate 
this field using the <copyField> directive.
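  
  As a rough sketch of the two-field setup, assuming hypothetical field names 
"body" and "body_exact" and a hypothetical "text_exact" field type (none of 
these names come from the example schema.xml):
  {{{
  <!-- Hypothetical type: tokenize and lowercase only; no stemming, no stopwords -->
  <fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- Stemmed, stopword-filtered field for searches on individual words -->
  <field name="body" type="text" indexed="true" stored="true"/>
  <!-- Unstemmed copy intended for phrase searches -->
  <field name="body_exact" type="text_exact" indexed="true" stored="false"/>

  <!-- Fill the phrase field from the same input at index time -->
  <copyField source="body" dest="body_exact"/>
  }}}
  With a setup like this, a phrase query would target the second field, for 
example body_exact:"to be or not to be", while single-word queries keep using 
"body".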
  
- '''Phonemes'''
+ '''Phonemes'''[[BR]]
  Programmers are perfect spellers and expect the same of their users. A 
phoneme represents (roughly) one distinct sound in a spoken word. Phoneme-based 
searching can give users a better search experience. To support misspelled 
search words, phoneme filters cause the index to store phoneme-based 
representations of the text instead of the input.
  
  To create a phoneme-based field, you need a text filter stack that does not 
include stemming or stopwords and that adds the solr.PhoneticFilterFactory (see 
[AnalyzersTokenizersTokenFilters]) with one of the available encoders. This 
filter must be in both the indexing and query stacks. Of the several encoders 
available, "Double Metaphone" is the most popular and does well with 
non-English text. There are as yet no language-specific phoneme encoders.
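  
  A minimal sketch of such a field type, assuming a hypothetical type name 
"text_phonetic" and a hypothetical field "name_phonetic". The tokenizer choice 
is also an assumption. A single <analyzer> element with no type attribute 
applies to both indexing and querying:
  {{{
  <!-- Hypothetical type: no stemming or stopword filters in the stack -->
  <fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- inject="false" replaces each token with its phonetic code
           instead of indexing the code alongside the original token -->
      <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
    </analyzer>
  </fieldType>

  <field name="name_phonetic" type="text_phonetic" indexed="true" stored="false"/>
  }}}
  Because the same encoder runs on the query, a misspelling that sounds like 
the indexed word should usually map to the same code and match.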
