On Feb 16, 2015, at 4:54 PM, Levy, Michael ml...@ushmm.org wrote:
I think you can accomplish what you want by using ICUFoldingFilterFactory
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory
which should simply perform ICU (cf
I know the documents I’m indexing are written in Spanish, and by adding the
following filters to my field definition, I believe I have resolved my problem:
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
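
For context, here is a minimal sketch of how those two filters might sit inside a complete field type in schema.xml. The field type name (text_es), the tokenizer, and the positionIncrementGap value are assumptions for illustration only; they are not taken from the original configuration.

```xml
<!-- Hypothetical field type: the name "text_es", the tokenizer choice, and
     positionIncrementGap are assumptions, not from the original message. -->
<fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the text into word tokens before any filtering -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase every token so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reduce Spanish words to their stems with the Snowball stemmer -->
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>
```

Because a single analyzer element with no type attribute applies at both index and query time, the same chain would process the indexed text and the search terms alike.
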
In other words, my searchable
@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] indexing word documents using solr [diacritics]
How do I retain diacritics in a Solr index, and how do I search for words
containing them?
I have extracted the plain text out of a set of Word documents. I have then used
a Perl interface (WebService::Solr) to add