I meant to say, my analyzer chain now looks like this:
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="[-_]" replacement=" " />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="[^\p{L}\p{Nd}\p{Mn}\p{Mc}\s+]" replacement="" />
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="words.txt" />
<filter
class="org.ctown.solr.analysis.CTConcatFilterFactory" />
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="[-_]" replacement=" " />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="[^\p{L}\p{Nd}\p{Mn}\p{Mc}\s+]" replacement="" />
<tokenizer class="solr.KeywordTokenizerFactory" />
</analyzer>

But only my first document is getting indexed. Is there any logging I can enable to see what is going wrong?
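
For reference, here is a small standalone sketch of what I expect the two charFilter patterns to do to the raw text before tokenizing. It uses only java.util.regex, not Solr itself, so it assumes the char filters behave like a plain replaceAll. The class name CharFilterSketch, the normalize helper, and the sample input are made up for illustration.

```java
import java.util.regex.Pattern;

public class CharFilterSketch {
    // Same pattern as the first charFilter: hyphens and underscores become spaces.
    static final Pattern SEPARATORS = Pattern.compile("[-_]");

    // Same pattern as the second charFilter: drop everything that is not a
    // letter, decimal digit, combining mark, whitespace, or a literal '+'
    // (inside a character class, '+' is a literal, not a quantifier).
    static final Pattern JUNK = Pattern.compile("[^\\p{L}\\p{Nd}\\p{Mn}\\p{Mc}\\s+]");

    // Hypothetical helper: applies both replacements in the configured order.
    static String normalize(String input) {
        String spaced = SEPARATORS.matcher(input).replaceAll(" ");
        return JUNK.matcher(spaced).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(normalize("foo-bar_baz (v2.0)!"));
    }
}
```

Note that because of the `+` inside the brackets, a plus sign in the input survives the second pattern, which may or may not be what is wanted.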
