Hi Daisy,

I can't see anything wrong with the regex or the XML syntax.

One possibility: if it's Arabic you're matching against, you may want to add 
ARABIC FULL STOP U+06D4 to the set you subtract from \p{Punct}.
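
Here is a minimal sketch of that subtraction in plain Java, outside Solr. The 
class and method names are made up for illustration. It assumes that 
\p{Punct} in java.util.regex is the ASCII-only POSIX class unless 
UNICODE_CHARACTER_CLASS is set, so it also unions in \p{P} (the Unicode 
punctuation category) to catch Arabic marks such as U+060C ARABIC COMMA. Also 
note that a second ^ inside a negated class is a literal caret, so the keep-set 
below lists just the characters to preserve.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: strips punctuation from a token
// while keeping '.', ':' and ARABIC FULL STOP (U+06D4).
class PunctStripper {
    // Left operand: ASCII punctuation (\p{Punct}) plus Unicode punctuation (\p{P}).
    // Right operand: a negated class listing the characters to keep.
    // "&&" intersects the two, which subtracts the kept characters.
    private static final Pattern STRIP =
        Pattern.compile("[[\\p{Punct}\\p{P}]&&[^.:\\u06D4]]");

    static String strip(String token) {
        return STRIP.matcher(token).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("a,b.c:d!"));          // ab.c:d
        System.out.println(strip("\u0646\u0639\u0645\u060C")); // Arabic comma removed
        System.out.println(strip("\u0646\u0639\u0645\u06D4")); // Arabic full stop kept
    }
}
```

If this behaves as expected on your sample text, the same pattern string 
(with single backslashes) should be usable as the pattern attribute of 
PatternReplaceFilterFactory.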

If you give an example of your input and your expected output, I might be able 
to help more.

Steve

-----Original Message-----
From: Daisy [mailto:omnia.za...@gmail.com] 
Sent: Monday, September 24, 2012 5:08 AM
To: solr-user@lucene.apache.org
Subject: Solr - Remove specific punctuation marks

Hi;

I am working with apache-solr-3.6.0 on a Windows machine. I would like to
remove all punctuation marks before indexing except the colon and the
full-stop.

I tried:

<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[\p{Punct}&&[^\.^\:]]" replacement="" replace="all"/>
  </analyzer>
</fieldType>
But it didn't work. Any ideas?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Remove-specific-punctuation-marks-tp4009795.html
Sent from the Solr - User mailing list archive at Nabble.com.
