[ https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Sekiguchi updated SOLR-1653:
---------------------------------

    Attachment: SOLR-1653.patch

My apologies: I introduced my own syntax because, when I started the first patch, I was trying to correct offsets per group within a match. Now that I've implemented offset correction per match, I can use the standard syntax. Here is the new patch.

Usage:

{code:title=schema.xml}
<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="([nN][oO]\.)\s*(\d+)" replaceWith="$1$2"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}

If there are no objections, I'll commit later today.

> add PatternReplaceCharFilter
> ----------------------------
>
>                 Key: SOLR-1653
>                 URL: https://issues.apache.org/jira/browse/SOLR-1653
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression to select the target string to replace in the char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}
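
For anyone who wants to check what the pattern="([nN][oO]\.)\s*(\d+)" and replaceWith="$1$2" pair from the updated usage does to text, here is a minimal sketch using plain java.util.regex. It is not code from the patch: the class name CharNormSketch and the helper normalize are made up for illustration, and it only reproduces the string replacement, not the offset correction that the CharFilter is meant to perform.

```java
import java.util.regex.Pattern;

public class CharNormSketch {
    // Same regex as the pattern attribute in the schema.xml usage
    // (backslashes doubled for a Java string literal).
    private static final Pattern NO_PATTERN =
            Pattern.compile("([nN][oO]\\.)\\s*(\\d+)");

    // Hypothetical helper: applies the "$1$2" replacement, which keeps
    // group 1 ("No.") and group 2 (the digits) and drops the whitespace
    // matched between them.
    public static String normalize(String input) {
        return NO_PATTERN.matcher(input).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        // Prints: see No.123 and no.45
        System.out.println(normalize("see No. 123 and no.   45"));
    }
}
```

Text that does not match the pattern, such as "Nobody", passes through unchanged.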