[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter

Koji Sekiguchi (JIRA) Mon, 14 Dec 2009 18:53:45 -0800

     [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Koji Sekiguchi updated SOLR-1653:
---------------------------------

    Attachment: SOLR-1653.patch

Apologies. When I started the first patch I tried to correct offsets per group 
within a match, so I introduced my own syntax. Now that I've implemented offset 
correction per match instead, I can use the standard regex replacement syntax. 
Here is the new patch.
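
For illustration only, here is a minimal sketch of what per-match offset correction can look like. It is not the code in SOLR-1653.patch: the class name PerMatchOffsetSketch and its methods are hypothetical, and it uses plain java.util.regex on a String rather than a char stream. The idea is to record one cumulative length difference at the end of each replaced match, so an offset in the replaced text can be mapped back to the original text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the implementation attached to this issue.
public class PerMatchOffsetSketch {
    private final String output;
    // Each entry is {outputOffset, cumulativeDiff}: from outputOffset onward,
    // inputOffset = outputOffset + cumulativeDiff.
    private final List<int[]> corrections = new ArrayList<>();

    public PerMatchOffsetSketch(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Appends the unmatched text before this match, then the
            // replacement with $1, $2, ... expanded.
            m.appendReplacement(sb, replacement);
            // One correction per match, recorded at the end of the replaced text.
            corrections.add(new int[] { sb.length(), m.end() - sb.length() });
        }
        m.appendTail(sb);
        output = sb.toString();
    }

    public String output() {
        return output;
    }

    // Map an offset in the replaced text back to an offset in the original text.
    public int correctOffset(int outputOffset) {
        int diff = 0;
        for (int[] c : corrections) {
            if (c[0] > outputOffset) {
                break;
            }
            diff = c[1];
        }
        return outputOffset + diff;
    }
}
```

With the input "see No.  42 and no. 7" and the pattern and replacement from the usage below, the output is "see No.42 and no.7"; the end offset 9 of "No.42" maps back to 11, and the end offset 18 of "no.7" maps back to 21. Offsets that fall inside a replaced span are only approximated by this sketch.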

Usage:
{code:title=schema.xml}
<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="([nN][oO]\.)\s*(\d+)"
                replaceWith="$1$2"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
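
To show what the pattern and replaceWith values in this configuration do to text, here is a small standalone sketch using java.util.regex. It only demonstrates the regular expression; the class name PatternReplaceSketch and the method normalize are made up for the example, and it does not go through Solr or perform any offset correction.

```java
import java.util.regex.Pattern;

// Hypothetical helper; demonstrates the regex only, not the char filter.
public class PatternReplaceSketch {
    // Same expression as the pattern attribute, escaped for a Java string.
    private static final Pattern PATTERN =
            Pattern.compile("([nN][oO]\\.)\\s*(\\d+)");

    // Drops any whitespace between "No." (in any letter case) and the digits
    // that follow it by keeping only groups 1 and 2 of each match.
    public static String normalize(String text) {
        return PATTERN.matcher(text).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(normalize("see No.  42 and no. 7"));
    }
}
```

Running main prints "see No.42 and no.7", so a whitespace tokenizer downstream would see "No.42" and "no.7" as single tokens.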

If there are no objections, I'll commit later today.

> add PatternReplaceCharFilter
> ----------------------------
>
>                 Key: SOLR-1653
>                 URL: https://issues.apache.org/jira/browse/SOLR-1653
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression to select the text to 
> replace in the char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
