GitHub user osma opened a pull request:

    https://github.com/apache/jena/pull/97

    JENA-1062: configurable Lucene analyzer for jena-text

    This is a configurable Analyzer implementation for jena-text / Lucene. It
    is similar to what can be achieved in
    [Solr configuration](https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters).
    The current implementation only supports a few basic tokenizers and filters
    included in Lucene. More can be added later if necessary, though some
    tokenizers and filters require extra configuration parameters and currently
    there is no mechanism for specifying these.
    
    Tokenizers:
    * StandardTokenizer
    * KeywordTokenizer
    * WhitespaceTokenizer
    * LetterTokenizer
    
    Filters:
    * StandardFilter
    * LowerCaseFilter
    * ASCIIFoldingFilter
    
    Configuration can be done in the assembler like this:
    ```
    text:analyzer [
    a text:ConfigurableAnalyzer ;
    text:tokenizer text:KeywordTokenizer ;
    text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)
    ]
    ```
    
    When used directly from Java code, the ConfigurableAnalyzer accepts one
    String parameter specifying the Tokenizer name and another parameter which
    is a List<String> specifying (optional) filters in the order they should be
    applied.
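
    To illustrate why the order of the filter list matters, here is a minimal
    plain-Java sketch of a tokenizer-plus-filters chain driven by the same two
    parameters (a tokenizer name and a List<String> of filter names). It is
    not the code in this pull request and does not call Lucene or jena-text:
    the class AnalyzerChainSketch and its methods are hypothetical, only three
    of the tokenizers and two of the filters are approximated, and the
    ASCIIFoldingFilter stand-in merely strips combining marks, which is
    narrower than what Lucene's filter does.

    ```java
    import java.text.Normalizer;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Locale;

    // Hypothetical stand-in for a configurable analyzer chain (no Lucene involved).
    public class AnalyzerChainSketch {

        // Approximates three of the tokenizers by name.
        static List<String> tokenize(String tokenizer, String text) {
            switch (tokenizer) {
                case "KeywordTokenizer":      // whole input becomes one token
                    return Arrays.asList(text);
                case "WhitespaceTokenizer":   // split on runs of whitespace
                    return nonEmpty(text.split("\\s+"));
                case "LetterTokenizer":       // split on runs of non-letters
                    return nonEmpty(text.split("[^\\p{L}]+"));
                default:
                    throw new IllegalArgumentException("Unsupported tokenizer: " + tokenizer);
            }
        }

        // Approximates two of the filters by name.
        static String applyFilter(String filter, String token) {
            switch (filter) {
                case "LowerCaseFilter":
                    return token.toLowerCase(Locale.ROOT);
                case "ASCIIFoldingFilter":    // assumption: decompose, then drop combining marks
                    return Normalizer.normalize(token, Normalizer.Form.NFD)
                                     .replaceAll("\\p{M}+", "");
                default:
                    throw new IllegalArgumentException("Unsupported filter: " + filter);
            }
        }

        // Tokenizes the text, then applies each filter to every token in list order.
        static List<String> analyze(String tokenizer, List<String> filters, String text) {
            List<String> result = new ArrayList<>();
            for (String token : tokenize(tokenizer, text)) {
                for (String filter : filters) {
                    token = applyFilter(filter, token);
                }
                result.add(token);
            }
            return result;
        }

        // Drops empty strings that String.split can produce at the start of its result.
        private static List<String> nonEmpty(String[] parts) {
            List<String> tokens = new ArrayList<>();
            for (String part : parts) {
                if (!part.isEmpty()) {
                    tokens.add(part);
                }
            }
            return tokens;
        }

        public static void main(String[] args) {
            List<String> filters = Arrays.asList("ASCIIFoldingFilter", "LowerCaseFilter");
            // "Caf\u00e9 Noir" is "Café Noir"; prints [cafe noir]
            System.out.println(analyze("KeywordTokenizer", filters, "Caf\u00e9 Noir"));
        }
    }
    ```

    With the KeywordTokenizer the whole input stays a single token, so the
    two filters together give accent-insensitive and case-insensitive matching
    of complete literals, which mirrors the assembler example above.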

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/osma/jena jena-text-configurable-analyzer

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/97.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #97
    
----
commit 168165a4d2801734fe2551c3c585ba118327e863
Author: Osma Suominen <[email protected]>
Date:   2015-11-04T18:32:03Z

    JENA-1062: configurable Lucene analyzer for jena-text

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
