[jira] [Commented] (JENA-1062) add ConfigurableAnalyzer to jena-text

ASF GitHub Bot (JIRA) Wed, 04 Nov 2015 13:39:59 -0800

    [ https://issues.apache.org/jira/browse/JENA-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990493#comment-14990493 ]


ASF GitHub Bot commented on JENA-1062:
--------------------------------------

Github user rvesse commented on the pull request:

    https://github.com/apache/jena/pull/97#issuecomment-153874598
  
    Looks good to me
    
    One open question: how does this interact with past work on
    language-specific indexing, and with multilingual indexing in general?
    
    It's been a while since I poked around Lucene, but I seem to remember
    that it was often necessary to use alternative analysers, particularly
    for languages with compound words, non-Latin alphabets, symbolic
    alphabets, etc. This shouldn't be a requirement for merging this work;
    I just wanted to check that the current design won't preclude support
    for this in the future.
    
    Looking over the code, it looks like it should be relatively easy to add
    new analysers and filters as needed, but I just wanted to make sure I had
    understood the code correctly.


> add ConfigurableAnalyzer to jena-text
> -------------------------------------
>
>                 Key: JENA-1062
>                 URL: https://issues.apache.org/jira/browse/JENA-1062
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: Text
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>
> This is an alternative to JENA-1058 (which implemented a very specific Lucene 
> Analyzer for jena-text). The idea here, based on a comment by Claude Warren 
> on JENA-1058, is to provide a ConfigurableAnalyzer that can be configured 
> with a Tokenizer and (optionally) one or more TokenFilters, like this:
> text:analyzer [
>   a text:ConfigurableAnalyzer ;
>   text:tokenizer text:KeywordTokenizer ;
>   text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)
> ]
> I have some code ready to implement this and will open a PR shortly.
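
To make the proposed configuration concrete, here is a minimal sketch of what a keyword tokenizer followed by ASCII folding and lower-casing does to an input string. It uses only the Java standard library and is not the jena-text or Lucene implementation: the class name AnalyzerChainSketch and all of its methods are hypothetical, and the folding step is only an approximation based on Unicode decomposition (Lucene's ASCIIFoldingFilter covers more characters than this).

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not part of jena-text or Lucene.
final class AnalyzerChainSketch {

    // Stand-in for a keyword tokenizer: the whole input is one token.
    static String keywordTokenize(String input) {
        return input;
    }

    // Rough stand-in for ASCII folding: decompose accented characters,
    // then drop the combining marks that the decomposition produced.
    static String asciiFold(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Stand-in for a lower-case filter.
    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Tokenize once, then apply each configured filter in list order.
    static String analyze(String input, List<UnaryOperator<String>> filters) {
        String token = keywordTokenize(input);
        for (UnaryOperator<String> filter : filters) {
            token = filter.apply(token);
        }
        return token;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> filters = List.of(
                AnalyzerChainSketch::asciiFold,
                AnalyzerChainSketch::lowerCase);
        // "\u00C5bo Akademi" is "Abo Akademi" with a ring above the A.
        System.out.println(analyze("\u00C5bo Akademi", filters)); // prints "abo akademi"
    }
}
```

Because the filters are passed as an ordered list, adding a language-specific step in this sketch only means appending another function to that list, which mirrors the idea of listing additional filters under text:filters.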



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
