[jira] [Commented] (JENA-1058) add ASCIIFoldingLowerCaseKeywordAnalyzer to jena-text

Osma Suominen (JIRA) Thu, 29 Oct 2015 08:44:13 -0700

    [ 
https://issues.apache.org/jira/browse/JENA-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980658#comment-14980658
 ]


Osma Suominen commented on JENA-1058:
-------------------------------------

I agree in principle, but providing this would take a whole framework. Solr 
already supports it: you can configure nested filters in its XML 
configuration file. With plain Lucene, however, the filter chain has to be 
built in Java code. So to implement this, we would need a special Assembler 
that knows all the relevant Lucene filters and can string them together based 
on the assembler configuration.
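
As a rough illustration of what such an assembler would have to do, here is a
minimal sketch in plain Java. It deliberately does not use Lucene or the Jena
assembler API: the class FilterChainSketch, the registry KNOWN_FILTERS, and the
filter names "lowercase" and "asciifolding" are all hypothetical, and the
folding step is only an approximation based on java.text.Normalizer, not
Lucene's ASCIIFoldingFilter. The point is just the shape of the problem: a
registry of known filters, and a builder that strings them together from a
configured list of names.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: none of these names exist in jena-text or Lucene.
class FilterChainSketch {
    // Registry of the filter names this "assembler" is aware of.
    static final Map<String, UnaryOperator<String>> KNOWN_FILTERS = new LinkedHashMap<>();
    static {
        KNOWN_FILTERS.put("lowercase", s -> s.toLowerCase(Locale.ROOT));
        // Approximate ASCII folding: decompose characters, then drop combining marks.
        KNOWN_FILTERS.put("asciifolding", s ->
            Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", ""));
    }

    // Strings the named filters together in the order given by the configuration.
    static UnaryOperator<String> build(List<String> names) {
        UnaryOperator<String> chain = UnaryOperator.identity();
        for (String name : names) {
            UnaryOperator<String> next = KNOWN_FILTERS.get(name);
            if (next == null) {
                throw new IllegalArgumentException("Unknown filter: " + name);
            }
            UnaryOperator<String> prev = chain;
            chain = s -> next.apply(prev.apply(s));
        }
        return chain;
    }

    public static void main(String[] args) {
        UnaryOperator<String> analyze = build(List.of("lowercase", "asciifolding"));
        // "D\u00e9j\u00e0 Vu" is "Déjà Vu" written with Unicode escapes.
        System.out.println(analyze.apply("D\u00e9j\u00e0 Vu").equals(analyze.apply("deja vu")));
    }
}
```

A real implementation would map configured names to Lucene TokenFilter classes
rather than string functions, and that mapping is the part that needs the
framework mentioned above.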

> add ASCIIFoldingLowerCaseKeywordAnalyzer to jena-text
> -----------------------------------------------------
>
>                 Key: JENA-1058
>                 URL: https://issues.apache.org/jira/browse/JENA-1058
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: Text
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>
> I'd like to have an Analyzer for jena-text which is otherwise like the 
> LowerCaseKeywordAnalyzer that I implemented earlier, but also includes the 
> ASCIIFoldingFilter from Lucene. This means that the comparison will ignore 
> accents, so that for example "deja vu" will match "déjà vu".
> For some background on why I need this, see 
> https://github.com/NatLibFi/Skosmos/issues/313
> I already have an implementation of this ready and will make a PR shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
