Re: jena-text LowerCaseKeywordAnalyzer implementation

Osma Suominen Wed, 03 Sep 2014 03:08:06 -0700

Hi!

Any interest in my contribution? Is there anything I can do to help get this, or something like it, into jena-text?


-Osma

01.09.2014, 15:38, Osma Suominen wrote:

Hi!

I liked the option to specify an Analyzer for jena-text, as implemented in
JENA-654. But I'd like to use an analyzer that is otherwise like
KeywordAnalyzer but case-insensitive, for use in an
autocomplete/typeahead UI widget. Lucene doesn't include such an
analyzer, but there are several implementations of the same idea, e.g.
in neo4j [1] and stargate [2].

I created my own implementation of such an analyzer and added code to
use it from the assembler. Patch attached.
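
To illustrate the behavior I'm after, here is a small plain-Java sketch. It is
not the attached patch and uses no Lucene classes; the class and method names
(LowerCaseKeywordSketch, analyze, complete) are made up for this example, and
Locale.ROOT lowercasing is only an approximation of what a Lucene lowercase
filter would do. The whole value is kept as a single token, as with
KeywordAnalyzer, and lowercased so that prefix matching ignores case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Dependency-free illustration only: NOT the patch, and no Lucene API is used.
final class LowerCaseKeywordSketch {

    // Keep the whole input as one token (keyword-style) and lowercase it.
    static String analyze(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Return the labels whose analyzed form starts with the analyzed input,
    // which is roughly what an autocomplete prefix query would match.
    static List<String> complete(List<String> labels, String typed) {
        String prefix = analyze(typed);
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            if (analyze(label).startsWith(prefix)) {
                hits.add(label);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("New York", "new media", "Renewable energy");
        // "NEW" matches both labels that begin with "new", regardless of case.
        System.out.println(complete(labels, "NEW")); // prints [New York, new media]
    }
}
```

Because the value is never split into words, typing "york" finds nothing
here, which is the point of a keyword-style analyzer for typeahead on whole
labels.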

The analyzer lives in a new package,
org.apache.jena.query.text.analyzer, in case other analyzers for
jena-text appear in the future. If you don't like the new package,
the class can of course be moved to org.apache.jena.query.text.

I also added a test for case-insensitivity. To avoid lots of duplicate
boilerplate code, I slightly modified and subclassed the existing test
for KeywordAnalyzer.

I'd love to see this in the next version of jena-text and Fuseki. Of
course I'll rework the patch if necessary. I can also tweak the web
documentation to mention this analyzer.

-Osma


[1]
https://github.com/apatry/neo4j-lucene4-index/blob/master/src/main/java/org/neo4j/index/impl/lucene/LowerCaseKeywordAnalyzer.java


[2]
https://github.com/tuplejump/stargate-core/blob/master/src/main/java/com/tuplejump/stargate/lucene/CaseInsensitiveKeywordAnalyzer.java



--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi
