Dawid Weiss created LUCENE-6833:
-----------------------------------
Summary: Upgrade morfologik to version 2.0.1, simplify
MorfologikFilter's dictionary lookup
Key: LUCENE-6833
URL: https://issues.apache.org/jira/browse/LUCENE-6833
Project: Lucene - Core
Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
Fix For: Trunk
This is a follow-up to Uwe's work on LUCENE-6774.
This patch updates the code to use Morfologik stemming version 2.0.1, which
removes the "automatic" lookup of classpath-relative dictionary resources in
favor of an explicit InputStream or URL. User code is therefore responsible for
providing these resources, handling missing files, etc.
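As a rough illustration of what "explicitly responsible" can mean for calling code, the sketch below opens a classpath resource by hand and fails loudly if it is absent. The class and method names (ExplicitResourceSketch, openRequired) are hypothetical and are not part of Morfologik or Lucene; only standard JDK calls are used.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Morfologik or Lucene: shows user code
// taking responsibility for locating a dictionary resource itself.
final class ExplicitResourceSketch {
  private ExplicitResourceSketch() {}

  // Opens a classpath resource relative to the given anchor class and
  // reports a missing file explicitly instead of returning null.
  static InputStream openRequired(Class<?> anchor, String resource) throws IOException {
    InputStream in = anchor.getResourceAsStream(resource);
    if (in == null) {
      throw new FileNotFoundException("Dictionary resource not found: " + resource);
    }
    return in;
  }
}
```

The returned stream would then be handed to whatever code reads the dictionary; the caller decides how to react to the exception.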
Morfologik shipped no "default" dictionaries other than the Polish one, so I
also removed a number of attributes from the filter code that were, to me,
confusing.
* {{MorfologikFilterFactory}} now accepts an (optional) {{dictionary}}
attribute which contains an explicit name of the dictionary resource to load.
The resource is loaded with a {{ResourceLoader}} passed to the {{inform(..)}}
method, so the final location depends on the resource loader.
* There is no way to load the dictionary and metadata separately (doing so
isn't useful at all).
* If the {{dictionary}} attribute is missing, the filter loads the Polish
dictionary by default (since most people would be using Morfologik for stemming
Polish anyway).
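
The lookup rules in the list above can be modelled roughly as follows. This is a simplified sketch, not the code from the patch: ResourceOpener is a stand-in for Lucene's ResourceLoader, and DEFAULT_DICTIONARY is a placeholder name for the bundled Polish dictionary, whose real resource path is not stated here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Hypothetical model of the attribute handling described above;
// it does not use or reproduce Lucene's actual classes.
final class DictionaryLookupSketch {
  private DictionaryLookupSketch() {}

  // Stand-in for a resource loader: opens a named resource or throws.
  interface ResourceOpener {
    InputStream open(String resource) throws IOException;
  }

  // Placeholder name (an assumption) for the default Polish dictionary.
  static final String DEFAULT_DICTIONARY = "polish.dict";

  // Returns the explicit "dictionary" attribute if present,
  // otherwise falls back to the default Polish dictionary.
  static String resolveName(Map<String, String> args) {
    String name = args.get("dictionary");
    return name != null ? name : DEFAULT_DICTIONARY;
  }

  // Opens the resolved resource through the supplied loader, so the
  // final location depends entirely on that loader.
  static InputStream openDictionary(Map<String, String> args, ResourceOpener loader)
      throws IOException {
    return loader.open(resolveName(args));
  }
}
```

With an empty attribute map, resolveName returns the default name; with a "dictionary" entry, it returns that entry unchanged and leaves resolution to the loader.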