[ https://issues.apache.org/jira/browse/LUCENE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413752#comment-17413752 ]

ASF subversion and git services commented on LUCENE-10098:
----------------------------------------------------------

Commit 56968b762add8b51b329c92e617496b200837385 in lucene's branch 
refs/heads/main from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=56968b7 ]

LUCENE-10098: add note/link to GermanAnalyzer for decompounding nouns. (#294)

We can't do this out of the box with the analyzer, due to incompatible
licenses. But we can make it easy for the user to do this, by linking to
a repo that has sample code, documentation, and the required data files.
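For reference, a minimal sketch of such a configuration, assuming the
de_DR.xml hyphenation grammar and dictionary-de.txt dictionary from the
linked repo have been downloaded into a local directory named
german-decompounder (the repo's README has the authoritative example;
the parameter values below are illustrative):

    import java.io.IOException;
    import java.nio.file.Paths;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.custom.CustomAnalyzer;

    public class GermanDecompoundingExample {
      public static Analyzer buildAnalyzer() throws IOException {
        // Data files are resolved relative to this config directory.
        return CustomAnalyzer.builder(Paths.get("german-decompounder"))
            .withTokenizer("standard")
            .addTokenFilter("lowercase")
            // HyphenationCompoundWordTokenFilter, looked up by its SPI
            // name; splits compound nouns using the hyphenation grammar
            // plus a dictionary of word parts.
            .addTokenFilter("hyphenationDecompounder",
                "hyphenator", "de_DR.xml",         // LaTeX-licensed grammar
                "dictionary", "dictionary-de.txt", // LGPL-licensed dictionary
                "onlyLongestMatch", "true",
                "minSubwordSize", "4")
            .addTokenFilter("germanNormalization")
            .addTokenFilter("germanLightStem")
            .build();
      }
    }

With this analyzer, a compound like "Fussballpumpe" indexes its parts
("fussball", "pumpe") as well, so queries on the parts can match.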


> Add note/link to GermanAnalyzer for decompounding nouns
> -------------------------------------------------------
>
>                 Key: LUCENE-10098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10098
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> The GermanAnalyzer doesn't split compound nouns.
> Doing this requires some auxiliary data files with strange licenses. But 
> [~uschindler] has documented and packaged everything up to make this easy: 
> https://github.com/uschindler/german-decompounder
> We added a Lucene API example (using CustomAnalyzer) to the README: 
> https://github.com/uschindler/german-decompounder/pull/6
> So I think it would be nice to link to this from the javadocs: it makes it 
> really easy to download the data files and configure an appropriate analyzer, 
> if you are OK with the LaTeX and LGPL licenses for the data files (which many 
> folks might be).



