[ 
https://issues.apache.org/jira/browse/LUCENE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-10098.
----------------------------------
    Fix Version/s: 8.11
                   main (9.0)
       Resolution: Fixed

> Add note/link to GermanAnalyzer for decompounding nouns
> -------------------------------------------------------
>
>                 Key: LUCENE-10098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10098
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: main (9.0), 8.11
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The GermanAnalyzer doesn't split compound nouns.
> Doing this requires auxiliary data files with unusual licenses, but
> [~uschindler] has documented and packaged everything to make this easy:
> https://github.com/uschindler/german-decompounder
> We added a Lucene API example (using CustomAnalyzer) to the README:
> https://github.com/uschindler/german-decompounder/pull/6
> I think it would be nice to link to this from the javadocs. It makes it
> really easy to download the data files and configure an appropriate analyzer,
> if you are OK with the LaTeX and LGPL licenses for the data files (which many
> folks might be).
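
To illustrate what "splitting compound nouns" means here, the following is a
minimal, self-contained Java sketch of dictionary-based decompounding. It is
not Lucene's implementation and it does not use the german-decompounder data
files: the class name ToyDecompounder, the three-word dictionary, and the
minSubwordSize parameter are all assumptions made up for this illustration.
For a real analyzer, follow the CustomAnalyzer example in the README linked
above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy class for illustration only; not part of Lucene.
final class ToyDecompounder {

    // Splits a word into dictionary entries, preferring the longest match at
    // each position. Returns the lowercased word unchanged if no full split
    // into dictionary entries exists.
    static List<String> split(String word, Set<String> dictionary, int minSubwordSize) {
        String lower = word.toLowerCase(Locale.ROOT);
        List<String> parts = new ArrayList<>();
        if (splitFrom(lower, 0, dictionary, minSubwordSize, parts)) {
            return parts;
        }
        return List.of(lower);
    }

    // Backtracking search: try the longest candidate first, then shorter ones.
    private static boolean splitFrom(String word, int start, Set<String> dictionary,
                                     int minSubwordSize, List<String> parts) {
        if (start == word.length()) {
            return true; // consumed the whole word
        }
        for (int end = word.length(); end - start >= minSubwordSize; end--) {
            String candidate = word.substring(start, end);
            if (dictionary.contains(candidate)) {
                parts.add(candidate);
                if (splitFrom(word, end, dictionary, minSubwordSize, parts)) {
                    return true;
                }
                parts.remove(parts.size() - 1); // dead end, undo and try shorter
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up mini dictionary; a real setup would load a proper word list.
        Set<String> dictionary = Set.of("dampf", "schiff", "fahrt");
        System.out.println(split("Dampfschifffahrt", dictionary, 4));
        // prints [dampf, schiff, fahrt]
    }
}
```

A plain dictionary lookup like this only works when every part of the compound
is in the word list, which is one reason the packaged data files are useful.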



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
