[ 
https://issues.apache.org/jira/browse/LUCENE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413758#comment-17413758
 ] 

ASF subversion and git services commented on LUCENE-10098:
----------------------------------------------------------

Commit 87c5f591b86e0672dd9b6abb4ea42598851f3b99 in lucene-solr's branch 
refs/heads/branch_8x from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=87c5f59 ]

LUCENE-10098: sync 8.11 CHANGES.txt with main


> Add note/link to GermanAnalyzer for decompounding nouns
> -------------------------------------------------------
>
>                 Key: LUCENE-10098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10098
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The GermanAnalyzer doesn't split compound nouns.
> Doing this requires some auxiliary data files with strange licenses. But 
> [~uschindler] has documented and packaged everything up to make this easy: 
> https://github.com/uschindler/german-decompounder
> We added a Lucene API example (using CustomAnalyzer) to the README: 
> https://github.com/uschindler/german-decompounder/pull/6
> So I think it would be nice to link to this from the javadocs. It makes it 
> really easy to download the data files and configure an appropriate analyzer, 
> if you are OK with the LaTeX and LGPL licenses for the data files (which many 
> folks might be).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

