Hi Tom,

I've been using this one for the Edinburgh WMT submissions (EN-DE, syntax-based) for the last 3 years: https://github.com/rsennrich/wmt2014-scripts/blob/master/hybrid_compound_splitter.py

It implements the hybrid (frequency-based and FST-based) algorithm of Fritzinger & Fraser (2010): "How to Avoid Burning Ducks: Combining Linguistic Analysis and Corpus Statistics for German Compound Processing"
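
To give a rough idea of the frequency-based half, here is a minimal sketch. It is not code from the script linked above. The counts in FREQ, the FILLERS list and the function names are made up for illustration, and candidate splits are enumerated by brute force where the hybrid approach would take them from an FST analysis. Candidates are ranked by the geometric mean of their part frequencies, a common choice for frequency-based splitting.

```python
from math import prod

# Toy corpus counts (hypothetical, for illustration only).
FREQ = {
    "aktion": 900, "plan": 700, "aktionsplan": 12,
    "akt": 300, "ion": 20,
    "freitag": 5000, "frei": 400, "tag": 600,
}
# Linking elements that may be dropped from a modifier (assumed, incomplete list).
FILLERS = ("", "s", "es")
MIN_PART_LEN = 3


def candidate_splits(word):
    """Return all segmentations of `word`; the first is always the unsplit word."""
    results = [[word]]
    for i in range(MIN_PART_LEN, len(word) - MIN_PART_LEN + 1):
        head = word[:i]
        for filler in FILLERS:
            if filler and not head.endswith(filler):
                continue
            stem = head[:len(head) - len(filler)] if filler else head
            # Non-final parts must be known words of a minimum length.
            if len(stem) < MIN_PART_LEN or stem not in FREQ:
                continue
            for rest in candidate_splits(word[i:]):
                results.append([stem] + rest)
    return results


def score(parts):
    """Geometric mean of the part frequencies (0 if any part is unseen)."""
    freqs = [FREQ.get(p, 0) for p in parts]
    return prod(freqs) ** (1.0 / len(freqs))


def split_compound(word):
    """Pick the highest-scoring segmentation; ties keep the unsplit word."""
    return max(candidate_splits(word.lower()), key=score)


print(split_compound("Aktionsplan"))
print(split_compound("Freitag"))
```

With these toy counts, "Aktionsplan" is split into "aktion" + "plan", while "Freitag" stays whole because it is far more frequent than "frei" + "tag" would suggest. A purely frequency-based splitter can still produce linguistically implausible splits, which is what the FST-based filtering in the hybrid approach is meant to prevent.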

best wishes,
Rico

On 24.08.2016 17:10, Tom Hoar wrote:
Does anyone recommend a German compound splitter? I know it's been
discussed here before. Thanks.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

