Hi Mike,
here's a link to the tool Marcin mentioned:
https://github.com/rsennrich/subword-nmt
I haven't tried it on phrase-based MT myself, but feel free to give it a
try.
You could also try other unsupervised morpheme segmenters like
morfessor: https://github.com/aalto-speech/morfessor
I don't know if there are any segmentation methods specific to Cherokee.
best wishes,
Rico
On 01.02.2016 13:31, Marcin Junczys-Dowmunt wrote:
Hi Mike,
Maybe take a look at Rico's tool for handling unknown words in neural
machine translation. I have been playing around with that for
Russian-English and standard phrase-based SMT with some success. I am
just not sure if your small corpora will be enough to learn useful
segmentations though.
It's an unsupervised method for word segmentation. For Russian-English
I created a code dictionary of the 100,000 most-frequent segments per
language. Unseen tokens will get segmented. The segmentation is not
necessarily similar to a linguistically correct segmentation, though.
You will probably want to try smaller numbers.
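
To illustrate the idea of a learned segment dictionary, here is a toy
sketch of byte-pair-encoding style merging in Python. It is not the
subword-nmt code; the function names (merge_pair, learn_merges, segment)
and the tiny word-count table are made up for illustration, and
num_merges only roughly plays the role of the dictionary size mentioned
above.

```python
from collections import Counter

END = "</w>"  # end-of-word marker, so word-final segments stay distinct


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(word_counts, num_merges):
    """Learn up to `num_merges` merges from a {word: count} table (toy data)."""
    # Start from single characters plus the end-of-word marker.
    vocab = {word: list(word) + [END] for word in word_counts}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, symbols in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += word_counts[word]
        if not pairs:
            break
        # Merge the most frequent pair (ties: first one counted wins).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {w: merge_pair(s, best) for w, s in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a (possibly unseen) word by replaying the merges in order."""
    symbols = list(word) + [END]
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical word counts, standing in for a real training corpus.
counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_merges(counts, 3)
print(segment("lowest", merges))  # ['l', 'o', 'w', 'est</w>']
```

With few merges, an unseen word like "lowest" falls back to mostly
single characters plus one learned segment; with more merges, frequent
words end up as single tokens. That is why the number of merges needs
tuning to the corpus size.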
Best,
Marcin
On 2016-02-01 14:12, Michael Joyner wrote:
I am trying to use Moses with Cherokee, using the New Testament and
Genesis as the primary corpus. I am feeding it the WEB and BBE as source
English texts at the moment.
As Cherokee uses bound pronouns, has no articles, and has almost no
preposition analogues (these features are mostly verb infixes), is
there a technique for corpus adjustment that can improve the phrase
mapping between Cherokee and English?
I am currently doing Cherokee => English.
Thanks, Mike
--
WEB: World English Bible (Public Domain)
BBE: Basic English Bible (Public Domain)
* Learn the Cherokee language: http://jalagigawoni.gnomio.com/
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support