So how does that work?

Does it just take all the words from the corpus and guess the "infix themes"? Or
do I have to supply pre-tagged data?

On Mon, Feb 1, 2016 at 9:04 AM, Rico Sennrich <[email protected]> wrote:

> Hi Mike,
>
> here's a link to the tool Marcin mentioned:
> https://github.com/rsennrich/subword-nmt
>
> I haven't tried it on phrase-based MT myself, but feel free to give it a
> try.
>
> You could also try other unsupervised morpheme segmenters like Morfessor:
> https://github.com/aalto-speech/morfessor
>
> I don't know if there are any segmentation methods specific to Cherokee.
>
> best wishes,
> Rico
>
>
> On 01.02.2016 13:31, Marcin Junczys-Dowmunt wrote:
>
> Hi Mike,
>
> Maybe take a look at Rico's tool for handling unknown words in neural
> machine translation. I have been playing around with that for
> Russian-English and standard phrase-based SMT with some success. I am just
> not sure if your small corpora will be enough to learn useful segmentations
> though.
>
> It's an unsupervised method for word segmentation. For Russian-English I
> created a code dictionary of the 100,000 most frequent segments per
> language. Unseen tokens will get segmented. The segmentation is not
> necessarily similar to a linguistically correct segmentation, though. You
> will probably want to try smaller numbers.
>
> Best,
>
> Marcin
>
> On 2016-02-01 14:12, Michael Joyner wrote:
>
> I am trying to use Moses with Cherokee, using the New Testament and
> Genesis as the primary corpus. I am feeding it the WEB and BBE as source
> English texts at the moment.
>
> Since Cherokee uses bound pronouns, has no articles, and has almost no
> analogues of prepositions (these features are mostly verb infixes), is
> there a corpus-adjustment technique that would improve the phrase
> mapping between Cherokee and English?
>
> I am currently doing Cherokee => English.
>
> Thanks, Mike
> --
>
> WEB: World English Bible (Public Domain)
> BBE: Basic English Bible (Public Domain)
>
>    - Learn to the Cherokee language: http://jalagigawoni.gnomio.com/
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
>
>
>
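
For reference, the kind of unsupervised segmentation discussed in the quoted
messages can be sketched in a few lines. subword-nmt is based on byte pair
encoding (BPE): it starts from single characters and repeatedly merges the most
frequent adjacent pair of symbols, so it learns from raw, untagged text only.
The sketch below is a simplified illustration under that assumption, not the
tool's actual code; the function names and the toy word counts are made up for
the example, and the number of merges plays the role of the "code dictionary"
size mentioned above.

```python
import collections
import re


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = collections.Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Rewrite the vocabulary so that every occurrence of `pair` is one symbol."""
    # The lookarounds make sure we only match whole, space-delimited symbols.
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    merged = "".join(pair)
    return {pattern.sub(merged, word): freq for word, freq in vocab.items()}


def learn_merges(word_counts, num_merges):
    """Learn up to `num_merges` merge operations from plain word counts."""
    # Each word becomes space-separated characters plus an end-of-word marker.
    vocab = {" ".join(word) + " </w>": freq for word, freq in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair (first one on ties)
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges


def segment(word, merges):
    """Segment a (possibly unseen) word by replaying the learned merges in order."""
    symbols = list(word) + ["</w>"]
    for left, right in merges:
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(left + right)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols


# Toy word counts (hypothetical data, standing in for a real corpus).
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_merges(toy_counts, 5)
print(toy_merges)
print(segment("lowest", toy_merges))  # "lowest" never occurs in toy_counts
```

With a small corpus, a small number of merges keeps most words split into
short, frequent pieces; a large number lets frequent words survive whole and
only splits rare or unseen ones.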


-- 

   - Learn to the Cherokee language: http://jalagigawoni.gnomio.com/
