Hi Noe,

We have done translation between related languages using BPE with Moses,
without using EMS, and did not run into any particular problems. A few
things that we did for our scenario:

- BPE can increase sentence length, which increases decoding time. Since we
were working on related languages, we switched off reordering.
- To speed up decoding, we used cube pruning with a small pop-limit
(https://www.cse.iitb.ac.in/~anoopk/publications/vardial2016_faster_subword.pdf)
- We also used a small BPE size (~3000 words), since we were working with
similar languages, and a higher-order LM (10-gram).
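
As a rough illustration of the decoder settings above, here is a minimal
Python sketch that assembles a Moses command line with cube pruning, a small
pop-limit and reordering switched off. The helper name build_decoder_args
and the pop-limit value of 10 are assumptions made for this example, not the
settings from our experiments; please check the flags against your Moses
build before relying on them.

```python
def build_decoder_args(moses_ini, pop_limit=10):
    """Build an argument list for the Moses decoder (hypothetical helper).

    pop_limit=10 is only a placeholder for "a small pop-limit".
    """
    return [
        "moses",
        "-f", moses_ini,
        "-search-algorithm", "1",                   # 1 selects cube pruning
        "-cube-pruning-pop-limit", str(pop_limit),  # hypotheses popped per stack
        "-distortion-limit", "0",                   # 0 disables reordering
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(build_decoder_args("model/moses.ini")))
```

The list can be passed to subprocess.run with the BPE-segmented input on
stdin. The 10-gram LM is not covered here because it is a property of the
LM you train and reference in moses.ini, not a decoder flag.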

You can find more details here:
https://www.cse.iitb.ac.in/~anoopk/publications/sclem2017_bpe_related.pdf
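
Because BPE lengthens sentences, the length and ratio filtering mentioned in
the original question matters more than usual. The sketch below shows the
kind of check involved, applied to BPE-segmented sentence pairs. It is a
simplified approximation, not a reimplementation of clean-corpus-n.perl; the
function name keep_pair and the limits (1 to 80 tokens, ratio 9) are
assumptions chosen for illustration.

```python
def keep_pair(src, tgt, min_len=1, max_len=80, max_ratio=9.0):
    """Return True if a BPE-segmented sentence pair passes the filter.

    Hypothetical helper: limits are illustrative, not tuned values.
    """
    src_len = len(src.split())
    tgt_len = len(tgt.split())
    # Empty sides are always dropped (this also avoids division by zero).
    if src_len == 0 or tgt_len == 0:
        return False
    # Drop pairs where either side is too short or too long.
    if not (min_len <= src_len <= max_len and min_len <= tgt_len <= max_len):
        return False
    # Drop pairs whose length ratio exceeds the limit in either direction.
    return max(src_len, tgt_len) / min(src_len, tgt_len) <= max_ratio
```

The filter has to run after BPE is applied, since the token counts that
matter for alignment are those of the subword-segmented text.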

Regards,
Anoop.

On Sat, Mar 16, 2019 at 5:16 PM Noe Casas <[email protected]> wrote:

> Dear Moses Community,
>
> I want to train Moses with byte-pair encoding tokenization (BPE,
> https://github.com/rsennrich/subword-nmt). I plan to do it "by hand"
> without the EMS.
>
> Is there any problem with the idea?
>
> Would it be Ok just to apply BPE after tokenization, truecasing, etc and
> then go on with the rest of the typical steps?
>
> Is there any gotcha I should take into account?
>
> The only potential pitfall I have identified is that I have to clean the
> corpus with clean-corpus-n.perl after applying BPE so as not to exceed
> the maximum fertility of 9 for mgiza.
>
> Any success/failure experiences doing similar stuff are also very welcome.
>
> Thanks,
> Noe.
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>


-- 
I claim to be a simple individual liable to err like any other fellow
mortal. I own, however, that I have humility enough to confess my errors
and to retrace my steps.

http://flightsofthought.blogspot.com