Is it possible to get the data you used to train the recaser? There is no
encoding normalization step in that procedure.
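
In the meantime, one way to check whether the "seemingly identical" entries really differ in their underlying code points is a small script like the one below. This is only a sketch, not part of Moses: the helper names `codepoints` and `group_by_nfc` are made up for illustration, and it assumes the phrase table is UTF-8 text with fields separated by `|||`. It groups entries whose source and target become equal after Unicode NFC normalization. Note that NFC merges composed and decomposed forms (e.g. U+0163 versus "t" + U+0327), but it does not merge t-with-cedilla (U+0163) with t-with-comma-below (U+021B); those remain distinct characters.

```python
import unicodedata
from collections import defaultdict


def codepoints(text):
    """Return the code points of text as a readable string, e.g. 'U+0061 U+0306'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def group_by_nfc(phrase_table_lines):
    """Group (source, target) pairs that are equal only after NFC normalization.

    Returns a dict mapping each normalized pair to the set of distinct raw
    pairs that collapse onto it; only groups with more than one raw form
    are returned.
    """
    groups = defaultdict(set)
    for line in phrase_table_lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 2:
            continue  # not a phrase-table line
        raw_pair = (fields[0], fields[1])
        nfc_pair = tuple(unicodedata.normalize("NFC", f) for f in raw_pair)
        groups[nfc_pair].add(raw_pair)
    return {key: raw for key, raw in groups.items() if len(raw) > 1}


if __name__ == "__main__":
    import sys

    # Read phrase-table lines from standard input and report suspicious groups.
    for nfc_pair, raw_pairs in group_by_nfc(sys.stdin).items():
        print("Normalized pair:", " ||| ".join(nfc_pair))
        for source, target in sorted(raw_pairs):
            print("   ", codepoints(source), "|||", codepoints(target))
```

If the script prints nothing when fed the uncompressed phrase table on standard input, the duplicated entries are identical at the code-point level, and a missing normalization step is probably not the cause.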

Hieu Hoang
http://www.hoang.co.uk/hieu

On 3 August 2016 at 14:14, Vito Mandorino <vito.mandor...@linguacustodia.com> wrote:

> Dear all,
>
> I encountered a problem when training a recaser. When launching the command
>
> ./mosesdecoder/scripts/recaser/train-recaser.perl --first-step 3 \
>     --dir model --corpus corpus.en \
>     --train-script ./mosesdecoder/scripts/training/train-model.perl
>
> the phrase-table ends up having several seemingly identical translation
> options:
>
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 30 30 ||| 30 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 36 36 ||| 36 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 39 39 ||| 39 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 4 4 ||| 4 |||
>
> and a segmentation fault occurs when compressing it to a compact table using
> the processPhraseTableMin executable.
>
> Could that be due to a missing encoding normalization step somewhere in
> the procedure?
> With a previous version of Moses, the same command yields just the single
> line
>
> naţională ||| Naţională ||| 1 1 1 1 ||| 0-0 ||| 109 109 109 ||| |||
>
>
> Thanks,
>
> Vito Mandorino
> --
> M. Vito MANDORINO -- Chief Scientist
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
