I agree and would like to do that.
But this is tricky: look at the first 30 lines of my phrase table below.

This happens a lot at the start of the table, where there are &apos; entities or other odd codes: the EN/FR pairs do not match.




! ! ! ! ||| ! ! ! ! ||| 0.103413 0.132185 0.103413 0.401758 ||| 0-0 1-1 2-2 3-3 ||| 1 1 1 ||| |||
! ! ! ) ||| ! ! ! ) ||| 0.339323 0.167884 0.508985 0.4246 ||| 0-0 1-0 2-0 2-1 2-2 3-3 ||| 3 2 2 ||| |||
! ! ! ||| ! ! ! ||| 0.501834 0.219223 0.716905 0.50463 ||| 0-0 1-1 2-2 ||| 10 7 6 ||| |||
! ! ! ||| budget ! ! ! ||| 0.0517067 0.219223 0.0147733 4.50635e-05 ||| 0-1 1-2 2-3 ||| 2 7 1 ||| |||
! ! ) , ||| ! ! ) - , ||| 0.103413 0.111989 0.103413 0.00192967 ||| 0-0 1-1 2-2 3-3 3-4 ||| 1 1 1 ||| |||
! ! ) ||| ! ! ) ||| 0.103413 0.278429 0.103413 0.533321 ||| 0-0 1-1 2-2 ||| 1 1 1 ||| |||
! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||
! ! ||| . ||| 4.65922e-08 6.71089e-07 0.00795487 0.140779 ||| 0-0 1-0 ||| 2.21954e+06 13 1 ||| |||
! ! ||| budget ! ! ||| 0.0517067 0.363573 0.00795487 5.66022e-05 ||| 0-1 1-2 ||| 2 13 1 ||| |||
! ! ||| nécessaire ! ! ||| 0.103413 0.363573 0.00795487 0.000130572 ||| 0-1 1-2 ||| 1 13 1 ||| |||
! [ never again ! ||| ! ||| 6.51628e-06 5.42074e-13 0.103413 0.796143 ||| 0-0 4-0 ||| 15870 1 1 ||| |||
! ] this is ||| tel est ||| 7.38667e-05 9.16191e-11 0.103413 0.00147917 ||| 2-0 3-1 ||| 1400 1 1 ||| |||
! ] this ||| tel ||| 1.09594e-05 1.44188e-10 0.103413 0.0035893 ||| 2-0 ||| 9436 1 1 ||| |||
! ] ||| ! ] ||| 0.103413 0.352335 0.103413 0.472387 ||| 0-0 1-1 ||| 1 1 1 ||| |||
! & quot ; ||| ! " . et ||| 0.0517067 2.36396e-12 0.0517067 1.88268e-05 ||| 0-0 1-1 2-1 3-3 ||| 2 2 1 ||| |||
! & quot ; ||| ! " ||| 0.000222394 1.44515e-11 0.0517067 0.518419 ||| 0-0 2-1 ||| 465 2 1 ||| |||
! & quot ||| ! " . ||| 0.000662906 8.30626e-09 0.0344711 0.00232791 ||| 0-0 1-1 2-1 ||| 156 3 1 ||| |||
! & quot ||| ! " ||| 0.00218918 8.30626e-09 0.339323 0.518419 ||| 0-0 2-1 ||| 465 3 2 ||| |||
! & ||| ! ||| 6.51628e-06 7.21755e-05 0.103413 0.796143 ||| 0-0 ||| 15870 1 1 ||| |||
! ' ] , addressed ||| ! " adressé ||| 0.103413 3.70838e-07 0.103413 0.00596848 ||| 0-0 1-1 2-1 4-2 ||| 1 1 1 ||| |||
! ' ] , ||| ! " ||| 0.000222394 2.49698e-06 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
! ' ] ||| ! " ||| 0.000222394 3.57128e-05 0.103413 0.215573 ||| 0-0 1-1 2-1 ||| 465 1 1 ||| |||
! ' ' Alstom shares ||| l' on constate un dysfonctionnement ||| 0.0344711 5.62605e-16 0.103413 1.03361e-14 ||| 1-0 2-0 1-1 3-4 4-4 ||| 3 1 1 ||| |||
! ' ' ||| l' on constate un ||| 0.0147733 1.56906e-11 0.0129267 2.2766e-12 ||| 1-0 2-0 1-1 ||| 7 8 1 ||| |||
! ' ' ||| l' on constate ||| 0.000984889 1.56906e-11 0.0129267 2.36929e-10 ||| 1-0 2-0 1-1 ||| 105 8 1 ||| |||
! ' ' ||| l' on ||| 6.76656e-06 1.56906e-11 0.0129267 6.18613e-06 ||| 1-0 2-0 1-1 ||| 15283 8 1 ||| |||
! ' ' ||| ou que l' on constate ||| 0.0344711 1.56906e-11 0.0129267 4.69534e-15 ||| 1-2 2-2 1-3 ||| 3 8 1 ||| |||
! ' ' ||| ou que l' on ||| 0.00304157 1.56906e-11 0.0129267 1.22594e-10 ||| 1-2 2-2 1-3 ||| 34 8 1 ||| |||
! ' ' ||| que l' on constate un ||| 0.0344711 1.56906e-11 0.0129267 4.56092e-14 ||| 1-1 2-1 1-2 ||| 3 8 1 ||| |||
! ' ' ||| que l' on constate ||| 0.00323167 1.56906e-11 0.0129267 4.74661e-12 ||| 1-1 2-1 1-2 ||| 32 8 1 ||| |||
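For what it is worth, entries like the ones above could be flagged automatically rather than spotted by eye. The following is only a rough sketch: it assumes the usual text layout of `source ||| target ||| scores ||| alignment ||| counts`, and the names `parse_entry`, `looks_suspicious`, `ENTITY_DEBRIS` and the `min_score` threshold are all made up for illustration, not part of Moses.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    source: str
    target: str
    scores: Tuple[float, ...]


def parse_entry(line: str) -> Entry:
    """Split one text phrase-table line into source, target and scores."""
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    scores = tuple(float(x) for x in fields[2].split())
    return Entry(fields[0], fields[1], scores)


# Hypothetical list of tokens left behind when an entity such as
# "&quot;" is split into "& quot ;" by tokenization.
ENTITY_DEBRIS = {"&", "quot", "apos", "amp", "lt", "gt"}


def looks_suspicious(entry: Entry, min_score: float = 1e-6) -> bool:
    """Flag entries with entity fragments or any very small score.

    The threshold is an arbitrary assumption and would need tuning.
    """
    tokens = entry.source.split() + entry.target.split()
    has_debris = any(tok in ENTITY_DEBRIS for tok in tokens)
    has_tiny_score = any(s < min_score for s in entry.scores)
    return has_debris or has_tiny_score


if __name__ == "__main__":
    sample = [
        "! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||",
        '! & quot ||| ! " ||| 0.00218918 8.30626e-09 0.339323 0.518419 ||| 0-0 2-1 ||| 465 3 2 ||| |||',
    ]
    for line in sample:
        entry = parse_entry(line)
        print(looks_suspicious(entry), entry.source, "|||", entry.target)
```

A heuristic like this only produces candidates for review; a low score alone does not prove that a pair is wrong.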



On 23/09/2015 15:12, Tom Hoar wrote:
Vincent,

If you suspect bad entries, isn't it better to address the root of the problem and prepare your training corpus better?
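As one possible reading of "prepare your training corpus better", here is a small sketch of a cleaning pass over sentence pairs before tokenization. It assumes that the raw corpus still contains HTML entities such as `&quot;` and that very unbalanced pairs are likely misaligned; the function name `clean_pair` and the `max_ratio` value are illustrative assumptions, not Moses settings.

```python
import html
from typing import Optional, Tuple


def clean_pair(src: str, tgt: str, max_ratio: float = 3.0) -> Optional[Tuple[str, str]]:
    """Return a cleaned (source, target) pair, or None if it should be dropped."""
    # Decode entities such as &quot; and &apos;, then collapse whitespace.
    src = " ".join(html.unescape(src).split())
    tgt = " ".join(html.unescape(tgt).split())
    if not src or not tgt:
        return None
    n_src, n_tgt = len(src.split()), len(tgt.split())
    # Drop pairs whose whitespace-token counts are too unbalanced.
    if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
        return None
    return src, tgt


if __name__ == "__main__":
    print(clean_pair("He said &quot;no&quot; !", "Il a dit &quot;non&quot; !"))
    print(clean_pair("a", "b c d e"))
```

Whether decoding entities at this stage is appropriate depends on what the rest of the pipeline expects, so this would have to be checked against the tokenizer actually in use.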


On 9/23/2015 6:46 PM, [email protected] wrote:
Date: Tue, 22 Sep 2015 20:24:02 +0200
From: Philipp Koehn<[email protected]>
Subject: Re: [Moses-support] is there a way to remove a bad entry in
        the phrase table ?
To: Vincent Nguyen<[email protected]>
Cc: moses-support<[email protected]>

Hi,

you can remove it manually (just edit the text file), there will be no
negative consequences.

However, it is not a realistic strategy to try to remove by hand every
offending phrase table entry.

-phi
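Editing the text file can also be scripted when there are more than a handful of entries. The sketch below streams phrase-table lines and drops the ones whose source/target pair is in a given set; `drop_entries`, `bad_pairs` and the file names in the comment are hypothetical. If the table has been binarized, it would presumably have to be rebuilt from the edited text file.

```python
from typing import Iterable, Iterator, Set, Tuple


def drop_entries(lines: Iterable[str], bad_pairs: Set[Tuple[str, str]]) -> Iterator[str]:
    """Yield every phrase-table line whose (source, target) is not in bad_pairs."""
    for line in lines:
        fields = line.split("|||")
        if len(fields) < 2:
            # Not a phrase-table entry; pass it through untouched.
            yield line
            continue
        pair = (fields[0].strip(), fields[1].strip())
        if pair not in bad_pairs:
            yield line


# Possible usage with a gzipped table (file names are placeholders):
#
#   import gzip
#   with gzip.open("phrase-table.gz", "rt", encoding="utf-8") as fin, \
#        gzip.open("phrase-table.filtered.gz", "wt", encoding="utf-8") as fout:
#       fout.writelines(drop_entries(fin, {("! !", ".")}))

if __name__ == "__main__":
    table = [
        "! ! ||| ! ! ||| 0.625 0.363573 0.769231 0.633844 ||| 0-0 1-1 ||| 16 13 10 ||| |||\n",
        "! ! ||| . ||| 4.65922e-08 6.71089e-07 0.00795487 0.140779 ||| 0-0 1-0 ||| 2.21954e+06 13 1 ||| |||\n",
    ]
    for kept in drop_entries(table, {("! !", ".")}):
        print(kept, end="")
```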

On Tue, Sep 22, 2015 at 4:05 PM, Vincent Nguyen<[email protected]>  wrote:

>Hi,
>
>I was wondering: if, after analysing the BLEU-Annotation file, we
>realize that there must be a bad entry in the phrase table,
>can we remove it manually or in some other way?
>
>Thanks.
>V.
>_______________________________________________
>Moses-support mailing list
>[email protected]
>http://mailman.mit.edu/mailman/listinfo/moses-support
>

--
Best regards,

Tom Hoar
Chief Executive Officer
/*Precision Translation Tools Pte Ltd*/
Singapore/Thailand
Web: www.precisiontranslationtools.com <http://www.precisiontranslationtools.com>
Thailand Mobile: +66 87 345-1875
Skype: tahoar


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
