I am not clear on the syntax of filter-model-given-input.pl:
target-dir = where we want the filtered phrase table to go?
moses.ini = if it is not in the above directory the script does accept it
input.txt = ? What should it be if I just want to adjust the MinScore?
On 31/08/2015 16:54, Philipp Koehn wrote:
Hi,
the filter script filters an existing phrase table. With the EMS
settings, it would build another phrase table.
Don't worry about the reordering table. It will have excess entries,
but they will not be used.
If you really care, you can use the script
scripts/training/remove-orphan-phrase-pairs-from-reordering-table.perl
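For readers unfamiliar with the idea, here is a minimal sketch of what "removing orphans" means: a reordering-table entry is kept only if its phrase pair still exists in the filtered phrase table. This is not the Perl script's actual implementation. The function names and the toy table lines are made up for illustration, and the sketch assumes both tables use the usual " ||| " column separator with source and target phrase in the first two columns.

```python
def phrase_pair(line):
    """Return the (source, target) pair from a ' ||| '-separated table line."""
    columns = line.split(" ||| ")
    return columns[0].strip(), columns[1].strip()


def drop_orphans(reordering_lines, phrase_table_lines):
    """Keep only reordering entries whose phrase pair is still in the phrase table."""
    surviving = {phrase_pair(line) for line in phrase_table_lines}
    return [line for line in reordering_lines if phrase_pair(line) in surviving]


# Toy data (invented): the second reordering entry has no matching phrase pair.
phrase_table = ["das haus ||| the house ||| 0.8 0.5 0.7 0.4"]
reordering_table = [
    "das haus ||| the house ||| 0.6 0.2 0.2 0.6 0.2 0.2",
    "das haus ||| that building ||| 0.3 0.3 0.4 0.3 0.3 0.4",
]
filtered = drop_orphans(reordering_table, phrase_table)
```

The sketch holds every surviving phrase pair in memory, which is fine for a toy example but would likely not scale to a full-size table.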
-phi
On Mon, Aug 31, 2015 at 10:50 AM, Vincent Nguyen <[email protected]> wrote:
Thanks, I will try and post results.
Just to be clear:
I can re-use the previous extract file.
I have to rebuild the phrase table with the new min score (i.e. there is no way to just filter the previous one?)
Do I have to rebuild the reordering table too?
Vincent
On 31/08/2015 16:44, Philipp Koehn wrote:
Hi,
0.0001 should have no impact on translation quality,
0.001 will have some impact
0.01 is probably a bit too drastic.
But that's the range you should explore.
-phi
On Mon, Aug 31, 2015 at 10:33 AM, Vincent Nguyen <[email protected]> wrote:
Is there any benchmark showing which value has what impact?
What should I start with as a test, 0.001?
The standard value 0.0001 seems really, really low to me.
Maybe I am not getting what this probability exactly refers to.
where |FIELDn| is the position of the score (typically 2 for
the direct phrase probability p(e|f), or 0 for the indirect
phrase probability p(f|e)) and |THRESHOLD| the maximum
probability allowed. A good setting is |2:0.0001|, which
removes all rules, where the direct phrase translation
probability is below 0.0001.
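To make the FIELD:THRESHOLD description above concrete, here is a minimal sketch of the filtering rule it describes. It is an illustration only, not the code of threshold-filter.perl or of the scorer. The function names and the two phrase-table lines are invented, and it assumes the scores sit in the third " ||| "-separated column, indexed from 0.

```python
def parse_min_score(spec):
    """Parse a single 'FIELD:THRESHOLD' setting such as '2:0.0001'."""
    field, threshold = spec.split(":")
    return int(field), float(threshold)


def keep_rule(line, field, threshold):
    """Return True if the selected score is at or above the threshold."""
    columns = line.split(" ||| ")
    scores = [float(s) for s in columns[2].split()]
    return scores[field] >= threshold


# Toy phrase table (invented). Score index 2 is the direct probability p(e|f).
phrase_table = [
    "das haus ||| the house ||| 0.8 0.5 0.7 0.4",
    "das haus ||| that building ||| 0.1 0.01 0.00005 0.002",
]
field, threshold = parse_min_score("2:0.0001")
kept = [line for line in phrase_table if keep_rule(line, field, threshold)]
```

With "2:0.0001" only the second toy entry is dropped, because its p(e|f) of 0.00005 falls below the threshold. In other words, the value acts as a floor: a pair is removed only if the model considers that translation extremely unlikely.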
On 31/08/2015 16:14, Philipp Koehn wrote:
Hi,
I would suspect that with beam sizes <500 the bulk of the time is
spent on translation option collection, not decoding. You could speed
that up with tighter threshold pruning of the phrase table.
See the script scripts/training/threshold-filter.perl or the setting
score-settings = "--MinScore 2:0.0001"
in EMS.
-phi
On Mon, Aug 31, 2015 at 3:03 AM, Vincent Nguyen <[email protected]> wrote:
Hi,
Here are some results for several values of the cube pruning pop limit:
(pop limit / decoding time for 3000 sentences / BLEU score)
5000 - 15m45 - 29.59
1000 - 4m27 - 29.59
500 - 3m35 - 29.59
200 - 3m15 - 29.51
100 - 3m00 - 29.40
Therefore I took 400 - 3m19 - 29.58
If I am not mistaken, the default value for Moses is 1000 [read in the
doc], but in the EMS it is 5000 right now, which makes the experiment
so long.
I suggest changing the EMS default value.
Is there a way to also use a cube pruning limit in the decoder at
tuning time?
Now with this optimized setting I get a rate of 15 segments per second
on average.
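As a quick check of that figure, the throughput for each run can be recomputed from the timings listed above. This is just arithmetic on the reported numbers. The helper name to_seconds is made up, and the timings are assumed to be minutes and seconds in the form "3m19".

```python
SENTENCES = 3000  # size of the test set reported above


def to_seconds(stamp):
    """Convert a 'MmSS' timing such as '3m19' into seconds."""
    minutes, seconds = stamp.split("m")
    return int(minutes) * 60 + int(seconds)


# (pop limit, decoding time, BLEU) exactly as reported in this message.
runs = [
    (5000, "15m45", 29.59),
    (1000, "4m27", 29.59),
    (500, "3m35", 29.59),
    (400, "3m19", 29.58),
    (200, "3m15", 29.51),
    (100, "3m00", 29.40),
]

for pop_limit, stamp, bleu in runs:
    rate = SENTENCES / to_seconds(stamp)
    print(f"{pop_limit:>5}  {stamp:>6}  {rate:5.1f} seg/s  BLEU {bleu:.2f}")
```

For the chosen pop limit of 400, 3000 sentences in 199 seconds works out to roughly 15 segments per second.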
What is the reason that online tools like Google / Bing are so much
faster?
It's not a machine issue, is it?
Cheers
Vincent
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support