Hi Vito
The tcmalloc message is normal.
Are you absolutely sure you are using the same model (and the same pre-
and post-processing)? A difference of 5 or 14 BLEU should be clearly
visible in the output. What do the outputs look like?
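One quick way to check is to compare the two decoder outputs segment by
segment. The sketch below is only an illustration: the function name and
the sample strings are made up, and it assumes each output is a list
with one translation per source segment.

```python
from itertools import zip_longest


def differing_segments(old_lines, new_lines):
    """Return (index, old, new) for every segment whose translation differs.

    Both arguments are sequences of strings, one translation per segment.
    A missing segment on either side is treated as an empty string.
    """
    diffs = []
    pairs = zip_longest(old_lines, new_lines, fillvalue="")
    for index, (old, new) in enumerate(pairs):
        old, new = old.strip(), new.strip()
        if old != new:
            diffs.append((index, old, new))
    return diffs


if __name__ == "__main__":
    # Hypothetical outputs of the older and newer decoder on two segments.
    old_output = ["funds", "this is a test"]
    new_output = ["test", "this is a test"]
    for index, old, new in differing_segments(old_output, new_output):
        print(f"segment {index}: old={old!r} new={new!r}")
```

If almost every segment differs, the model or the processing pipeline is
probably not the same; if only a few differ, the cause is more likely in
the decoder itself.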
cheers - Barry
On 26/11/15 09:58, Vito Mandorino wrote:
Hi Barry,
actually with the OnDisk table there is virtually no difference (0.2
average difference, whether or not re-tuning has been done).
With the compact phrase table, however, the difference is larger. The
latest test this morning shows a loss of 14 BLEU points without
re-tuning. I don't know what the cause could be.
Sometimes this message appears while loading the phrase tables:
tcmalloc: large alloc 1149427712 bytes == 0x28a54000 @
After re-tuning, however, the difference in BLEU score gets smaller even
with the compact phrase table.
Best regards,
Vito
2015-11-25 21:23 GMT+01:00 Barry Haddow <[email protected]>:
Hi Vito
The 0.2 difference is after retuning? That's normal then.
But a difference of 5 BLEU without retuning suggests a bug. Did
you say that this only happens with PhraseDictionaryMultiModel?
cheers - Barry
On 25/11/15 13:53, Vito Mandorino wrote:
Thank you. In our tests, with the OnDisk table the quality is
basically the same between the two versions of Moses (0.2 BLEU
difference on average), but with the CompactPhraseTable the
difference is larger (a loss of 2 BLEU points on average after
re-tuning with the new version of Moses, and more than 5 BLEU
points on average without re-tuning).
Do you think better quality would be obtained by running a
complete re-training of the model with the new version of Moses?
Best regards,
Vito
2015-11-24 16:31 GMT+01:00 Hieu Hoang <[email protected]>:
There was a change in the underlying data structure for
stacks: it changed from std::set (ordered) to
boost::unordered_set.
https://github.com/moses-smt/mosesdecoder/commit/6b182ee5e987a5b2823aea7eaaa7ef0457c6a30d
This gave some speed gains:
        1          5          10         15         20         25         30         35
56: 13/10 baseline
  real  4m57.795s  1m19.005s  0m51.636s  0m49.624s  0m49.869s  0m52.475s  0m53.806s  0m54.957s
  user  4m41.255s  5m45.086s  6m34.053s  8m12.430s  8m10.667s  8m16.486s  8m10.592s  8m13.859s
  sys   0m16.514s  0m35.494s  0m54.513s  1m10.643s  1m18.449s  1m21.738s  1m23.133s  1m25.048s
57: (56) + unordered set stack
  real  4m41.148s  1m16.002s  0m50.747s  0m48.711s  0m49.130s  0m51.473s  0m53.141s  0m54.513s
  user  4m23.968s  5m30.356s  6m26.167s  7m39.286s  7m56.229s  7m52.669s  7m56.978s  7m56.216s
  sys   0m17.231s  0m35.063s  0m54.081s  1m10.137s  1m17.194s  1m22.912s  1m25.948s  1m26.247s
However, the hypotheses are now added to the stack in a
different order, so there will be slight differences in results.
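To illustrate why insertion or iteration order alone can change the
result, here is a toy sketch in Python. It is not Moses code: the
function, the hypotheses and the scores are invented for illustration,
and the assumption is simply that ties between equally scored
hypotheses are resolved by whichever one is seen first.

```python
def best_hypothesis(stack):
    """Return the first (hypothesis, score) pair with the highest score.

    The strict '>' means that among tied hypotheses the one that comes
    first in iteration order wins.
    """
    best = None
    for hypothesis, score in stack:
        if best is None or score > best[1]:
            best = (hypothesis, score)
    return best


# Invented hypotheses; the first two deliberately have the same score.
hypotheses = [("as a", -1.5), ("funds", -1.5), ("test", -2.0)]

# Stand-in for an ordered container: a fixed, sorted iteration order.
ordered = sorted(hypotheses)

# Stand-in for a hash-based container: some other iteration order.
reordered = list(reversed(ordered))

if __name__ == "__main__":
    print(best_hypothesis(ordered))
    print(best_hypothesis(reordered))
```

Both calls return a hypothesis with the same score, but not the same
hypothesis. Effects of this kind should stay small, which is why a gap
of several BLEU points points to something else.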
On 24/11/2015 13:53, Vito Mandorino wrote:
Hi,
in some of our tests, a recent version of Moses (pulled from
GitHub last week) and an older one do not give the same
translations for the same source segment (with the same
moses.ini).
Here is the 5-best list for the translation of 'test' with
last week's version:
0 ||| test ||| LexicalReordering0= -1.1969 0 0 0 0 0 Distortion0= 0 LM0= -51.1788 WordPenalty0= -1 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -3.03811 -2.5834 -2.08503 -1.83075 ||| -1.27754
0 ||| testing ||| LexicalReordering0= 0 0 0 0 0 0 Distortion0= 0 LM0= -35.1495 WordPenalty0= -1 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -5.21045 -5.04877 -4.71131 -4.66382 ||| -1.70337
0 ||| funds ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -11.3753 WordPenalty0= -1 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -10.8209 -10.6835 -5.14555 -5.73388 ||| -1.77009
0 ||| known as a ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -58.8877 WordPenalty0= -3 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -4.42285 -11.9339 -5.14555 -18.0392 ||| -1.89152
0 ||| as a ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -35.5353 WordPenalty0= -2 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -9.34698 -11.9339 -5.14555 -9.14874 ||| -1.89159
and with the older version of Moses:
0 ||| funds ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -11.3753 WordPenalty0= -1 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -2.52548 -2.52544 -2.45544 -2.48609 ||| -0.815668
0 ||| as a ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -35.5353 WordPenalty0= -2 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -2.52464 -2.52565 -2.45544 -2.5244 ||| -0.953799
0 ||| as ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -34.1633 WordPenalty0= -1 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -2.5256 -2.52565 -2.45544 -2.48609 ||| -1.07254
0 ||| known as a ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -58.8877 WordPenalty0= -3 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -2.38597 -2.52565 -2.45544 -2.52573 ||| -1.07536
0 ||| is known as a ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 LM0= -80.8518 WordPenalty0= -4 PhrasePenalty0= 1 PhraseDictionaryMultiModel0= -2.37158 -2.52565 -2.45544 -2.52573 ||| -1.18753
This looks very strange. For the hypotheses that appear in both
lists, the only difference is in the phrase-table scores. Do you
have any idea what is going on? The only possibility that comes to
mind is a different handling of the PhraseDictionaryMultiModel
feature.
The moses.ini is attached.
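The comparison can also be done mechanically. Below is a minimal Python
sketch that parses two n-best entries and reports which features differ.
It assumes the "id ||| translation ||| features ||| total" layout shown
above; the helper names are made up for this example, and the two sample
entries are the 'funds' hypotheses from the lists above.

```python
def parse_nbest_line(line):
    """Split one n-best entry into (id, translation, features, total score).

    'features' maps each feature name to its list of float values.
    """
    fields = [field.strip() for field in line.split("|||")]
    sentence_id, translation, feature_str, total = fields[:4]
    features = {}
    name = None
    for token in feature_str.split():
        if token.endswith("="):          # e.g. "LM0=" starts a new feature
            name = token[:-1]
            features[name] = []
        elif name is not None:           # a score belonging to the current feature
            features[name].append(float(token))
    return int(sentence_id), translation, features, float(total)


def differing_features(line_a, line_b, tol=1e-6):
    """Return the sorted names of features whose values differ by more than tol."""
    features_a = parse_nbest_line(line_a)[2]
    features_b = parse_nbest_line(line_b)[2]
    changed = []
    for name in sorted(set(features_a) | set(features_b)):
        a, b = features_a.get(name), features_b.get(name)
        same = (
            a is not None
            and b is not None
            and len(a) == len(b)
            and all(abs(x - y) <= tol for x, y in zip(a, b))
        )
        if not same:
            changed.append(name)
    return changed


# The 'funds' hypothesis as scored by the newer and the older version.
NEW_FUNDS = (
    "0 ||| funds ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 "
    "LM0= -11.3753 WordPenalty0= -1 PhrasePenalty0= 1 "
    "PhraseDictionaryMultiModel0= -10.8209 -10.6835 -5.14555 -5.73388 ||| -1.77009"
)
OLD_FUNDS = (
    "0 ||| funds ||| LexicalReordering0= -3.1355 0 0 0 0 0 Distortion0= 0 "
    "LM0= -11.3753 WordPenalty0= -1 PhrasePenalty0= 1 "
    "PhraseDictionaryMultiModel0= -2.52548 -2.52544 -2.45544 -2.48609 ||| -0.815668"
)

if __name__ == "__main__":
    print(differing_features(NEW_FUNDS, OLD_FUNDS))
```

For this pair the only feature reported is PhraseDictionaryMultiModel0,
which matches what can be seen by eye.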
Best regards,
Vito
--
M. Vito MANDORINO -- Chief Scientist
The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel: +33 1 30 44 04 23  Mobile: +33 6 84 65 68 89
Email: [email protected]
Website: www.linguacustodia.com - www.thetranslationtrustee.com
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
Hieu Hoang
http://www.hoang.co.uk/hieu
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.