As Hieu requested, I performed the following:

1.       Filtered my existing europarl-fe-en model for the one wiki sentence

2.       Converted it to compact format

3.       Tested the two filtered models on that one sentence; the translations 
were identical.
So the filtered model does not reproduce the problem.
In case this helps: my model was trained with a quite old version of Moses (July 2012).
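
For anyone repeating the comparison in step 3 on more than one sentence, here is a minimal sketch. It assumes the output of each decoder run has been saved with one translation per line; the function name and the sample inputs are hypothetical and are not part of Moses.

```python
from itertools import zip_longest


def differing_lines(reference, candidate):
    """Return (line_number, reference_line, candidate_line) for each mismatch.

    Both arguments are iterables of translated sentences, one per input
    sentence, e.g. the lines of two decoder output files.
    """
    mismatches = []
    pairs = zip_longest(reference, candidate, fillvalue="")
    for number, (ref, cand) in enumerate(pairs, start=1):
        # Ignore leading/trailing whitespace so only real differences count.
        if ref.strip() != cand.strip():
            mismatches.append((number, ref.strip(), cand.strip()))
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample outputs; in practice, read these from the two files.
    binary_output = ["this is a test", "the principle of wiki"]
    compact_output = ["this is a test", "sur the principle of wiki"]
    for number, ref, cand in differing_lines(binary_output, compact_output):
        print(f"line {number}:\n  binary:  {ref}\n  compact: {cand}")
```

An empty result means the two runs agree on every sentence, which is what the filtered models showed here.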

From: [email protected] [mailto:[email protected]] On 
Behalf Of Marcin Junczys-Dowmunt
Sent: Thursday, July 04, 2013 4:47 PM
Cc: [email protected]
Subject: Re: [Moses-support] Compact phrase table produces different 
translations than original binary

Confirmed, something is indeed not working with the most current version. 
Phrase tables built with the current version work with an older version, but 
the current version produces false negatives during querying. I will take a 
closer look today in the evening. Thanks for posting this, Alexander.

On 04.07.2013 13:40, Hieu Hoang wrote:
It should work exactly the same, or give a better model score. If not, I'll be 
very worried.

Alex - is it possible to create a small model for us to reproduce this problem? 
Perhaps by filtering the phrase table and reordering table with the one input 
sentence?
On 4 July 2013 12:30, Marcin Junczys-Dowmunt 
<[email protected]<mailto:[email protected]>> wrote:
OK. Then I am going to investigate with the newest version. I have to admit I 
haven't tried it yet after Hieu's recent refactoring work. Should have some 
results shortly.

On 04.07.2013 13:26, Fishkov, Alexander wrote:
Hi, Marcin!
I have a rather new version of Moses. To double-check, I pulled the latest 
version from GitHub and rebuilt everything from scratch to repeat my experiments. 
Everything stays the same - different translations.

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Marcin Junczys-Dowmunt
Sent: Thursday, July 04, 2013 3:00 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: [Moses-support] Compact phrase table produces different 
translations than original binary

Hi Alexander,
both phrase tables should produce the same translations (they may differ once 
every many thousand sentences), so there is definitely something off. Your ini 
files look fine. How recent is your Moses version? Does this happen with the 
most recent one?
Best,
Marcin


On 04.07.2013 12:24, Fishkov, Alexander wrote:
Hi!

I recently wanted to convert existing Moses translation models into compact 
format to reduce memory usage. As far as I understood, the compact phrase table 
representation is a 'lossless compression' of the original phrase table that 
takes a long time to build. However, while testing an existing French-English 
model trained on the Europarl corpus I encountered disagreements in translation:

Original sentence: Wikipédia est un projet d'encyclopédie collective établie 
sur Internet, universelle, multilingue et fonctionnant sur le principe du wiki.
Compact translation: Wikipédia un project is a collective encyclopaedia 
établie sur Internet universal, multilingual and operating sur the principle of 
wiki.
Old translation: Wikipédia is a project of collective encyclopaedia established 
on the Internet, universal, multilingual and operating on the principle of wiki.

As can be seen, the compact translation model seems to know fewer words or 
phrases. Is it a bug, or is my understanding of the compact representation as 
'lossless' incorrect?

Best regards, Alexander.




_______________________________________________

Moses-support mailing list

[email protected]<mailto:[email protected]>

http://mailman.mit.edu/mailman/listinfo/moses-support






--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
