Hi!

I recently tried converting existing Moses translation models to the compact 
format to reduce memory usage. As I understand it, the compact phrase table 
is a 'lossless compression' of the original phrase table that takes a long 
time to build. However, while testing an existing French-English model 
trained on the Europarl corpus, I found that the two models disagree:

Original sentence: Wikipédia est un projet d'encyclopédie collective établie 
sur Internet, universelle, multilingue et fonctionnant sur le principe du wiki.
Compact translation:  Wikip?dia un project is a collective encyclopaedia 
?tablie sur Internet universal, multilingual and operating sur the principle of 
wiki.
Old translation: Wikip?dia is a project of collective encyclopaedia established 
on the Internet, universal, multilingual and operating on the principle of wiki.

As can be seen, the compact translation model seems to know fewer words or 
phrases. Is this a bug, or is my understanding of the compact representation 
as 'lossless' incorrect?
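
To make the difference concrete, here is a minimal Python sketch (my own 
helper, not part of Moses) that lists the source words each output passed 
through untranslated. It assumes whitespace tokenization, and it assumes that 
the '?' in the outputs stands for a mangled non-ASCII character, so the source 
is mangled the same way before comparing. The names norm_tokens and copied are 
hypothetical.

```python
# Rough check: which source words does each output copy through unchanged?
# Assumption: '?' in the outputs replaces a non-ASCII character, so the
# source sentence is normalized the same way before comparison.

def norm_tokens(text):
    """Return the set of lowercased whitespace tokens, with non-ASCII
    characters replaced by '?' and surrounding commas/periods stripped."""
    ascii_text = "".join(c if ord(c) < 128 else "?" for c in text)
    return {w.strip(".,").lower() for w in ascii_text.split()} - {""}

def copied(source, output):
    """Tokens that appear in both the source and the output."""
    return norm_tokens(source) & norm_tokens(output)

source = ("Wikipédia est un projet d'encyclopédie collective établie "
          "sur Internet, universelle, multilingue et fonctionnant sur "
          "le principe du wiki.")
compact = ("Wikip?dia un project is a collective encyclopaedia ?tablie "
           "sur Internet universal, multilingual and operating sur the "
           "principle of wiki.")
old = ("Wikip?dia is a project of collective encyclopaedia established "
       "on the Internet, universal, multilingual and operating on the "
       "principle of wiki.")

# Words copied through by the compact model but translated by the old one.
only_compact = sorted(copied(source, compact) - copied(source, old))
print(only_compact)  # ['?tablie', 'sur', 'un']
```

So the compact model leaves "un", "?tablie" and "sur" untranslated, while the 
original model translates all three. Words that both outputs share with the 
source (such as "Internet" and "wiki") are not counted.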

Best regards, Alexander.

Attachment: moses-compact.ini
Attachment: moses-original.ini
