Hi! I recently wanted to convert existing Moses translation models into the compact format to reduce memory usage. As far as I understand, the compact phrase table representation is a 'lossless compression' of the original phrase table that takes a long time to build. However, while testing an existing French-English model trained on the Europarl corpus, I encountered disagreements in translation:
Original sentence: Wikipédia est un projet d'encyclopédie collective établie sur Internet, universelle, multilingue et fonctionnant sur le principe du wiki.

Compact translation: Wikip?dia un project is a collective encyclopaedia ?tablie sur Internet universal, multilingual and operating sur the principle of wiki.

Old translation: Wikip?dia is a project of collective encyclopaedia established on the Internet, universal, multilingual and operating on the principle of wiki.

As can be seen, the compact translation model seems to know fewer words or phrases: "établie" and "sur" are left untranslated in the compact output, while the original model translates them. Is this a bug, or is my understanding of the compact representation as 'lossless' incorrect?

Best regards,
Alexander.
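P.S. In case it helps to reproduce the comparison, here is a minimal sketch of how the two outputs can be diffed token by token. It uses only Python's standard `difflib`; the helper name `token_diff` is my own and is not part of Moses, and the two strings are simply the translations quoted above.

```python
import difflib

# The two decoder outputs quoted in this message.
original = ("Wikip?dia is a project of collective encyclopaedia established "
            "on the Internet, universal, multilingual and operating on the "
            "principle of wiki.")
compact = ("Wikip?dia un project is a collective encyclopaedia ?tablie sur "
           "Internet universal, multilingual and operating sur the principle "
           "of wiki.")


def token_diff(a, b):
    """Return the non-matching token spans between two whitespace-tokenized strings.

    Hypothetical helper for illustration only. Each item is
    (operation, tokens_from_a, tokens_from_b).
    """
    a_tok, b_tok = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_tok, b_tok, autojunk=False)
    return [
        (tag, a_tok[i1:i2], b_tok[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Print every place where the compact output departs from the original one.
for tag, old, new in token_diff(original, compact):
    print(tag, old, "->", new)
```

Running the same comparison over a larger test set would show whether the untranslated source words are isolated cases or a systematic difference between the two models.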
Attachments: moses-compact.ini, moses-original.ini
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
