As Hieu requested, I performed the following:

1. Filtered my existing europarl-fe-en model for the one wiki sentence
2. Converted it to compact format
3. Tested the two filtered models on that sentence; the translations were identical

So the filtered model does not reproduce the problem.

If this helps: my model was trained with a quite old version of Moses (July 2012).

From: [email protected] On Behalf Of Marcin Junczys-Dowmunt
Sent: Thursday, July 04, 2013 4:47 PM
Cc: [email protected]
Subject: Re: [Moses-support] Compact phrase table produces different translations than original binary

Confirmed, something is indeed not working with the most current version. Phrase tables built with the current version work with an older version, but the current version produces false negatives during querying. I will take a closer look this evening. Thanks for posting this, Alexander.

On 04.07.2013 13:40, Hieu Hoang wrote:

It should work exactly the same, or give a better model score. If not, I'll be very worried.

Alex - is it possible to create a small model for us to reproduce this problem? Perhaps by filtering the phrase table and reordering table with the one input sentence.

On 4 July 2013 12:30, Marcin Junczys-Dowmunt <[email protected]> wrote:

OK. Then I am going to investigate with the newest version. I have to admit I haven't tried it yet after Hieu's recent refactoring work. I should have some results shortly.

On 04.07.2013 13:26, Fishkov, Alexander wrote:

Hi, Marcin!

I have a rather new version of Moses. To double-check, I pulled the most recent version from GitHub and rebuilt everything from scratch to repeat my experiments. Everything stays the same: different translations.
From: [email protected] On Behalf Of Marcin Junczys-Dowmunt
Sent: Thursday, July 04, 2013 3:00 PM
To: [email protected]
Subject: Re: [Moses-support] Compact phrase table produces different translations than original binary

Hi Alexander,

both phrase tables should produce the same translations (they may differ once in many thousands of sentences), so there is definitely something off. Your ini files look fine. How recent is your Moses version? Does this happen with the most recent one?

Best,
Marcin

On 04.07.2013 12:24, Fishkov, Alexander wrote:

Hi!

I recently wanted to convert existing Moses translation models into compact format to reduce memory usage. As far as I understood, the compact phrase table representation is a 'lossless compression' of the original phrase table that requires a lot of time. However, while testing an existing French-English model trained on the Europarl corpus, I encountered disagreements in translation:

Original sentence: Wikipédia est un projet d'encyclopédie collective établie sur Internet, universelle, multilingue et fonctionnant sur le principe du wiki.

Compact translation: Wikipédia un project is a collective encyclopaedia établie sur Internet universal, multilingual and operating sur the principle of wiki.

Old translation: Wikipédia is a project of collective encyclopaedia established on the Internet, universal, multilingual and operating on the principle of wiki.

As can be seen, the compact translation model seems to know fewer words or phrases. Is it a bug, or is my understanding of the compact representation as 'lossless' incorrect?

Best regards,
Alexander.
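A rough way to pinpoint which source words one phrase table failed to cover is to list the source tokens that appear verbatim in one output but not in the other. The sketch below is only an illustration: the helper names are hypothetical, it is not part of Moses, and verbatim matching also flags words that are legitimately identical in both languages (which is why the second output is used as a baseline). The sentences are the ones from the report, with accents restored.

```python
import re


def tokens(text):
    """Lowercased runs of letters (Unicode-aware, so accented words stay whole)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def passed_through(source, translation):
    """Source tokens that reappear verbatim in the translation, in source order."""
    translated = set(tokens(translation))
    seen = []
    for tok in tokens(source):
        if tok in translated and tok not in seen:
            seen.append(tok)
    return seen


def only_untranslated_in(source, suspect, reference):
    """Tokens copied through in `suspect` but not in `reference`."""
    baseline = set(passed_through(source, reference))
    return [t for t in passed_through(source, suspect) if t not in baseline]


source = ("Wikipédia est un projet d'encyclopédie collective établie sur "
          "Internet, universelle, multilingue et fonctionnant sur le principe du wiki.")
compact = ("Wikipédia un project is a collective encyclopaedia établie sur Internet "
           "universal, multilingual and operating sur the principle of wiki.")
old = ("Wikipédia is a project of collective encyclopaedia established on the "
       "Internet, universal, multilingual and operating on the principle of wiki.")

print(only_untranslated_in(source, compact, old))
# → ['un', 'établie', 'sur']
```

For this sentence the words left untranslated only by the compact model are short, frequent ones, which would be consistent with lookups failing at query time rather than with entries missing from the training data.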
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
