Dear all,

When I run tokenization on my files, it replaces the apostrophes with “'”, which makes sense. On the other hand, it completely breaks the meaning and the order of the words. For example:
Sentence before tokenization:
Src: keep your notification's payload under 5 kb.
Trg: اجعل حمولة الإعلام أقل من 5 كيلوبايت.

Sentence after tokenization:
Src: keep your notification ' s payload under 5 kb .
Trg: اجعل حمولة الإعلام أقل من 5 كيلوبايت .

If I translate “keep” without using tokenization, Moses generates “اجعل”, which is correct. After using tokenization, Moses generates “الإعلام” instead, which suggests that the alignment is broken.

Am I doing something wrong or missing something, or is this just normal behavior when using tokenization?

Thanks,
Best regards,
Ihab Ramadan | Senior Developer | Saudisoft - Egypt <http://www.saudisoft.com/>
Tel +2 02 330 320 37 Ext- 0 | Mob +201007570826 | Fax +20233032036
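
To make the example above easier to reproduce, here is a rough Python sketch of the kind of split shown in the tokenized sentences. It is not the Moses tokenizer script; the function name `rough_tokenize` and its two rules (space out apostrophes, detach sentence-final punctuation) are assumptions chosen only to mirror the before/after pair in this message.

```python
import re


def rough_tokenize(sentence: str) -> str:
    """Approximate the split seen in the example (hypothetical helper, not Moses)."""
    # Surround every apostrophe with spaces: "notification's" -> "notification ' s"
    text = re.sub(r"'", " ' ", sentence)
    # Detach sentence-final punctuation: "kb." -> "kb ."
    text = re.sub(r"([.!?])\s*$", r" \1", text)
    # Collapse any repeated whitespace introduced above
    return " ".join(text.split())


if __name__ == "__main__":
    print(rough_tokenize("keep your notification's payload under 5 kb."))
    print(rough_tokenize("اجعل حمولة الإعلام أقل من 5 كيلوبايت."))
```

The sketch only changes token boundaries; the words and their order stay the same, so the English sentence goes from 7 whitespace-separated tokens to 10 while the Arabic side gains one.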
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
