Hi hwidong,

This link lists a table of problematic characters and character sequences that must be removed or escaped before training a translation model, and before translating your new work:

http://www.precisiontranslationtools.com/index.php?option=com_content&view=article&id=94:are-there-characters-that-cause-problems-in-moses&catid=30:key-concepts&Itemid=57

DoMY includes two plugin modules, replace-escape-control.py and replace-unescape-control.py, that escape and unescape these characters.

Regards,
Tom

On Mon, 14 Feb 2011 13:29:54 +0800, Hieu Hoang <[email protected]> wrote:
> Hi hwidong
>
> You probably have to preprocess the corpus to get rid of < and >
> symbols, as well as [ and ] symbols
>
> Hieu
> Sent from my flying horse
>
> On 14 Feb 2011, at 11:30 AM, Hwidong Na <[email protected]> wrote:
>
>> Hi,
>>
>> When I extract hierarchical phrases using the EMS, the extraction
>> step crashed, and it seems to identify XML tags during the
>> extraction. For example, one of the error messages is
>>
>> ERROR: malformed XML: It was kept in the ice bath for 30 min , at
>> ambient temperature for 2 h and at < 0 " C for 18 h . It was then
>> diluted with CH2Cl2 , washed with water and brine , dried ( MgSO4 )
>> and concentrated .
>> no target (0) or source (43) words << end insentence 993688
>> T: It was kept in the ice bath for 30 min , at ambient temperature
>> for 2 h and at < 0 " C for 18 h . It was then diluted with CH2Cl2 ,
>> washed with water and brine , dried ( MgSO4 ) and concentrated .
>> S: 将 其 在 冰浴 中 放置 30 分钟 , 室温 放置 2 小时 , 然后 在 < 0 ℃
>> 下 放置 18 小时 。 将 其 用 CH2Cl2 稀释 , 用水 和 盐 水 洗涤 , 干燥
>> ( MgSO4 ) 并 浓缩 。
>>
>> The revision number is 3729. Should I update to the newest revision?
>>
>> Best regards,
>> --
>> Hwidong Na <[email protected]>
>> KLE lab, POSTECH, KOREA
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
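
For reference, the escape and unescape steps discussed in this thread could look roughly like the Python sketch below. This is not the code of DoMY's replace-escape-control.py or replace-unescape-control.py; the function names and the choice of entity replacements are assumptions made for illustration, and the table behind the link above should be treated as the authoritative list of characters to handle.

```python
# Hypothetical escape table (an assumption, not DoMY's actual mapping).
# "&" is listed first so that the "&" introduced by the other entities
# is never escaped a second time.
ESCAPES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("[", "&#91;"),
    ("]", "&#93;"),
    ("|", "&#124;"),
]


def escape_line(line):
    """Replace characters that the extractor may treat as markup."""
    for raw, entity in ESCAPES:
        line = line.replace(raw, entity)
    return line


def unescape_line(line):
    """Undo escape_line; entities are restored in reverse order."""
    for raw, entity in reversed(ESCAPES):
        line = line.replace(entity, raw)
    return line


def escape_file(src_path, dst_path):
    """Escape a tokenized corpus file line by line (UTF-8 assumed)."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(escape_line(line))
```

The same escaping would have to be applied to both sides of the training corpus and to any input sent to the decoder, with unescape_line run on the translated output afterwards.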
