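Hi Hwidong,

You probably have to preprocess the corpus to get rid of the < and > symbols, as well as the [ and ] symbols, on both the source and target sides. The extractor treats them as markup, which is why a tokenized sentence containing "< 0" is reported as malformed XML.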
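A minimal sketch of that preprocessing step in Python, in case it helps. The placeholder strings, function names and file handling are assumptions made for illustration, not something prescribed by the toolkit; any replacement that no longer contains the raw symbols should do. Each input line yields exactly one output line, so running it over both sides keeps the parallel corpus aligned.

```python
# Hypothetical replacement table: these placeholder strings are an assumption
# chosen for illustration. Pick whatever fits your pipeline.
REPLACEMENTS = str.maketrans({
    "<": "&lt;",
    ">": "&gt;",
    "[": "&#91;",
    "]": "&#93;",
})


def escape_line(line: str) -> str:
    """Replace the four problematic symbols in one tokenized line."""
    return line.translate(REPLACEMENTS)


def escape_corpus(src_path: str, dst_path: str) -> None:
    """Write an escaped copy of src_path to dst_path, line by line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(escape_line(line))
```

You would call escape_corpus once per side of the corpus before training, and apply the same replacement to any input you later translate so that it matches the training data.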
Hieu

Sent from my flying horse

On 14 Feb 2011, at 11:30 AM, Hwidong Na <[email protected]> wrote:

> Hi,
>
> When I extract hierarchical phrases using the EMS, the extraction
> step crashes, and it seems to identify XML tags during the extraction.
> For example, one of the error messages is
>
> ERROR: malformed XML: It was kept in the ice bath for 30 min , at
> ambient temperature for 2 h and at < 0 " C for 18 h . It was then
> diluted with CH2Cl2 , washed with water and brine , dried ( MgSO4 ) and
> concentrated .
> no target (0) or source (43) words << end insentence 993688
> T: It was kept in the ice bath for 30 min , at ambient temperature for 2
> h and at < 0 " C for 18 h . It was then diluted with CH2Cl2 , washed
> with water and brine , dried ( MgSO4 ) and concentrated .
> S: 将 其 在 冰浴 中 放置 30 分钟 , 室温 放置 2 小时 , 然后 在 < 0 ℃
> 下 放置 18 小时 。 将 其 用 CH2Cl2 稀释 , 用水 和 盐 水 洗涤 , 干燥
> ( MgSO4 ) 并 浓缩 。
>
> The revision number is 3729. Should I update to the newest revision?
>
> Best regards,
> --
> Hwidong Na <[email protected]>
> KLE lab, POSTECH, KOREA
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
