Hi hwidong

You probably have to preprocess the corpus to get rid of the < and >
symbols, as well as the [ and ] symbols.
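
A minimal sketch of such a preprocessing step, in Python. The entity-style
replacements are an assumption for illustration (dropping the symbols or
using other placeholders would work as well), and clean_line and
clean_corpus are hypothetical helper names, not part of Moses.

```python
# Assumed replacements: the only requirement is that the raw symbols
# disappear, so these entity-style placeholders are an illustrative choice.
TABLE = str.maketrans({
    "<": "&lt;",
    ">": "&gt;",
    "[": "&#91;",
    "]": "&#93;",
})


def clean_line(line: str) -> str:
    """Replace <, >, [ and ] in one tokenized sentence."""
    return line.translate(TABLE)


def clean_corpus(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of a corpus file, one output line per input line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Lines are never dropped or merged, so the parallel
            # corpus stays sentence-aligned.
            dst.write(clean_line(line))
```

Run it on both sides of the parallel corpus, since the source sentence in
your example contains a < as well.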

Hieu
Sent from my flying horse

On 14 Feb 2011, at 11:30 AM, Hwidong Na <[email protected]> wrote:

> Hi,
>
> When I extracted hierarchical phrases using the EMS, the extraction
> step crashed. It seems to be identifying XML tags during extraction.
> For example, one of the error messages is:
>
> ERROR: malformed XML: It was kept in the ice bath for 30 min , at
> ambient temperature for 2 h and at < 0 " C for 18 h . It was then
> diluted with CH2Cl2 , washed with water and brine , dried ( MgSO4 ) and
> concentrated .
> no target (0) or source (43) words << end insentence 993688
> T: It was kept in the ice bath for 30 min , at ambient temperature for 2
> h and at < 0 " C for 18 h . It was then diluted with CH2Cl2 , washed
> with water and brine , dried ( MgSO4 ) and concentrated .
> S: 将 其 在 冰浴 中 放置 30 分钟 , 室温 放置 2 小时 , 然后 在 < 0 ℃
> 下 放置 18 小时 。 将 其 用 CH2Cl2 稀释 , 用水 和 盐 水 洗涤 , 干燥
> ( MgSO4 ) 并 浓缩 。
>
> The revision number is 3729. Should I update to the newest revision?
>
> Best regards,
> --
> Hwidong Na <[email protected]>
> KLE lab, POSTECH, KOREA
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
