Hi all,

I managed to extract rules from a parsed parallel corpus (thanks to
Hieu!), but some of them contain fragments of XML markup that I
believe should not be there, for example:

( [pu] ||| <tree [pu] |||  ||| 0.0526316 1 0.025 1 2.718 ||| 19 40
, [pu] ||| </tree> [pu] |||  ||| 0.357143 1 0.0028169 1 2.718 ||| 14 1775
, [pu] ||| <tree [pron-pers] |||  ||| 0.166667 1 0.00056338 1 2.718 ||| 6 1775
</tree> <tree label="np"> <tree [np] ||| a base de cimento [np] |||  ||| 1 1 1 1 2.718 ||| 0.5 0.5
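
In case it helps to reproduce this, here is a minimal sketch of how
such entries can be spotted. It assumes the rule table is plain text
with fields separated by "|||" and that the first two fields are the
source and target sides. The helper name has_xml_residue and the regex
are my own illustration, not part of Moses, and the third rule in the
list is a made-up clean entry for contrast.

```python
import re

# Hypothetical helper, not part of Moses: flags rule-table lines whose
# source or target side still contains fragments of <tree> markup.
XML_FRAGMENT = re.compile(r"</?tree\b|label=")

def has_xml_residue(rule_line):
    """Return True if the source or target field contains tree markup."""
    fields = [f.strip() for f in rule_line.split("|||")]
    # Assumption: fields 0 and 1 are the source and target sides.
    return any(XML_FRAGMENT.search(f) for f in fields[:2])

rules = [
    "( [pu] ||| <tree [pu] |||  ||| 0.0526316 1 0.025 1 2.718 ||| 19 40",
    ", [pu] ||| </tree> [pu] |||  ||| 0.357143 1 0.0028169 1 2.718 ||| 14 1775",
    # Made-up clean rule, included only for contrast.
    "a [x] ||| b [x] |||  ||| 1 1 1 1 2.718 ||| 1 1",
]

suspicious = [r for r in rules if has_xml_residue(r)]
print(len(suspicious), "of", len(rules), "rules contain XML fragments")
```

Running the same check over the full rule table would show how
widespread the problem is.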

I've checked the XML and it seems to be all right.

Is this expected?

Thanks a lot,

Lucia
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
