A POS tagger will typically produce one tag per token, like
    POS1 POS2 POS3 POS4 ....
which corresponds to the sentence
   word1 word2 word3 word4

To get the format
   word1|POS1 word2|POS2 word3|POS3 word4|POS4

use the script
  scripts/training/combine_factors.pl
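
If it helps to see what that combination step amounts to, here is a minimal Python sketch. It is not the code of combine_factors.pl, only an illustration of the same idea, and it assumes the words and the tags are whitespace-separated with exactly one tag per word. The function name combine_factors and the sep parameter are made up for this example.

```python
def combine_factors(word_line, tag_line, sep="|"):
    """Pair each word with its tag: 'w1 w2' + 'T1 T2' -> 'w1|T1 w2|T2'.

    Hypothetical helper for illustration; assumes one tag per word.
    """
    words = word_line.split()
    tags = tag_line.split()
    # A length mismatch means the tagger tokenized the sentence differently,
    # so fail instead of silently misaligning words and tags.
    if len(words) != len(tags):
        raise ValueError(
            f"token count mismatch: {len(words)} words vs {len(tags)} tags"
        )
    return " ".join(f"{word}{sep}{tag}" for word, tag in zip(words, tags))


if __name__ == "__main__":
    print(combine_factors("word1 word2 word3 word4", "POS1 POS2 POS3 POS4"))
```

For a whole corpus you would apply this line by line to the word file and the tag file in parallel. The length check matters in practice, because a tagger that re-tokenizes its input will produce a different number of tags than words.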

On 28/03/2014 22:40, Viktor Pless wrote:
Hi everyone,
How can I produce factored training data like below?

"You will have to provide training data in the format
word0factor0|word0factor1|word0factor2 word1factor0|word1factor1|word1factor2 ..."

Could you please tell me what program produces this format? Neither MXPOST nor TreeTagger does it for me.

Thank you in advance.
Viktor


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

