The POS taggers will probably produce
   POS1 POS2 POS3 POS4 ....
which corresponds to the sentence
   word1 word2 word3 word4
To get the format
   word1|POS1 word2|POS2 word3|POS3 word4|POS4
use the script
   scripts/training/combine_factors.pl
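
In case it helps to see what the combination step amounts to, here is a rough Python sketch. It is not the combine_factors.pl script and makes no claim about how that script is implemented. The function name combine_factors is made up for illustration, and it assumes the sentence and the tagger output are whitespace-tokenized strings with exactly one tag per word.

```python
def combine_factors(word_line, *factor_lines, sep="|"):
    """Join a tokenized sentence with one or more aligned factor lines.

    Hypothetical helper for illustration; assumes whitespace tokenization
    and one factor token per word on every factor line.
    """
    words = word_line.split()
    factors = [line.split() for line in factor_lines]

    # Refuse to combine lines whose token counts differ, since a silent
    # misalignment would attach tags to the wrong words.
    for i, tokens in enumerate(factors, start=1):
        if len(tokens) != len(words):
            raise ValueError(
                f"factor {i} has {len(tokens)} tokens, expected {len(words)}"
            )

    # zip pairs the n-th word with the n-th token of every factor line.
    return " ".join(sep.join(parts) for parts in zip(words, *factors))


if __name__ == "__main__":
    print(combine_factors("word1 word2 word3 word4", "POS1 POS2 POS3 POS4"))
```

The length check is there because a tagger that re-tokenizes its input will produce a different number of tags than words, and that is easier to catch here than after training.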
On 28/03/2014 22:40, Viktor Pless wrote:
Hi everyone,
How can I produce factored training data like below?
"You will have to provide training data in the format
word0factor0|word0factor1|word0factor2
word1factor0|word1factor1|word1factor2 ..."
Could you please tell me what program produces this format?
Neither MXPOST nor treetagger does it for me.
Thank you in advance.
Viktor
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support