Hi,
What script do you guys use to generate sgm sets from a txt file?
I have tried makemteval.py in contrib,
but there are a few issues.
I think these lines:
lines = [l.replace('&quot;','\"').replace('&apos;','\'').replace('&gt;','>').replace('&lt;','<').replace('&amp;','&')
         for l in filein.read().splitlines()]
filein.close()
lines = [l.replace('&','&amp;').replace('<','&lt;').replace('>','&gt;').replace('\'','&apos;').replace('\"','&quot;')
         for l in lines]
are not 100% bulletproof.
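For comparison, here is a minimal sketch of the same decode-then-encode round trip built on the standard library's xml.sax.saxutils instead of chained replace() calls. The function name normalize_entities is hypothetical and is not part of makemteval.py; the sketch only assumes the goal is to end up with each of the five XML entities encoded exactly once. It may or may not explain the stray entities seen in the output.

```python
from xml.sax.saxutils import escape, unescape

# saxutils handles &amp;, &lt; and &gt; by default; quotes must be passed in.
_UNESCAPE_EXTRA = {"&quot;": '"', "&apos;": "'"}
_ESCAPE_EXTRA = {'"': "&quot;", "'": "&apos;"}


def normalize_entities(line):
    """Decode any entities already present, then encode the line once.

    Hypothetical helper, not taken from makemteval.py.
    """
    decoded = unescape(line, _UNESCAPE_EXTRA)  # resolves &amp; last
    return escape(decoded, _ESCAPE_EXTRA)      # encodes & first
```

Applied to a line that is already escaped, this returns the line unchanged, so running it twice should not double-encode anything.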
In the output I still get &apos; and such.
It does not handle the \r\n sequence, I think, since the output has more
lines than the txt file.
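One possible cause of the extra lines, offered as a guess rather than a diagnosis: depending on the Python version and string type, splitlines() breaks on more than \n and \r\n (for example a lone \r, a form feed, or U+2028). A minimal sketch that splits on \n only and strips the \r left over from \r\n could look like this. The name split_segments is hypothetical and not part of makemteval.py.

```python
def split_segments(text):
    """Split text into one segment per "\n"-terminated line.

    Hypothetical helper, not taken from makemteval.py.
    """
    # Split on "\n" only, then drop any trailing "\r" left by "\r\n".
    segments = [seg.rstrip("\r") for seg in text.split("\n")]
    # Text ending in a newline yields one empty trailing piece; drop it.
    if segments and segments[-1] == "":
        segments.pop()
    return segments
```

With this, the number of segments should match the count from `wc -l` for a file that ends in a newline, and blank lines inside the file are kept as empty segments.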
Maybe there is another script?
Thanks.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support