Hello everyone,

The eng-spa parallel corpora I am using (http://www.statmt.org/europarl/, http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz) have empty lines in one language or the other, because a sentence was split in two or two sentences were merged during translation. These empty lines cause errors during lexical training.

Is this common in parallel corpora? Or is there a clean parallel corpus out there?

Right now I am translating the sentences around (above and below) the empty lines and manually merging or splitting them. Is there a better way to do this?

Regards,
Vivek Vardhan Adepu
IRC: vivekvelda*/naan_dhaan*
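P.S. To make the question concrete, here is a minimal sketch of the automatic alternative I have in mind: drop every line pair where either side is empty, so the two files stay aligned line by line. The function names and the file paths in the usage note are hypothetical, and the sketch assumes both files are UTF-8 with one sentence per line. It does not repair the neighbouring lines, which may still be only loosely aligned after a split or merge.

```python
def drop_empty_pairs(src_lines, tgt_lines):
    """Keep only the line pairs where both sides contain non-whitespace text."""
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    kept_src, kept_tgt = [], []
    for src, tgt in zip(src_lines, tgt_lines):
        # A pair is dropped as a whole, so the line alignment is preserved.
        if src.strip() and tgt.strip():
            kept_src.append(src)
            kept_tgt.append(tgt)
    return kept_src, kept_tgt


def filter_corpus(src_path, tgt_path, out_src_path, out_tgt_path):
    """Read two aligned files, drop pairs with an empty side, write the rest.

    Returns the number of pairs that were dropped.
    """
    with open(src_path, encoding="utf-8") as f:
        src_lines = f.readlines()
    with open(tgt_path, encoding="utf-8") as f:
        tgt_lines = f.readlines()

    kept_src, kept_tgt = drop_empty_pairs(src_lines, tgt_lines)

    with open(out_src_path, "w", encoding="utf-8") as f:
        f.writelines(kept_src)
    with open(out_tgt_path, "w", encoding="utf-8") as f:
        f.writelines(kept_tgt)
    return len(src_lines) - len(kept_src)
```

It would be called as, for example, `filter_corpus("corpus.en", "corpus.es", "clean.en", "clean.es")`, where the four file names are placeholders. This trades a small amount of data for not having to fix each pair by hand.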
_______________________________________________ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff