Thank you.

Best regards,
Cyrine

2013/7/17 Philipp Koehn <[email protected]>

> Hi,
>
> the corpus filtering script that you are using expects a parallel corpus
> in the format of two files, with corresponding lines referring to parallel
> sentences. Hence, they need to have the same number of lines.
>
> You will get the quoted error message if the two files have different
> numbers of lines, which is not the right starting point for this process.
> This may be bad data, or you may have to run a sentence aligner first.
>
> -phi
>
> On Tue, Jul 16, 2013 at 6:48 AM, Cyrine NASRI <[email protected]> wrote:
> >
> > Hello,
> >
> > I'm trying to filter out long sentences using clean-corpus-n.perl, but it
> > dies after a while saying "europarl.tok.fr is too short!"
> >
> > This is what I do:
> >
> > clean-corpus-n.perl corpus.tok.low de en clean 1 50
> >
> > Could someone please tell me if there is something obvious that I'm
> > missing?
> >
> > Regards,
> >
> > Cyrine
> >
> > --
> > Cyrine NASRI
> > Ph.D. Student in Computer Science
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support

--
Cyrine NASRI
Ph.D. Student in Computer Science
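
The line-count requirement described in the reply can be checked before running the filtering script. The following is only a minimal sketch, not part of Moses: the helper names `count_lines` and `check_parallel` are hypothetical, and it assumes the two sides of the corpus are named by appending the language suffix to the stem, as in the quoted command (`corpus.tok.low.de` and `corpus.tok.low.en`).

```python
def count_lines(path):
    """Count the lines in a file, reading bytes so encoding problems cannot interfere."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def check_parallel(stem, src, tgt):
    """Return (source lines, target lines, whether the counts match).

    Assumes the files are named <stem>.<src> and <stem>.<tgt>.
    """
    n_src = count_lines(f"{stem}.{src}")
    n_tgt = count_lines(f"{stem}.{tgt}")
    return n_src, n_tgt, n_src == n_tgt


# Example call, assuming the files from the quoted command exist:
#   n_de, n_en, ok = check_parallel("corpus.tok.low", "de", "en")
#   if not ok:
#       print(f"line counts differ: de={n_de}, en={n_en}")
```

If the counts differ, the two files are not aligned line by line, and the corpus would need to be repaired or passed through a sentence aligner before filtering.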
