I see... Thanks, Tom :)

On 14 July 2011 at 17:23, Tom Hoar <[email protected]> wrote:
> Hi Rubén,
>
> Are you talking about MMM's Windows TMX extractor? If so, check your TMX
> file(s) to make sure they're not corrupted in some way, like not having
> equal TUV's within each TU.
>
> "ready-to-use" is always a subjective measure. TMX extractors have various
> degrees of support for the TMX format. Plus, the extractor can't improve on
> data that's inappropriate for SMT. TM's that come from translation memory
> systems contain a lot of data that's not relevant to SMT training.
>
> Tom
>
> On Thu, 14 Jul 2011 16:39:41 +0200, Rubén de la Fuente
> <[email protected]> wrote:
>
> > Dear all,
> > I am playing around with Moses for mere mortals (0.991). I have a very
> > large EN>ES TM. When I try to extract the corpus with the Windows add-in, it
> > turns out EN file has 255 559 lines as opposed to 262 858 in the ES one.
> > This will cause problems in the training phase. Any idea how to tackle it? I
> > thought the Windows add-in would provide ready-to-use corpora... Thanks in
> > advance for your help. Cheers,

-- 
Rubén de la Fuente
EN/FR>ES translator
[email protected]
+34 686 33 59 97
skype: rudelafuente
twitter: rubendelafuente <http://twitter.com/rubendelafuente>
http://es.linkedin.com/in/rubendelafuente
www.wordbonds.es
Wordbonds in Facebook <http://www.facebook.com/#!/pages/wordbonds/127510957570>
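
For anyone hitting the same line-count mismatch, here is a minimal sketch of the check suggested in the thread: keep only the TUs that carry both an EN and an ES TUV, so the two sides come out with the same number of lines. It is not the MMM extractor's own logic. The function name `aligned_pairs`, the language codes, and the toy TMX sample are assumptions made for illustration. Collapsing line breaks inside a segment is also a guess at one possible cause of diverging counts, not a confirmed diagnosis.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def _segment_text(tuv):
    """Return the text of a TUV's <seg>, with inline tags dropped and
    all whitespace (including line breaks) collapsed to single spaces."""
    seg = tuv.find("seg")
    if seg is None:
        return ""
    return " ".join("".join(seg.itertext()).split())


def aligned_pairs(tmx_root, src="en", tgt="es"):
    """Hypothetical helper: collect (source, target) pairs from TUs that
    contain non-empty segments for both languages; count the rest."""
    pairs, skipped = [], 0
    for tu in tmx_root.iter("tu"):
        by_lang = {}
        for tuv in tu.findall("tuv"):
            # Accept xml:lang (TMX 1.4) or the older plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            # "EN-US" -> "en"; keep the first TUV seen per language.
            by_lang.setdefault(lang.split("-")[0], _segment_text(tuv))
        source, target = by_lang.get(src, ""), by_lang.get(tgt, "")
        if source and target:
            pairs.append((source, target))
        else:
            skipped += 1
    return pairs, skipped


# Toy sample (header omitted): one good TU, one with no ES TUV,
# and one whose segments contain a line break.
SAMPLE = """<tmx version="1.4"><body>
<tu><tuv xml:lang="EN-US"><seg>Hello</seg></tuv><tuv xml:lang="ES-ES"><seg>Hola</seg></tuv></tu>
<tu><tuv xml:lang="EN-US"><seg>Orphan</seg></tuv></tu>
<tu><tuv xml:lang="EN-US"><seg>Line one
line two</seg></tuv><tuv xml:lang="ES-ES"><seg>Uno
dos</seg></tuv></tu>
</body></tmx>"""

pairs, skipped = aligned_pairs(ET.fromstring(SAMPLE))
print(len(pairs), skipped)  # prints "2 1"
```

Writing the first element of each pair to the EN file and the second to the ES file, one pair per line, gives equal line counts by construction. For a real TM, `ET.parse(path).getroot()` would replace the inline sample.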
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
