Hi Kevin,

A toy corpus such as the Block World Corpus is quite useful for getting an idea of how different translation systems behave. Later on you will likely move on to more meaningful test corpora.

It would of course be interesting to construct such a corpus. I've got some ideas on how to proceed. For real testing purposes you would need a large corpus. The problem is that in such a case you would have to rely on automatic measures of translation quality. And the BLEU score, for instance, tends to come out a bit too low for rule-based systems compared to statistical MT.

Yours,
Per Tunedal

On Fri, Aug 30, 2013, at 18:03, Kevin Brubeck Unhammer wrote:
> Per Tunedal <[email protected]> writes:
>
> [...]
>
> > Right, you seldom translate a Block World Text, but most words and
> > grammatical intrinsics are useful. The verbs take and put are in fact
> > very frequent.
>
> Of course they are, they're the only two verbs in your corpus.
>
> Or did you mean they're frequent outside the block world corpus? How did
> you come by that information? If you've already got a real frequency
> list, it's a waste of time to make a new one from text that you know is
> not natural.
>
> The same argument goes for grammatical constructions.
>
>
> --
> Kevin Brubeck Unhammer
>
> http://matt.might.net/articles/how-to-email/

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
