Yes, it might be a good idea to test on a more meaningful text. In fact, that's what I have planned for the next step.
I don't like that story, though. When I looked at it long ago, I found some of the translations inaccurate, and that made me suspect that the translations into languages I don't know well were bad too. So I simply skipped it.

The Block World Corpus is different. It is designed to illustrate some interesting translation problems. Once those are mastered, it's time to move on to more meaningful texts. That's why I believe it would help in developing new language pairs.

You can imagine the context as a kind of game with four players, two men and two women. The playing board has a lot of (sometimes overlapping) circles in different colours. Each player has some markers in the form of three-dimensional objects such as blocks, cones and arrows. A marker can be put in any circle, and the shape of the blocks allows other objects to be put on top of them.

Yours,
Per Tunedal

On Fri, Aug 30, 2013, at 11:12, Francis Tyers wrote:
> El dv 30 de 08 de 2013 a les 11:04 +0200, en/na Per Tunedal va escriure:
> > Hi again,
> > Thank you. I will dig into this.
> >
> > You didn't answer my question about what's wrong with the English
> > version of the Block World Corpus? It might be a good idea to improve
> > the language:
>
> It's not worth improving the language, it's worth using a different type
> of corpus.
>
> > > The English data for the corpus is kind of weird (borderline
> > > ungrammatical) in some places.
> >
>
> she puts my arrow on my red circle on my circle
> she puts a cone on the red circle on the red circle
> she puts the blue cone on the red circle on a circle
>
> These are weird.
>
> > Feel free to improve the English data. All improvements are welcome!
>
> The improvement would be to use some real data. Or if not real data, at
> least some made up data that can at least be more or less understood.
>
> How about trying your IBM Model 1 trained on the blockword corpus on the
> story we often translate:
>
> https://svn.code.sf.net/p/apertium/svn/nursery/apertium-dan-nor/dev/cuento.da.txt
>
> Here is the version in Danish.
>
> Fran
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
