Yes, it might be a good idea to test on a more meaningful text. In fact,
that's what I've planned for the next step.

I don't like that story, though. When I looked at it long ago, I found
some of the translations inaccurate. That made me suspect that some of
the translations into languages I didn't know very well were bad as
well, so I simply skipped it.

The Block World Corpus is different. It's designed to illustrate some
interesting translation problems. Once those are mastered, it's time to
move on to more meaningful texts. That's why I believe it would help in
developing new language pairs.

You can imagine the context as a kind of game with four players, two men
and two women. The playing board has a lot of circles in different
colours, some of them overlapping. Each player has some markers in the
form of three-dimensional objects such as blocks, cones and arrows. A
marker can be put in any circle, and the shape of the blocks allows
other objects to be put on top of them.

Yours,
Per Tunedal

On Fri, Aug 30, 2013, at 11:12, Francis Tyers wrote:
> On Fri, 30 Aug 2013 at 11:04 +0200, Per Tunedal wrote:
> > Hi again,
> > Thank you. I will dig into this.
> > 
> > You didn't answer my question about what's wrong with the English
> > version of the Block World Corpus? It might be a good idea to improve
> > the language:
> 
> It's not worth improving the language, it's worth using a different type
> of corpus.
> 
> > > The English data for the corpus is kind of weird (borderline
> > > ungrammatical) in some places. 
> 
> 
>  she puts my arrow on my red circle on my circle
>  she puts a cone on the red circle on the red circle
>  she puts the blue cone on the red circle on a circle
> 
> These are weird. 
> 
> > Feel free to improve the English data. All improvements are welcome!
> 
> The improvement would be to use some real data. Or if not real data, at
> least some made up data that can at least be more or less understood.
> How about trying your IBM Model 1, trained on the Block World Corpus, on
> the story we often translate:
> 
> https://svn.code.sf.net/p/apertium/svn/nursery/apertium-dan-nor/dev/cuento.da.txt
> 
> Here is the version in Danish. 
> 
> Fran
> 
> 
> ------------------------------------------------------------------------------
> Learn the latest--Visual Studio 2012, SharePoint 2013, SQL 2012, more!
> Discover the easy way to master current and previous Microsoft
> technologies
> and advance your career. Get an incredible 1,500+ hours of step-by-step
> tutorial videos with LearnDevNow. Subscribe today and save!
> http://pubads.g.doubleclick.net/gampad/clk?id=58040911&iu=/4140/ostg.clktrk
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
