On Fri, 30 Aug 2013 at 11:04 +0200, Per Tunedal wrote:
> Hi again,
> Thank you. I will dig into this.
> 
> You didn't answer my question about what's wrong with the English
> version of the Block World Corpus? It might be a good idea to improve
> the language:

It's not worth improving the language; it would be better to use a
different type of corpus.

> > The English data for the corpus is kind of weird (borderline
> > ungrammatical) in some places. 


 she puts my arrow on my red circle on my circle
 she puts a cone on the red circle on the red circle
 she puts the blue cone on the red circle on a circle

These are weird. 

> Feel free to improve the English data. All improvements are welcome!

The improvement would be to use some real data. Or, if not real data, at
least some made-up data that can be more or less understood.
How about trying your IBM Model 1, trained on the Block World corpus, on
the story we often translate:

https://svn.code.sf.net/p/apertium/svn/nursery/apertium-dan-nor/dev/cuento.da.txt

Here is the version in Danish. 

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
