On Fri, 30 Aug 2013 at 11:04 +0200, Per Tunedal wrote:
> Hi again,
> Thank you. I will dig into this.
>
> You didn't answer my question about what's wrong with the English
> version of the Block World Corpus? It might be a good idea to improve
> the language:
It's not worth improving the language; it's worth using a different type of corpus.

> > The English data for the corpus is kind of weird (borderline
> > ungrammatical) in some places.

  she puts my arrow on my red circle on my circle
  she puts a cone on the red circle on the red circle
  she puts the blue cone on the red circle on a circle

These are weird.

> Feel free to improve the English data. All improvements are welcome!

The improvement would be to use some real data. Or, if not real data, at least some made-up data that can be more or less understood.

How about trying your IBM Model 1, trained on the Block World corpus, on the story we often translate:

https://svn.code.sf.net/p/apertium/svn/nursery/apertium-dan-nor/dev/cuento.da.txt

Here is the version in Danish.

Fran
