Hi Francis,

I really like the idea "Make a program which tests Apertium data files for suspicious or unrecommended constructs (likely to be bugs)." For someone like me it's very easy to make a minor mistake when editing those bloody XML files :-) It's easy to miss a quotation mark (") or some other symbol (<>) that isn't all that important in ordinary language, or to omit the closing symbols at the right-hand end of an expression (/>).
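To make the kind of mistake concrete, here is a minimal sketch in Python of a check that reports a missing quotation mark together with the offending line. It is only an illustration under my own assumptions, not an existing Apertium tool: the function name check_well_formed and the sample entry are made up, and it only tests XML well-formedness using the standard library parser.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, otherwise a readable error report.

    Hypothetical helper for illustration only; it checks well-formedness,
    not whether the file is a valid Apertium dictionary.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is (line, column): line is 1-based, column is 0-based.
        line_no, column = err.position
        lines = text.splitlines()
        offending = lines[line_no - 1] if 0 < line_no <= len(lines) else ""
        # Show the parser message, the offending line, and a caret under the column.
        return (
            f"line {line_no}, column {column}: {err}\n"
            f"  {offending}\n"
            f"  {' ' * column}^"
        )
    return None


# Made-up sample: the closing quotation mark after n="n is missing.
broken = '<dictionary>\n  <e><p><l>hus</l><r>hus<s n="n/></r></p></e>\n</dictionary>\n'
print(check_well_formed(broken))
```

The point of the sketch is only that the parser already knows the line and column of the problem, so a wrapper can show the offending line instead of a bare error message.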
One way to improve checking would be not just to have separate programs like Jimmy O'Regan's lint tool for tsx files, but also to make the make scripts more explicit about errors: give some helpful hints about common errors, and print the offending line (or rather the offending expression?) with explicit info. This applies to the make scripts for dictionaries as well as for tagger training. The advantage is that everyone has to run the make script, whereas it's easy to forget to run a special tool, or simply not to be aware of its existence.

Regarding the make scripts for tagger training, it would be very welcome if they worked with comments in the tsx files. Working without comments complicates the work considerably. That's the main reason why I abandoned the work on retraining the tagger for the pair Swedish-Danish.

Yours,
Per Tunedal

On Thu, Feb 12, 2015, at 01:08, Francis Tyers wrote:
> Hello all,
>
> We've added some new ideas for GSOC:
>
> * Weighted transfer rules
> * Automatic blank handling
> * Integration and debugging tools for Grammatical Framework
> * Weights in lttoolbox
> * Improvements to the Apertium website
>
> Please don't feel shy about fleshing out the ideas and improving the
> descriptions. :D
>
> http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code
>
> We currently have thirteen ideas and could do with a few more. Something
> around seven or eight more would be good.
>
> Entry level: 3
> Medium: 5
> Hard: 5
>
> It would be good to have a mix, so 4 more entry level ones and two each
> medium and hard or so.
>
> Fran
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now.
> http://goparallel.sourceforge.net/
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
