We were given some clearer guidelines about the task list for GCI.
First of all, the tasks have to be as specific as possible. Because GCI
is a programme for kids under 18, it has to be structured as a
competition, and its tasks have to be relatively easy to do (so as not
to put off the younger kids, who are assumed to have shorter attention
spans). Open-ended descriptions are not helpful.
Bad:

  Area:         code
  Difficulty:   2. Medium
  Title:        Cross a language pair
  Description:  Take two language pairs, use apertium-crossdics and
                clean up the resulting bilingual dictionary. For
                instance, build a dictionary for Occitan-French from
                Occitan-Catalan and Catalan-French.
  Time (hours): 4-10
  People:       Francis Tyers, Jimregan

Good:

  Area:         code
  Difficulty:   2. Medium
  Title:        Cross a language pair: Occitan-French
  Description:  Using apertium-crossdics, build a dictionary for
                Occitan-French from Occitan-Catalan and
                Catalan-French, and clean up the result.
  Time (hours): 4-10
  People:       Francis Tyers

  Area:         code
  Difficulty:   2. Medium
  Title:        Cross a language pair: Aragonese-Catalan
  Description:  Using apertium-crossdics, build a dictionary for
                Aragonese-Catalan from Aragonese-Spanish and
                Spanish-Catalan, and clean up the result.
  Time (hours): 4-10
  People:       Jimregan

What you can do to help: go through the list of language pairs we
have, find a path where crossdics can be used to build a dictionary
you'd be interested in, and sign up to mentor it.
In fact, I think that this task set is one where just about everyone
on this list can help, so we'll probably compile a list of
possibilities in this thread and let you choose which ones you want to
take.
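
For anyone who would rather not search the list by eye, here is a minimal sketch in Python of how candidate paths could be listed. The PAIRS list is only an illustration built from the examples above (using the language codes oc, ca, fr, an, es); it is not our real list of pairs, and the function name crossable is made up for this sketch. It reports every (source, pivot, target) where both source-pivot and pivot-target exist but source-target does not.

```python
from itertools import combinations


def crossable(pairs):
    """Return sorted (source, pivot, target) triples that could be crossed.

    A triple is reported when source-pivot and pivot-target are both in
    `pairs` and there is no direct source-target pair yet.
    """
    existing = {frozenset(p) for p in pairs}

    # For each language, collect the languages it already has a pair with.
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    found = []
    for pivot, langs in neighbours.items():
        for a, b in combinations(sorted(langs), 2):
            if frozenset((a, b)) not in existing:
                found.append((a, pivot, b))
    return sorted(found)


# Illustrative input only: the pairs named in the examples above.
PAIRS = [("oc", "ca"), ("ca", "fr"), ("an", "es"), ("es", "ca")]

for source, pivot, target in crossable(PAIRS):
    print(f"{source}-{target} via {pivot}")
```

With the real list of pairs, the output would still need a human to decide which candidates are worth turning into tasks.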
To mentor this task set, you need to:
1) Be able to verify that the dictionary compiles
2) Be able to check for common crossing errors:
For example, when crossing the en-es entries:
<e><p><l>brother<s n="n"/></l><r>hermano<s n="n"/><s n="m"/></r></p></e>
<e><p><l>sister<s n="n"/></l><r>hermano<s n="n"/><s n="f"/></r></p></e>
with the es-pt entry:
<e><p><l>hermano<s n="n"/></l><r>irmão<s n="n"/></r></p></e>
you will get the en-pt entries (the last two are wrong):
<e><p><l>brother<s n="n"/></l><r>irmão<s n="n"/><s n="m"/></r></p></e>
<e><p><l>sister<s n="n"/></l><r>irmão<s n="n"/><s n="f"/></r></p></e>
<e><p><l>brother<s n="n"/></l><r>irmão<s n="n"/><s n="f"/></r></p></e>
<e><p><l>sister<s n="n"/></l><r>irmão<s n="n"/><s n="m"/></r></p></e>
In this case, you can use:
grep 'o<s n="n"/><s n="f"/></r>' apertium-en-es.en-es.dix
to find the potentially problematic entries (here, a right-side lemma
ending in -o that is tagged feminine). Others will not be as
straightforward.
Any 'extended' attributes (slr, srl, v, etc.) will cause similar
problems.
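
As a rough illustration of the check in 2), here is a minimal Python sketch that looks at the crossed output rather than the input. It flags any left-side entry that maps to the same right-side lemma with more than one set of tags, which is exactly what happens with brother/sister above. The function names and the SAMPLE wrapper element are made up for this sketch; it only reads the text before the first tag as the lemma, so multiword lemmas using <b/> and the extended attributes would need more work.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def side(elem):
    """Return (lemma, tags) for an <l> or <r> element."""
    lemma = elem.text or ""
    tags = tuple(s.get("n") for s in elem.findall("s"))
    return lemma, tags


def conflicting_entries(dix_xml):
    """Find left sides that map to one right lemma with differing tags.

    Returns a dict keyed by (left lemma, left tags, right lemma) whose
    values are the sets of right-side tag tuples seen, keeping only
    keys with more than one tag tuple.
    """
    root = ET.fromstring(dix_xml)
    seen = defaultdict(set)
    for p in root.iter("p"):
        left, right = p.find("l"), p.find("r")
        if left is None or right is None:
            continue
        l_lemma, l_tags = side(left)
        r_lemma, r_tags = side(right)
        seen[(l_lemma, l_tags, r_lemma)].add(r_tags)
    return {key: tags for key, tags in seen.items() if len(tags) > 1}


# The four crossed en-pt entries from the example, wrapped in a dummy root.
SAMPLE = """<section>
<e><p><l>brother<s n="n"/></l><r>irmão<s n="n"/><s n="m"/></r></p></e>
<e><p><l>sister<s n="n"/></l><r>irmão<s n="n"/><s n="f"/></r></p></e>
<e><p><l>brother<s n="n"/></l><r>irmão<s n="n"/><s n="f"/></r></p></e>
<e><p><l>sister<s n="n"/></l><r>irmão<s n="n"/><s n="m"/></r></p></e>
</section>"""

for key, variants in conflicting_entries(SAMPLE).items():
    print(key, sorted(variants))
```

This only tells you where to look; it cannot tell which of the conflicting entries is the correct one.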
--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff