Jim and all: I am preparing a letter to Erik as Apertium Prez. Fran[cis Tyers] showed me his post. We need to be part of it.
Mikel

On 05/01/2013 07:20 PM, Jimmy O'Regan wrote:
> On 1 May 2013 16:59, David Cuenca <[email protected]> wrote:
>> Dear all,
>>
>> Erik Möller, head of Engineering and Product Development in the Wikimedia
>> Foundation, started a thread on the Wikimedia mailing list about the
>> convenience or not of supporting open source machine translation. Original
>> thread:
>> http://lists.wikimedia.org/pipermail/wikimedia-l/2013-April/125350.html
>>
>> I suggested using software like Omegawiki or Wikidata as a frontend for
>> building grammar and language pair files that software like Apertium uses:
>> http://lists.wikimedia.org/pipermail/wikimedia-l/2013-April/125642.html
>>
> I guess the good news is that it's *already* feasible for us to build
> translators using Wikipedia... we do it all the time :) See, for
> example, the case of Spanish-Aragonese
> (http://www.lrec-conf.org/proceedings/lrec2012/pdf/326_Paper.pdf).
>
> We have a tool for extracting dictionaries from OmegaWiki, but it goes
> unused because of licence incompatibilities. We could wait and see if
> CC-BY-SA 4 adds the GPL as a compatible licence, but it might be
> better all round if we were to switch to CC-BY-SA for the dictionaries
> - the GPL is not a particularly suitable licence for dictionaries, and
> in particular has no waiver of database rights which could be used (in
> Europe, at least) to make modified dictionaries proprietary.
>
> But that's beside the point.
> Wikidata looks promising to me; the last
> time I had considered returning to education, I was going to propose
> as my project 'macro domain machine translation', which would have
> involved extracting translation rules specific to infobox properties
> (the simplest example would be for eye colour, where the translation
> ought to be in the plural, rather than the singular, translating from
> English), and changing Apertium to accept two sets of rules, unifying
> pattern matching (similarly to analysis), and choosing the second set
> of rules in the event of conflict. Wikidata looks set to provide more,
> and cleaner, data for such a task.
>
> What would be really excellent would be Wikidata integration into
> Wiktionary. I've been tinkering with DBpedia's Wiktionary extraction
> for a while now, and the data extracted is still quite noisy. It would
> be great if it wasn't necessary.
>

--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
