Re: [Apertium-stuff] MT and Wikipedia

Mikel Forcada Wed, 01 May 2013 11:47:16 -0700

Jim and all:
I am preparing a letter to Erik as Apertium Prez. Fran[cis Tyers] showed 
me his post. We need to be part of it.


Mikel

  On 05/01/2013 07:20 PM, Jimmy O'Regan wrote:
> On 1 May 2013 16:59, David Cuenca wrote:
>> Dear all,
>>
>> Erik Möller, head of Engineering and Product Development at the Wikimedia
>> Foundation, started a thread on the Wikimedia mailing list about whether
>> it is advisable to support open source machine translation. Original
>> thread:
>> http://lists.wikimedia.org/pipermail/wikimedia-l/2013-April/125350.html
>>
>> I suggested using software like Omegawiki or Wikidata as a frontend for
>> building grammar and language pair files that software like Apertium uses:
>> http://lists.wikimedia.org/pipermail/wikimedia-l/2013-April/125642.html
>>
> I guess the good news is that it's *already* feasible for us to build
> translators using Wikipedia... we do it all the time :) See, for
> example, the case of Spanish-Aragonese
> (http://www.lrec-conf.org/proceedings/lrec2012/pdf/326_Paper.pdf).
>
> We have a tool for extracting dictionaries from OmegaWiki, but it goes
> unused because of licence incompatibilities. We could wait and see if
> CC-BY-SA 4 adds the GPL as a compatible licence, but it might be
> better all round if we were to switch to CC-BY-SA for the dictionaries
> - the GPL is not a particularly suitable licence for dictionaries, and
> in particular it has no waiver of database rights, which could be used
> (in Europe, at least) to make modified dictionaries proprietary.
>
> But that's beside the point. Wikidata looks promising to me. The last
> time I considered returning to education, I was going to propose
> 'macro domain machine translation' as my project. It would have
> involved extracting translation rules specific to infobox properties
> (the simplest example is eye colour, where the translation from
> English ought to be in the plural rather than the singular). It would
> also have meant changing Apertium to accept two sets of rules, unifying
> pattern matching (similarly to analysis), and choosing the second set
> of rules in the event of conflict. Wikidata looks set to provide more,
> and cleaner, data for such a task.
>
> What would be really excellent would be Wikidata integration into
> Wiktionary. I've been tinkering with DBpedia's Wiktionary extraction
> for a while now, and the data extracted is still quite noisy. It would
> be great if that extraction weren't necessary.
>


-- 
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326


