On 7 March 2011 16:28, Antonio Toral <[email protected]> wrote:
> Hi,
>
> I'd like to add this idea:
>
>
> task: dictionary induction from wikis
> difficulty: 3. medium
> description: Extract dictionaries from linguistic wikis
> rationale: Wiki dictionaries and encyclopedias (e.g. omegawiki,
> wiktionary, wikipedia) contain information (e.g. bilingual equivalences,
> morphological features, conjugations, etc) that could be exploited to
> speed up the development of dictionaries for Apertium. This task aims at
> automatically building dictionaries by extracting different pieces of
> information from wiki structures such as interlingual links, infoboxes,
> etc.
> requirements: SQL, mediawiki syntax, perl, maybe C++ or Java
>

FWIW, there's a branch of dbpedia that has started to do something similar:
http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/e406efd61660
At the moment, it only targets de.wiktionary, but (where possible) it's
being designed to have templates independent of the code, so it should be
about as easy to adapt to new wiktionaries as dbpedia proper is to new
wikipedias (i.e., there are (unavoidably) some things that need to be added
to the code, but most information comes from the templates).
-- mostly Scala, some Java

There's also a Freedict-related project to extract TEI dictionaries from
the Russian wiktionary:
http://wiktionary-export.nataraj.su/en/about.html
-- Perl

There's also a Java-based parser for wikimedia-style wikis:
http://code.google.com/p/jwpl/

>
> Would anyone else be interested as mentor? How about you? :)

If someone's interested in working on the dbpedia framework, I've done some
work on it, and would be happy to mentor that.

--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
