On 6 March 2014 10:24, Gaurav Vaidya <gau...@ggvaidya.com> wrote:
> Hi everybody!
>

Hi!

> I’m really interested in the project to extract data from the Wikimedia
> Commons (http://wiki.dbpedia.org/gsoc2014/ideas#h359-6). If I understand this
> project correctly, the goal is to look at all the different ways in which
> metadata can be represented in the Commons (as categories, as templates, as
> links to articles in the Commons, on the language Wikipedias, and as external
> links to museum catalogues, Flickr and other sources of metadata) and then
> build a set of RDF structured data for each Commons file and category. This
> sounds like an excellent idea to me, and I’d like to help in any way I can!
>

There are also things like image annotations, such as:
http://commons.wikimedia.org/wiki/File:Solvay_conference_1927.jpg -- the
ImageNotes contain links to Wikipedia for languages that are not otherwise
represented.

> Please let me know if you have any questions for me! I tried to look for
> previous work on extracting data from the Commons on the developers’ mailing
> lists, and was unable to find anything; if there’s a thread in there that I
> missed, please let me know! Otherwise, I’ll keep poking around with the
> Extraction Framework and start coming up with a project plan for tackling
> this goal that I think might be doable in three months.
>

As far as I'm aware, it has not been discussed so far. You may also want to
take a look at the Wiktionary extraction module, as it seems a better fit to
the way information is represented on Commons.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
_______________________________________________
Dbpedia-gsoc mailing list
Dbpedia-gsoc@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc