On 11/25/2010 4:13 AM, José Emilio Mori Recio wrote:
> A real database implementation could cost a great effort, so maybe an
> easier "global templates" solution (in the same way Commons is available
> for the rest of the projects) should be considered, as there could be
> useful global templates apart from the data templates. Anyway, I think
> Wikidata is definitely something we must have. Answering to Michael Peel,
> if some concrete definition is needed about what Wikidata should be, I'd
> be glad to help in the process. I wrote more about that subject in the WMF
> list a few months ago
> (http://lists.wikimedia.org/pipermail/foundation-l/2010-May/058688.html),
> but with no luck.
>
     Ugh.  DBpedia gets 50% or so recall when extracting people and
cities from Wikipedia templates.  For people, the trouble is mostly
that a lot of them don't have infoboxes at all.  For places, DBpedia's
ruleset isn't complete enough to handle the hodge-podge of different
infoboxes that are used for locations all over the place.  And don't
get me started on all the nonstandard infoboxes for representing
geographic coordinates.  I've written my own extraction systems that
eat infoboxes and other templates, and it's always the same story:
it's pretty easy to get about 50% recall, but you've got to fight hard
for every percentage point you get past that.
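
     To make the recall problem concrete, here is a minimal sketch in
Python.  It is not DBpedia's extractor or any real ruleset: the
parameter names in LAT_KEYS and LON_KEYS, the helper functions, and the
four sample pages are all made up for illustration.  It shows how a
rule that handles "name = value" coordinate parameters silently misses
pages that use a nested coordinate template or have no infobox at all.

```python
import re

# Hypothetical parameter names; real infoboxes use many more variants.
LAT_KEYS = ("latitude", "lat", "latd")
LON_KEYS = ("longitude", "long", "lon", "longd")


def parse_template_params(wikitext):
    """Return {name: value} for flat 'name = value' template parameters."""
    params = {}
    pattern = r"\|\s*([^=|{}]+?)\s*=\s*([^|{}]*)"
    for match in re.finditer(pattern, wikitext):
        params[match.group(1).strip().lower()] = match.group(2).strip()
    return params


def extract_coordinates(wikitext):
    """Return (lat, lon) as floats, or None if this rule cannot find them."""
    params = parse_template_params(wikitext)
    lat = next((params[k] for k in LAT_KEYS if params.get(k)), None)
    lon = next((params[k] for k in LON_KEYS if params.get(k)), None)
    if lat is None or lon is None:
        return None
    try:
        return float(lat), float(lon)
    except ValueError:
        # Values such as degrees/minutes/seconds are not handled here.
        return None


def recall(pages):
    """Fraction of pages (all assumed to have coordinates) we extracted."""
    hits = sum(1 for page in pages if extract_coordinates(page) is not None)
    return hits / len(pages)


# Made-up sample pages, each standing in for a place with known coordinates.
SAMPLE_PAGES = [
    "{{Infobox settlement\n| name = Foo\n| latitude = 40.5\n| longitude = -3.7\n}}",
    "{{Infobox city|lat=48.85|long=2.35}}",
    "{{Infobox place\n| coordinates = {{coord|51|30|N|0|7|W}}\n}}",
    "Plain article text with the coordinates only mentioned in prose.",
]

if __name__ == "__main__":
    print(recall(SAMPLE_PAGES))
```

     Every new infobox convention means another rule in a sketch like
this, which is why each point of recall past the easy cases costs so
much effort.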

     So, when I hear talk about using MediaWiki templates for something
like this, it's like popping a paper bag behind the head of a Vietnam
vet.  This kind of project needs a database if it's going to be
useful.

     The other "elephant in the room" is Freebase.  Freebase, more or
less, is already a "data wiki" that's linked with Wikipedia.  Freebase
provides a reasonable interface for hand edits, and uses crowdsourcing
and machine learning techniques for data cleaning and autotyping.
Although there are many things DBpedia does better (having unique
titles for topics and good RDF), I almost always tell people who want
to get started with DBpedia to use Freebase instead.  One time I was
able to solve a problem in 40 minutes with Freebase that I'd spent two
weeks trying to do with DBpedia.


_______________________________________________
Commons-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/commons-l
