Hi all,

The PageLinks extraction code extracts wikilinks from the article
text. When MediaWiki turns one of these wikilinks into the equivalent
URL, it capitalises the first letter of the link target. DBpedia
isn't currently doing this. As a result, both
http://dbpedia.org/resource/stage_fright and
http://dbpedia.org/resource/Stage_fright are assigned separate
identifiers, although MediaWiki treats them as the same page. See [1]
and [2], which integrate the PageLinks versions into the rest of the
datasets.

This issue doesn't show up in the non-PageLinks datasets, so there
may already be code implemented for this somewhere else.
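
To illustrate the normalisation I mean, here is a rough sketch in
Python. It is not the DBpedia extraction code, and the function names
are made up for this example. It only assumes the behaviour described
above (first letter capitalised, rest of the title untouched) plus the
usual space-to-underscore mapping; namespaces, section anchors and
language-specific casing rules are not handled.

```python
# Hypothetical helper names; a sketch of the idea, not the DBpedia implementation.
DBPEDIA_RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def normalise_wikilink_target(target: str) -> str:
    """Approximate how MediaWiki normalises a wikilink target into a page title."""
    # Trim surrounding whitespace and use underscores instead of spaces.
    title = target.strip().replace(" ", "_").strip("_")
    if not title:
        return title
    # Capitalise only the first character; the rest stays case-sensitive.
    return title[0].upper() + title[1:]


def resource_uri(target: str) -> str:
    """Build a resource URI from a raw wikilink target."""
    return DBPEDIA_RESOURCE_PREFIX + normalise_wikilink_target(target)


if __name__ == "__main__":
    # Both spellings should end up with the same identifier.
    print(resource_uri("stage fright"))
    print(resource_uri("Stage_fright"))
```

With something like this applied to every link target in PageLinks,
the two spellings above would map to a single resource.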

Cheers,

Peter

[1] http://qut.bio2rdf.org/page/dbpedia:stage_fright
[2] http://qut.bio2rdf.org/page/dbpedia:Stage_fright

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion