Hi all,

The PageLinks extraction code extracts wikilinks from the article text. MediaWiki capitalises the first letter of a wikilink target when forming the corresponding URL, but DBpedia currently does not. As a result, http://dbpedia.org/resource/stage_fright and http://dbpedia.org/resource/Stage_fright are assigned separate identifiers even though MediaWiki treats them as the same page. See [1] and [2], which integrate the PageLinks versions into the rest of the datasets.

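To make the expected behaviour concrete, here is a minimal sketch in Python (not the extraction framework's own code). The function names are hypothetical, and it assumes the only normalisation needed is trimming whitespace, replacing spaces with underscores, and upper-casing the first character. Namespace prefixes and URL encoding are deliberately ignored.

```python
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def normalise_wikilink(target: str) -> str:
    """Hypothetical helper: approximate MediaWiki's first-letter rule."""
    # Trim whitespace and use underscores, as in page URLs.
    title = target.strip().replace(" ", "_")
    if not title:
        return title
    # Upper-case only the first character; the rest stays case-sensitive.
    return title[0].upper() + title[1:]


def resource_uri(target: str) -> str:
    """Hypothetical helper: build a DBpedia resource URI from a wikilink target."""
    return DBPEDIA_RESOURCE + normalise_wikilink(target)


if __name__ == "__main__":
    # Both spellings should end up with the same identifier.
    print(resource_uri("stage fright"))
    print(resource_uri("Stage_fright"))
```

With something like this applied to PageLinks targets, both spellings of the example above would map to a single identifier.
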
This issue doesn't show up in the non-PageLinks datasets, so there may already be code that handles this somewhere else.

Cheers,
Peter

[1] http://qut.bio2rdf.org/page/dbpedia:stage_fright
[2] http://qut.bio2rdf.org/page/dbpedia:Stage_fright
