Thanks for the report, Paul,

These kinds of cases have been haunting us since the beginning of DBpedia.

There are two issues here:
1) the flexibility of the mappings wiki for defining fine-grained
extraction rules
2) how uniformly the data is represented in Wikipedia, which determines
whether we can parse it correctly

For (1), we are working with Ghent University on moving the mappings to
RML, which will give us much greater flexibility.
Fixing (1) will not help us much with (2), though, since Wikipedia editors
might not put the right data in the right place and in the proper format.

For (2), we are working on integrating data from multiple sources
(including Wikidata) and resolving the conflicts between them.
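
A minimal Python sketch of what such conflict resolution could look like
for the release-date case. This is not the actual DBpedia fusion code. It
assumes the rule Paul describes (take the earliest release date) plus one
sanity check (reject dates before the creator was born). The function name,
the source labels, and all the dates are hypothetical placeholders.

```python
from datetime import date
from typing import Dict, List, Optional


def earliest_plausible_date(
    candidates: Dict[str, List[date]],
    not_before: date,
) -> Optional[date]:
    """Return the earliest candidate date that is not earlier than
    `not_before`, or None if no candidate survives the check."""
    plausible = [
        d
        for dates in candidates.values()  # one list of dates per source
        for d in dates
        if d >= not_before  # drop absurd values, e.g. a 1930s release
    ]
    return min(plausible) if plausible else None


# Illustrative placeholder values only, not real extraction output.
candidates = {
    "dbpedia": [date(1933, 1, 1), date(2015, 1, 29)],
    "wikidata": [date(2010, 8, 19)],
}
creator_birth = date(1980, 1, 1)  # hypothetical birth date

release = earliest_plausible_date(candidates, not_before=creator_birth)
print(release)  # 2010-08-19
```

The lower bound only catches the obviously absurd values. A wrong but
plausible date, such as the release date of a later remake, is only
corrected here if another source supplies an earlier one.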

Regarding Wikidata: the fact that Wikidata has this right and says the
value was taken from Wikipedia can be interpreted in different ways.
The value could have been entered manually, or there may be a video game
bot that is tweaked to parse these "awful blobs". Either way, it is good
that it is there :)

Cheers,
Dimitris


On Tue, Jan 24, 2017 at 6:14 PM, Paul Houle <paul.ho...@ontology2.com>
wrote:

> Here is another report of quality problems.  How to solve them is worth
> discussion.
>
> I'm making something simple that,  given a person,  looks up creative
> works they are responsible for,  looks up the dates of those creative
> works,  then subtracts the person's birth date from those dates to get
> the age at which they created something and then makes a report.
>
> When I GET the prolific character designer
>
> http://dbpedia.org/resource/Tsunako
>
> I can find the games she did art for by following the backward
> dbo:gameArtist links to the games she designed,  which are returned by
> the GET request.
>
> Then I can GET the games,  but when I do so,  the release dates are
> often incorrect,  for instance
>
> http://dbpedia.org/resource/Hyperdimension_Neptunia_Victory
>
> has a release date back in the 1930's,  which of course predates Tsunako
> and is invalid.  The root cause is that there is an awful blob in the
> infobox that contains multiple release dates in various geographic
> regions.  In my case,  the standard of quality is that I want the
> earliest release date but I'm not too excited if I am off by ±1 year.
> (She might have done the illustrations in the prior year,  etc.)
>
> The above one is obviously absurd and easy to catch,  but
>
> http://dbpedia.org/page/Hyperdimension_Neptunia_(video_game)
>
> exhibits a much more insidious error where it gets the 2015 release date
> of the Windows edition of a remade and heavily modified version
> (different combat system,   different world travel,  new voice acting,
> new music, ...) which might pass by you if you're not the kind of person
> who drinks Nep Bull.
>
> Interestingly,  Wikidata gets the release date right (by my definition)
>
> https://www.wikidata.org/wiki/Q1207525
>
> and claims it got it from Wikipedia.
>
> It's a run-of-the-mill kind of quality problem that affects users,  but
> it gets into all the questions of "what exactly do you want to model?"
> as clearly the Wikipedia editors are trying to model it at a very fine
> grain but users might want a spectrum of different granularities.
>
> --
>   Paul Houle
>   paul.ho...@ontology2.com
>
>   Try the Ontology2 Edition of DBpedia 2016-04:
>   https://aws.amazon.com/marketplace/pp/B01HMUNH4Q/
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> DBpedia-discussion mailing list
> DBpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>



-- 
Kontokostas Dimitris