Dear Hamid,

> On 22.03.2017 at 17:09, Hamid Ghofrani <ghof...@gmail.com> wrote:
> 
> Dear All,
> We collect additional entries from dbpedia live from time to time. I noticed 
> the number of entries has shrunk.

What do you mean by "entries"? rdf:type relationships?

> For example for Actor , we used to get about 40K actors from live updates 
> (http://dbpedia.org/ontology/Actor) 
>  
> now it is just down to 6500.

This might be because Wikipedia editors tend to use more generic templates, 
e.g. Person instead of Actor. DBpedia uses these templates to map the resource 
to a class from the DBpedia ontology, so a more generic template yields a more 
generic type.
This is a real issue: a lot of interesting information is being removed from 
infoboxes in a concerted way, e.g. notable works, influenced persons, and more.
It would be interesting to check whether (and which) relevant information from 
past versions could be carried forward so that it remains part of current 
releases.
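
To illustrate the effect, here is a minimal Python sketch. The mapping table 
and the class hierarchy excerpt are simplified assumptions made up for this 
example, not the actual DBpedia mappings, and types_for_template is a 
hypothetical helper.

```python
# Hypothetical, heavily simplified mapping: infobox template -> ontology class.
# The real DBpedia mappings are maintained separately and are far richer.
TEMPLATE_TO_CLASS = {
    "Actor": "dbo:Actor",
    "Person": "dbo:Person",
}

# Assumed excerpt of the class hierarchy (child -> parent), for illustration.
SUPERCLASS = {
    "dbo:Actor": "dbo:Artist",
    "dbo:Artist": "dbo:Person",
}


def types_for_template(template):
    """Return the class mapped to a template, followed by its superclasses."""
    cls = TEMPLATE_TO_CLASS.get(template)
    types = []
    while cls is not None:
        types.append(cls)
        cls = SUPERCLASS.get(cls)
    return types


print(types_for_template("Actor"))
print(types_for_template("Person"))
```

With the Actor template the resource gets dbo:Actor plus its superclasses; once 
the article switches to the Person template, dbo:Actor can no longer be derived 
from the template alone.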

> Another issue regarding entries
> We used to get
> <http://dbpedia.org/resource/Tom_Cruise> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://dbpedia.org/ontology/Actor>
> 
> from live Updates. It does not exist anymore, why?

I cannot say when this was the case. https://en.wikipedia.org/wiki/Tom_Cruise 
has used the Person template since at least January 2013 [1], so it should 
hardly have been typed as an actor since then. Nevertheless, this type could 
have been part of the releases, introduced by post-processing steps such as 
data cleansing and type inference.
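
To compare what each dataset actually contains, one could run the same two 
queries against the Live data and against a release. A minimal Python sketch 
that only builds the SPARQL query strings; the function names are hypothetical, 
and which endpoint you send the queries to is up to you:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def has_type_query(resource_iri, class_iri):
    """Build an ASK query: does the resource carry this rdf:type?"""
    return f"ASK {{ <{resource_iri}> <{RDF_TYPE}> <{class_iri}> }}"


def count_instances_query(class_iri):
    """Build a SELECT query counting distinct instances of a class."""
    return (
        "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { "
        f"?s <{RDF_TYPE}> <{class_iri}> }}"
    )


ACTOR = "http://dbpedia.org/ontology/Actor"
print(has_type_query("http://dbpedia.org/resource/Tom_Cruise", ACTOR))
print(count_instances_query(ACTOR))
```

Differing answers between the two datasets would point to the post-processing 
steps rather than to the template mapping.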

> Also why previous liveUpdate results were not added to the 2016 release? 

DBpedia Live and the releases are published separately, and their results are 
not mixed. DBpedia Live extracts the latest state from Wikipedia as soon as an 
article changes, so the extraction times of individual articles may differ 
slightly. The releases, on the other hand, are all based on a single Wikipedia 
dump, i.e. all articles have the same extraction time. Furthermore, the release 
cycle allows for more complex extractions, e.g. internationalization and page 
links, and for post-processing steps such as type inference, page rank, etc. 
These are not feasible given the constantly changing nature of DBpedia Live.

> Thank you very much

Best regards
Magnus

[1] https://en.wikipedia.org/w/index.php?title=Tom_Cruise&action=edit&oldid=530693566

-- 
Magnus Knuth

Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
Prof.-Dr.-Helmert-Str. 2-3
14482 Potsdam

Amtsgericht Potsdam, HRB 12184
Geschäftsführung: Prof. Dr. Christoph Meinel

tel:     +49 331 5509 547
email:   magnus.kn...@hpi.de
web:     http://www.hpi.de/
webID:   http://magnus.13mm.de/


_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
