Hi Christopher,

A curiosity:

On 4/21/2014 3:05 AM, Jona Christopher Sahnwaldt wrote:
> On 20 April 2014 18:58, Volha Bryl <[email protected]> wrote:
>> In fact,
>> SELECT COUNT(*) WHERE {?x ?y ?z}
>> executed against DBpedia SPARQL endpoint returns 825,761,509 at the moment.
>> And actually I am not sure that all datasets available at [5] are loaded
>> into the endpoint
> No, only certain datasets are loaded. They are listed here:
> http://wiki.dbpedia.org/DatasetsLoaded39
>
>> so the total number for English can be even bigger.
>>
>> Summarizing, [1,2] are good sources for getting numbers of things/instances.
>> For the number of triples - depends on what you want to count. For types and
>> properties refer to [1,2], for total number of triples - refer to SPARQL
>> endpoints for English and some other languages for which the endpoints
>> exist. Or go through the dumps and count :)
> The number of lines in each dataset file is listed in this file:
>
> https://github.com/dbpedia/extraction-framework/blob/master/scripts/src/main/data/lines-bytes-packed.txt
>
> There are a few comment lines in each file, so the number of triples
> is slightly lower, but not by much.
>
> I just counted the lines in all English NT files by the following
> command. (grep -v is necessary to remove a few files that contain
> almost the same triples as other files.)
>
> grep 'en/.*\.nt' lines-bytes-packed.txt | grep -vE 'unredirected|same_as|see_also|chapters|cleaned' | awk '{sum+=$3} END {print sum}'
>
> Result for en: 488 million triples.
> For all languages: 3.1 billion triples

Why, then, is the triple count according to the endpoint (see the query above) more than 800 million? Given your explanation (not all datasets are loaded into the endpoint), it should be the other way around.

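For anyone who wants to redo the file-based count without grep and awk, the quoted pipeline can be sketched in Python roughly as below. This is only a sketch under assumptions: I assume lines-bytes-packed.txt has whitespace-separated columns with the line count in the third one (which is what awk's $3 suggests), and the function name sum_lines is made up for illustration.

```python
import re

# English N-Triples files, same pattern as: grep 'en/.*\.nt'
ENGLISH_NT = re.compile(r"en/.*\.nt")
# Files that contain almost the same triples as other files,
# same list as: grep -vE 'unredirected|same_as|see_also|chapters|cleaned'
EXCLUDE = re.compile(r"unredirected|same_as|see_also|chapters|cleaned")


def sum_lines(rows, include=ENGLISH_NT, exclude=EXCLUDE, column=2):
    """Sum the line counts of all rows that match `include` and not `exclude`.

    ASSUMPTION: each row is whitespace-separated and the line count sits in
    the third field (index 2), mirroring awk's $3 in the quoted command.
    Rows whose third field is missing or not a plain integer are skipped.
    """
    total = 0
    for row in rows:
        if not include.search(row) or exclude.search(row):
            continue
        fields = row.split()
        if len(fields) > column and fields[column].isdigit():
            total += int(fields[column])
    return total
```

With a local copy of the file, something like `sum_lines(open("lines-bytes-packed.txt"))` should give the English total; passing `include=re.compile(r"\.nt")` instead would count the NT files of all languages. Like the awk version, this counts lines, so the few comment lines per file are included.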
Cheers,
Volha
