Hi

You may be interested in the rich dataset statistics that are reported as part 
of the Health Care and Life Sciences Community Profile for dataset 
descriptions; these extend the properties given in the VoID vocabulary.
https://www.w3.org/TR/hcls-dataset/#s6_6
The linked section describes each statistic reported and gives the SPARQL 
query used to generate its values.

Best regards,

Alasdair

On 13 Jul 2016, at 17:05, Jean-Claude Moissinac 
<jean-claude.moissi...@telecom-paristech.fr> wrote:

Many thanks John for the elegant solution.

My perception is that
select count(distinct ?r) where { ?r ?p ?l }
is semantically equivalent to
select (count(?s) as ?c) where { select distinct ?s where { ?s ?p []} }
Both give the count of distinct subjects in the graph, so the difference is 
only a result of the internal implementation. It therefore seems necessary to 
know a lot about the implementation to know how to get a result.
Am I wrong?
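
To make the question concrete, here is a minimal Python sketch over a toy 
in-memory list of triples. The triples and function names are invented for 
illustration; this models only what the queries compute, not how any SPARQL 
engine evaluates them. Both formulations return the number of distinct 
subjects, which can be smaller than the number of distinct nodes because 
resources that appear only as objects are not counted.

```python
# Toy graph of (subject, predicate, object) triples; all names are made up.
triples = [
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:q", "ex:c"),
    ("ex:b", "ex:p", "ex:c"),
    ("ex:b", "ex:p", "ex:d"),
]


def count_distinct_direct(triples):
    """Mirrors: select count(distinct ?r) where { ?r ?p ?l }"""
    return len({s for s, _p, _o in triples})


def count_distinct_subquery(triples):
    """Mirrors: select (count(?s) as ?c)
                where { select distinct ?s where { ?s ?p [] } }"""
    # Inner query: keep only the subject of each triple and deduplicate.
    distinct_subjects = set(s for s, _p, _o in triples)
    # Outer query: count the rows produced by the inner query.
    return sum(1 for _ in distinct_subjects)


def count_all(triples):
    """Mirrors: select count(?r) where { ?r ?p ?l } (one row per triple)"""
    return len(triples)


direct = count_distinct_direct(triples)
nested = count_distinct_subquery(triples)
total = count_all(triples)
```

On this toy data the two distinct counts agree, while the plain count equals 
the number of triples. Any difference in behaviour on a real endpoint would 
therefore come from the query plan rather than from the meaning of the query.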



--
Jean-Claude Moissinac


2016-07-06 15:55 GMT+02:00 John Walker <john.wal...@semaku.com>:
How about reformulating as:

select (count(?s) as ?c) where { select distinct ?s where { ?s ?p []} }

Which gives a result of 10515620 resources [1].

Regards,
John

[1] 
http://fr.dbpedia.org/sparql?default-graph-uri=&query=select+%28count%28%3Fs%29+as+%3Fc%29+where+%7B+select+distinct+%3Fs+where+%7B+%3Fs+%3Fp+%5B%5D%7D+%7D&format=text%2Fhtml&timeout=0&debug=on
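
The link in [1] is the query above, URL-encoded into the endpoint's query 
string. As a small sketch, it can be rebuilt with the Python standard library. 
The parameter names and values are copied from the link itself rather than 
from any endpoint documentation, and the function name is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://fr.dbpedia.org/sparql"
QUERY = "select (count(?s) as ?c) where { select distinct ?s where { ?s ?p []} }"


def build_query_url(endpoint, query):
    """Return a GET URL that submits `query` to `endpoint`.

    The parameters mirror the ones visible in link [1]; they are assumed,
    not taken from a specification.
    """
    params = {
        "default-graph-uri": "",
        "query": query,
        "format": "text/html",
        "timeout": "0",
        "debug": "on",
    }
    # urlencode percent-escapes reserved characters and turns spaces into "+".
    return endpoint + "?" + urlencode(params)


url = build_query_url(ENDPOINT, QUERY)
```

The same helper can be pointed at a different query string, for example the 
original count(distinct ?r) form, to compare how the endpoint responds.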


-----Original Message-----
From: Hugh Williams [mailto:hwilli...@openlinksw.com]
Sent: Wednesday, July 06, 2016 3:15 PM
To: Jean-Claude Moissinac <jean-claude.moissi...@telecom-paristech.fr>
Cc: public-lod <public-lod@w3.org>
Subject: Re: Size a linked open data set

Hi Jean-Claude,

The "select count(distinct ?r) where { ?r ?p ?l }" query is expensive in terms 
of database resources: servicing it requires a huge hash table to be created, 
which is causing it to time out under the settings configured on the instance 
by whoever maintains it.
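
The general cost difference can be illustrated with a short Python sketch. It 
shows only the textbook idea that a plain count needs a single counter while a 
distinct count must remember every value seen so far; it is not a description 
of how Virtuoso evaluates the query, and the synthetic subject names are 
assumptions made for the example.

```python
def count_rows(subjects):
    """Plain count: a single running counter, so memory use is constant."""
    n = 0
    for _ in subjects:
        n += 1
    return n, 1  # (result, number of values held in memory)


def count_distinct(subjects):
    """Distinct count: a hash set that grows with the number of distinct values."""
    seen = set()
    for s in subjects:
        seen.add(s)
    return len(seen), len(seen)  # (result, number of values held in memory)


def subject_stream(n_triples, n_subjects):
    """Yield the subject of each triple in a synthetic dataset."""
    for i in range(n_triples):
        yield f"ex:s{i % n_subjects}"


rows, rows_state = count_rows(subject_stream(5000, 1000))
distinct, distinct_state = count_distinct(subject_stream(5000, 1000))
```

With millions of distinct subjects, the set in the second function becomes the 
dominant cost, which matches the behaviour described above.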

On http://dbpedia.org/sparql, the original canonical English DBpedia endpoint 
that OpenLink Software hosts, we provide preloaded VoID datasets so that they 
don’t have to be queried each time (see http://dbpedia.org/void/Dataset), but 
the French DBpedia instance does not appear to have this, i.e. 
http://fr.dbpedia.org/void/Dataset

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.      //              http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

> On 6 Jul 2016, at 12:49, Jean-Claude Moissinac 
> <jean-claude.moissi...@telecom-paristech.fr> wrote:
>
> Hello
>
> In my work, I need to know the number of distinct resources in a dataset.
> For example, with dbpedia-fr, I'm trying
> select count(distinct ?r) where { ?r ?p ?l }
>
> and I always get a timeout error message,
> while with
> select count(?r) where { ?r ?p ?l }
> I get
> 185404575
>
> Is this a good way to measure the size of a dataset?
>
> --
> Jean-Claude Moissinac
>



Alasdair J G Gray
Fellow of the Higher Education Academy
Assistant Professor in Computer Science,
School of Mathematical and Computer Sciences
(Athena SWAN Bronze Award)
Heriot-Watt University, Edinburgh UK.

Email: a.j.g.g...@hw.ac.uk
Web: http://www.macs.hw.ac.uk/~ajg33
ORCID: http://orcid.org/0000-0002-5711-4872
Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair
