Re: group by is very slow

Andy Seaborne Wed, 19 Sep 2012 12:38:25 -0700

On 19/09/12 18:51, Yuhan Zhang wrote:

Hi all,


I store the categories of videos as triples in a TDB store, in the form
(?video_id ?category ?score).
I'd like to find videos with similar categories, given one video id.

select ?video_2 COUNT(*)
where {
  <http://onescreen.com/video/2901760> ?c ?score_1 .
  ?video_2 ?c ?score_2 .
}
group by ?video_2
  limit 100


Illegal syntax?
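
If the problem is the bare aggregate in the SELECT clause, SPARQL 1.1 requires an aggregate to be written as (expression AS ?variable). A minimal sketch of the same query in that form, where the variable name ?shared is an assumption chosen only for illustration:

```sparql
# Same graph pattern as the query above; only the SELECT clause changes.
# ?shared is a hypothetical name for the per-video count.
SELECT ?video_2 (COUNT(*) AS ?shared)
WHERE {
  <http://onescreen.com/video/2901760> ?c ?score_1 .
  ?video_2 ?c ?score_2 .
}
GROUP BY ?video_2
LIMIT 100
```

This addresses only the syntax; it does not by itself make the query any faster.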

However, this query with a group by was really slow and never completed.
There are about 21M triples in the same tdb.
The response was pretty fast when querying without a group by.

How could I make this query faster? Is SPARQL the right tool for this?

Your data modelling looks somewhat unusual. A join across the predicate (?c) is likely to cause an explosion in possibilities.


The LIMIT 100 applies after grouping - and the groups are likely huge.
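
Because the LIMIT is evaluated after GROUP BY, it cannot cut the join short. One possible way to bound the work, sketched here as an assumption rather than a recommendation, is to limit the categories of the starting video in a subquery before the join. The cap of 20 and the name ?shared are arbitrary illustrative choices, and without an ORDER BY the subquery keeps an arbitrary 20 categories.

```sparql
# Sketch only: the inner LIMIT bounds how many categories (?c) of the
# starting video take part in the join. It does NOT bound how many
# videos share each of those categories.
SELECT ?video_2 (COUNT(*) AS ?shared)
WHERE {
  {
    SELECT ?c
    WHERE { <http://onescreen.com/video/2901760> ?c ?score_1 }
    LIMIT 20
  }
  ?video_2 ?c ?score_2 .
}
GROUP BY ?video_2
ORDER BY DESC(?shared)
LIMIT 100
```

The result differs from the original query, since categories beyond the first 20 are ignored, so it is mainly useful for checking whether the size of the join is what makes the query slow.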

What do these two queries return?

select (count(distinct ?p) AS ?pCount) { ?s ?p ?o }

select distinct ?p { ?s ?p ?o } limit 10
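
A related diagnostic, offered as a sketch: counting the triples per predicate shows how many rows each value of ?c can contribute to the join. The variable name ?n is illustrative, and the query scans the whole store, so it may take a while on 21M triples.

```sparql
# Ten most frequent predicates with their triple counts.
SELECT ?p (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?n)
LIMIT 10
```

A few predicates with very large counts would be consistent with the join blowing up.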

        Andy



Thank you.

Yuhan
