The query on the large dataset also went through, after 30 minutes. The
difference is that I set JVM_ARGS=${JVM_ARGS:--Xmx2400M} inside
./fuseki-server; I'm not sure whether that is what fixed it.

Thanks Andy.

Yuhan

On Wed, Sep 19, 2012 at 1:03 PM, Yuhan Zhang <[email protected]> wrote:

> Oh! The query went through after a few minutes when I tested with a
> smaller dataset (7M triples).
>
> Yuhan
>
>
> On Wed, Sep 19, 2012 at 1:00 PM, Yuhan Zhang <[email protected]> wrote:
>
>> Hi Andy,
>>
>> It looks like the Fuseki server accepted that query without a syntax
>> error in my case. I'm running Fuseki 0.2.4.
>>
>>
>> The other two queries return quickly, within a few seconds:
>>
>>
>> select (count(distinct ?p) AS ?pCount) { ?s ?p ?o }
>>
>> ----------
>> | pCount |
>> ==========
>> | 10401  |
>> ----------
>>
>>
>> select distinct ?p { ?s ?p ?o } limit 10
>>
>> -----------------------------------------------------
>> | p                                                 |
>> =====================================================
>> | </award>                                          |
>> | </award/award_nominated_work>                     |
>> | </award/ranked_item>                              |
>> | </base>                                           |
>> | </base/animemanga>                                |
>> | </base/animemanga/anime_manga_character>          |
>> | </base/animemanga/topic>                          |
>> | </base/argumentmaps>                              |
>> | </base/argumentmaps/possibly_correlated_thing>    |
>> -----------------------------------------------------
>>
>>
>> Maybe the query is too big: it tries to pair the given item with the
>> other items and count the number of categories they have in common.
>>
>> Thanks.
>>
>> Yuhan
>>
>>
>> On Wed, Sep 19, 2012 at 12:37 PM, Andy Seaborne <[email protected]> wrote:
>>
>>> On 19/09/12 18:51, Yuhan Zhang wrote:
>>>
>>>> Hi all,
>>>>
>>>> I keep the categories of videos as triples in a TDB store, in the format
>>>> (?video_id ?category ?score).
>>>> I'd like to find videos with similar categories, given one video id.
>>>>
>>>> select ?video_2 COUNT(*)
>>>> where {
>>>>   <http://onescreen.com/video/2901760> ?c ?score_1 .
>>>>   ?video_2 ?c ?score_2 .
>>>> }
>>>> group by ?video_2
>>>>   limit 100
>>>>
>>>
>>> Illegal syntax?
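>>>
>>> For reference, standard SPARQL 1.1 requires an aggregate in the SELECT
>>> clause to be projected as (expr AS ?var), which the bare COUNT(*) above
>>> is not. A minimal sketch of a standard form follows; the variable name
>>> ?shared and the ORDER BY clause are illustrative additions, not part of
>>> the original query:
>>>
>>> ```sparql
>>> # Count, per candidate video, the categories shared with the given video.
>>> SELECT ?video_2 (COUNT(*) AS ?shared)
>>> WHERE {
>>>   <http://onescreen.com/video/2901760> ?c ?score_1 .
>>>   ?video_2 ?c ?score_2 .
>>> }
>>> GROUP BY ?video_2
>>> ORDER BY DESC(?shared)   # assumption: most-overlapping videos first
>>> LIMIT 100
>>> ```
>>>
>>> Without an ORDER BY, the LIMIT would return an arbitrary 100 groups
>>> rather than the most similar videos.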
>>>
>>>
>>>> However, this query with a group by was really slow and never completed.
>>>> There are about 21M triples in the same tdb.
>>>> The response was pretty fast when querying without a group by.
>>>>
>>>> How could I make this query faster? Is SPARQL the right tool for this?
>>>>
>>>
>>> Your data modelling looks somewhat unusual.  A join across the predicate
>>> (?c) is likely to cause an explosion in possibilities.
>>>
>>> The LIMIT 100 applies after grouping - and the groups are likely huge.
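>>>
>>> One way to gauge how large that join gets, as a rough sketch only: count
>>> how many rows each category of the given video contributes. The name ?n
>>> is illustrative, and this still evaluates the same join, so it may
>>> itself be slow on the full dataset:
>>>
>>> ```sparql
>>> # Rows contributed to the join by each category of the given video.
>>> SELECT ?c (COUNT(?video_2) AS ?n)
>>> WHERE {
>>>   <http://onescreen.com/video/2901760> ?c ?score_1 .
>>>   ?video_2 ?c ?score_2 .
>>> }
>>> GROUP BY ?c
>>> ORDER BY DESC(?n)
>>> ```
>>>
>>> A few categories with very large counts would suggest that those
>>> categories dominate the cost of the grouped query.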
>>>
>>> What do these two queries return?
>>>
>>> select (count(distinct ?p) AS ?pCount) { ?s ?p ?o }
>>>
>>> select distinct ?p { ?s ?p ?o } limit 10
>>>
>>>         Andy
>>>
>>>>
>>>>
>>>> Thank you.
>>>>
>>>> Yuhan
>>>>
>>>>
>>>
>>
>>
>> --
>> Yuhan Zhang
>> Senior Software Engineer
>> OneScreen Inc.
>> [email protected] <[email protected]>
>> www.onescreen.com
>> (949) 525-4825 Ext: 177
>>
>>
>> The information contained in this e-mail is for the exclusive use of the
>> intended recipient(s) and may be confidential, proprietary, and/or legally
>> privileged. Inadvertent disclosure of this message does not constitute a
>> waiver of any privilege.  If you receive this message in error, please do
>> not directly or indirectly print, copy, retransmit, disseminate, or
>> otherwise use the information. In addition, please delete this e-mail and
>> all copies and notify the sender.
>>
>
>
>
>
