On 5/9/14 7:11 AM, Bart Vandewoestyne wrote:
On 2014-05-07 12:52, Kingsley Idehen wrote:
> On 5/7/14 4:37 AM, Bart Vandewoestyne wrote:
>> Hello list,
>>
>> I'm confused. I have a SPARQL-query of the following form (slightly
>> obfuscated because of NDA-restrictions):
>>
>> SELECT ?val (COUNT(?id) as ?vc)
>> WHERE
>> {
>>   ?id <http://foo.bar/description> ?val.
>>   ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>>       <http://foo.bar/SomeType>.
>> }
>> GROUP BY ?val
>> ORDER BY DESC(?vc)
>> LIMIT 15
>>
>> When running this query multiple times, it returns different values for
>> the ?vc counts... so it seems not deterministic. I don't see why. Is
>> this query really not deterministic???
>>
>> I'm running this on Virtuoso Version 7.1.1-dev.3208-pthreads as of Apr
>> 17 2014.
>>
>> If this is not the right mailinglist to pose this question, please let
>> me know the appropriate channel for this type of question.
>>
>> Kind regards,
>> Bart
>
> Please check the query timeout settings. This could be in the /sparql UI
> or in the INI (see the [SPARQL] section).
>
> Basically, Virtuoso has an "Anytime Query" feature whereby query
> solutions are produced relative to query timeouts. Thus, you can
> increase the timeout to ensure the SPARQL solution isn't comprised of
> what would appear to be partial results.
>
> Remember, unlike SQL, SPARQL is about propositions in an Open World etc..

Hello Kingsley,

In the [SPARQL] section of my virtuoso.ini, I have the MaxQueryCostEstimationTime commented out, and the MaxQueryExecutionTime is set to 600 (seconds, according to the docs). For as far as I know, I am not using a command like

    set result_timeout == <expression>;

as described in the documentation in section "Anytime queries". The database I'm working with is 142 GB large and stored locally on my disk. I am not changing or updating any data, just doing select queries.

My full config file is online at https://www.dropbox.com/sh/hm0nj8q0j6pnx1k/AABaXTFHDy5JrZUTenJuGqqva/virtuoso_ini.txt

The query that I'm talking about, takes about 14 to 18 seconds, way below the 600 seconds MaxQueryExecutionTime, so I don't think that's the problem.

I am now testing with the latest development branch, being Version 07.10.3208-pthreads (commit bea4a6da40258afeebf4be3e18f299ec8f11967c). Furthermore if I test with an older version (07.00.3203-pthreads for Linux as of Mar 26 2014, commit 48f0ef879b913c5d3b306c1f83390079c5416fe6) then different runs of the same query *do* return the same ?vc counts.

Any suggestion?

Kind regards,
Bart
If you have a static DBMS, i.e., one in which triples are not being altered in real time, a SPARQL aggregate solution should be consistently the same. That holds as long as the time it takes to produce the solution is less than the query timeout.
An "Anytime Query" is like a quiz: you are asked a question but given a specific amount of time to answer. In the case of Virtuoso, this means you can get a partial solution for aggregates when the timeout is hit. If you retry (i.e., your opponent couldn't answer and the question comes back to you), the solution can be different.
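To pin down which counts actually move between runs, the same query can be repeated and the results compared. The following is only a minimal sketch in Python, not anything shipped with Virtuoso: `unstable_counts` and `run_query` are hypothetical names, and `run_query` stands for whatever you use to execute the query (the /sparql endpoint, isql, ...) and return the (?val, ?vc) rows. The demo data at the bottom is made up.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, int]  # one (?val, ?vc) row of the GROUP BY query


def unstable_counts(
    run_query: Callable[[], Iterable[Row]], runs: int = 5
) -> Dict[str, List[int]]:
    """Execute run_query `runs` times and return every ?val whose ?vc
    differed between runs, or which was missing from some run (for
    example because LIMIT pushed it out of the result)."""
    seen: Dict[str, List[int]] = defaultdict(list)
    for _ in range(runs):
        for val, count in run_query():
            seen[val].append(count)
    return {
        val: counts
        for val, counts in seen.items()
        if len(counts) < runs or len(set(counts)) > 1
    }


if __name__ == "__main__":
    # Made-up results standing in for three executions of the query.
    fake_runs = iter([
        [("a", 10), ("b", 7)],
        [("a", 10), ("b", 5)],
        [("a", 10), ("b", 7)],
    ])
    print(unstable_counts(lambda: next(fake_runs), runs=3))
```

An empty result means every run agreed. A non-empty result on a static store suggests some runs returned partial aggregates, or that something else is wrong.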
Are you using /sparql or the isql command line to perform these queries? How many triples do you have in the Virtuoso Quad Store?

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users