Re: [d2rq-dev] D2RQ bad performance in Jena

Richard Cyganiak Wed, 03 Apr 2013 02:36:56 -0700

How do you set up the model you're querying?

Richard
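
An observation on the SQL quoted below: only the first statement carries the "relation" = 'isLeaderOf' condition, while the second has no selective predicate at all. One variant that may be worth trying is to put the literal directly into the triple pattern instead of constraining ?z with a FILTER. This is an untested assumption, not a confirmed fix. The sketch below only builds the two query strings in plain Java. The class and method names are hypothetical, and whether D2RQ generates better SQL for the second form would have to be checked against the logged SQL.

```java
// Hypothetical helper: builds the original query and an inline-literal variant.
// Assumes `relation` is a trusted constant without quote characters.
final class LeaderQuery {
    static final String PREFIX = "PREFIX vocab: <http://www.vocab.de/vocab/> ";

    // Original form from the thread: ?z is bound first, then filtered.
    static String withFilter(String relation) {
        return PREFIX + "SELECT ?x ?y ?z { "
            + "?id vocab:port39390_facts_relation ?z. "
            + "?id vocab:port39390_facts_arg1 ?x. "
            + "?id vocab:port39390_facts_arg2 ?y. "
            + "FILTER (?z = '" + relation + "') }";
    }

    // Variant: the literal sits in the triple pattern, so no FILTER is needed.
    // ?z is dropped from the projection because its value is the constant.
    static String withInlineLiteral(String relation) {
        return PREFIX + "SELECT ?x ?y { "
            + "?id vocab:port39390_facts_relation '" + relation + "'. "
            + "?id vocab:port39390_facts_arg1 ?x. "
            + "?id vocab:port39390_facts_arg2 ?y. }";
    }

    public static void main(String[] args) {
        System.out.println(withFilter("isLeaderOf"));
        System.out.println(withInlineLiteral("isLeaderOf"));
    }
}
```

Either string can be passed to the same QueryExecutionFactory.create(query, RDFModel) call shown in the original message, which makes it easy to compare the SQL that gets logged for each form.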



On 2 Apr 2013, at 12:44, Anastasiya Goncharova <slmn...@gmail.com> wrote:

> I think I should also provide the query that I run.
> 
> The SPARQL query is:
> 
> PREFIX vocab: <http://www.vocab.de/vocab/> SELECT ?x ?y ?z { ?id 
> vocab:port39390_facts_relation ?z. ?id vocab:port39390_facts_arg1 ?x. ?id 
> vocab:port39390_facts_arg2 ?y.  FILTER (?z = 'isLeaderOf')}
> 
> D2RQ rewrites this query into two SQL queries (as logged to the console):
> 
> 11:46:01 INFO  SQLIterator          :: SELECT DISTINCT 
> "T2_port39390_facts"."arg2", "T1_port39390_facts"."id", 
> "T3_port39390_facts"."arg1" FROM "port39390"."facts" AS "T1_port39390_facts", 
> "port39390"."facts" AS "T2_port39390_facts", "port39390"."facts" AS 
> "T3_port39390_facts" WHERE ("T1_port39390_facts"."id" = 
> "T2_port39390_facts"."id" AND "T1_port39390_facts"."relation" = 'isLeaderOf' 
> AND "T1_port39390_facts"."relation" IS NOT NULL AND 
> "T2_port39390_facts"."arg2" IS NOT NULL AND "T2_port39390_facts"."id" = 
> "T3_port39390_facts"."id" AND "T3_port39390_facts"."arg1" IS NOT NULL)
> 
> 11:46:01 INFO  SQLIterator          :: SELECT DISTINCT 
> "T1_port39390_facts"."id", "T4_port39390_facts"."arg2", 
> "T2_port39390_facts"."relation", "T3_port39390_facts"."arg1" FROM 
> "port39390"."facts" AS "T1_port39390_facts", "port39390"."facts" AS 
> "T4_port39390_facts", "port39390"."facts" AS "T2_port39390_facts", 
> "port39390"."facts" AS "T3_port39390_facts" WHERE ("T1_port39390_facts"."id" 
> = "T4_port39390_facts"."id" AND "T2_port39390_facts"."id" = 
> "T4_port39390_facts"."id" AND "T2_port39390_facts"."relation" IS NOT NULL AND 
> "T3_port39390_facts"."arg1" IS NOT NULL AND "T3_port39390_facts"."id" = 
> "T4_port39390_facts"."id" AND "T4_port39390_facts"."arg2" IS NOT NULL)
> 
> So the second query joins four copies of the table: for three of them it does 
> a full scan on one column each (arg1, arg2, relation), and for the fourth a 
> full scan on all columns. Is it possible to prevent this behaviour, and what 
> is the difference between evaluating the query from Jena and from the command 
> line using d2r-query?
> 
> 
> 
> 2013/4/2 Anastasiya Goncharova <slmn...@gmail.com>
>> Hello everyone, 
>> 
>> I have a large dataset that contains about 650 million rows. I am trying to 
>> evaluate a query that returns about 8000 rows. When I run this query from the 
>> command line using d2r-query, the result is returned fast enough. 
>> But when I evaluate the same query from Jena using 
>> 
>> ResultSet rs = QueryExecutionFactory.create(query, RDFModel).execSelect();
>> 
>> it takes too long. I waited for several hours and then terminated the 
>> application before the evaluation finished. Why does this happen, and 
>> how can I improve the runtime?
>> 
>> Best,
>> 
>> Anastasiya
> 
> ------------------------------------------------------------------------------
> Minimize network downtime and maximize team effectiveness.
> Reduce network management and security costs.Learn how to hire 
> the most talented Cisco Certified professionals. Visit the 
> Employer Resources Portal
> http://www.cisco.com/web/learning/employer_resources/index.html
> _______________________________________________
> d2rq-map-devel mailing list
> d2rq-map-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/d2rq-map-devel

