Re: [Virtuoso-users] Literals, STR() in FILTER and query plan

Hugh Williams Sun, 09 Mar 2014 14:58:15 -0700

Hi Nicola,

You observation does seem odd, as the first time a query is run and hence has 
to be compiled you would expect it to take time compile in addiiton to be 
executed, thus subsequent times it would typically take less time and the 
compilation does not have to be performed again.  This is the behaviour I see 
in the simple test case below:


SQL> SPARQL INSERT INTO <test> {<1> <http://example.com#relation> "Some text" . 
<1> <http://example.com#relation> "Some other text" };

Done. -- 4 msec.
SQL> SPARQL SELECT * WHERE {  ?id <http://example.com#relation> "Some text" .  
?id <http://example.com#relation> "Some other text" . };
id
VARCHAR
_______________________________________________________________________________

1

1 Rows. -- 9 msec.
SQL> SPARQL SELECT * WHERE {  ?id <http://example.com#relation> "Some text" .  
?id <http://example.com#relation> "Some other text" . };
id
VARCHAR
_______________________________________________________________________________

1

1 Rows. -- 1 msec.
SQL> SPARQL SELECT * WHERE {  ?id <http://example.com#relation> "Some text" .  
?id <http://example.com#relation> "Some other text" . };
id
VARCHAR
_______________________________________________________________________________

1

1 Rows. -- 1 msec.
SQL> SPARQL SELECT * WHERE {  ?id <http://example.com#relation> "Some text" .  
?id <http://example.com#relation> ?o . FILTER(STR(?o) = "Some other text") };
id                                                                              
  o
VARCHAR                                                                         
  VARCHAR
_______________________________________________________________________________

1                                                                               
  Some other text

1 Rows. -- 1 msec.
SQL> SPARQL SELECT * WHERE {  ?id <http://example.com#relation> "Some text" .  
?id <http://example.com#relation> ?o . FILTER(STR(?o) = "Some other text") };
id                                                                              
  o
VARCHAR                                                                         
  VARCHAR
_______________________________________________________________________________

1                                                                               
  Some other text

1 Rows. -- 0 msec.
SQL> 

In fact even in the sample profile output you provided you can see there 
compilation time for the first query is 13msec vs 1msec for the subsequent run, 
thus it would be interesting to see the complete compilation times of each 
query run which would also show the execution time ?

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.      //              http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

On 6 Mar 2014, at 11:04, Nicola Vitucci <nicola.vitu...@gmail.com> wrote:

> Sorry, I mentioned the wrong build. The correct build is the one having
> SHA = b68d9d9c93c1dfe38f7afa23998b894772c195c0 (26/02/2014).
> 
> Nicola
> 
> Il 06/03/2014 10:21, Nicola Vitucci ha scritto:
>> Hi all,
>> 
>> I am using Virtuoso build having SHA =
>> b86c2a70106138325ed53c60690d7c8f79bc5446 and I found a weird behaviour
>> in the queries making use of literals. Basically, the problem is that
>> the query plan changes with no apparent reason depending on whether the
>> same query is run immediately after starting the server or some time
>> (and queries) later, and such change causes a difference in the
>> execution time (the first time the query is faster).
>> 
>> The query has the following pattern:
>> 
>> SELECT *
>> WHERE {
>>  ?id <http://example.com#relation> "Some text" .
>>  ?id <http://example.com#relation> "Some other text" .
>> }
>> 
>> You can find the two different query plans in the bottom, the former
>> being the "fast" one and the latter being the "slow" one. What's even
>> more strange is that, if the literals are moved in a FILTER clause and
>> "cast" using the STR() function, the query is always executed in "fast
>> mode". In other words, if the query is written in this way, the plan
>> does not change (last query plan in the bottom):
>> 
>> SELECT *
>> WHERE {
>>  ?id <http://example.com#relation> "Some text" .
>>  ?id <http://example.com#relation> ?o . FILTER(STR(?o) = "Some other text")
>> }
>> 
>> It looks like the main difference, besides the FILTER addition (the
>> "After test" block), is the presence of the "inlined" keyword and the
>> use of the RDF_QUAD index rather than the POGS index. Can you explain
>> what's happening with the indices and what does the "inlined" keyword mean?
>> 
>> Thank you,
>> 
>> Nicola
>> 
>> -------------------------------------
>> 
>> First ("fast") query plan:
>> 
>> {
>> time     0.038% fanout         1 input         1 rows
>> time         1% fanout       902 input         1 rows
>> 
>> Precode:
>>      0: __rdflit := Call __rdflit (rdflit6325)
>>      5: __rdflit := Call __rdflit (rdflit1206)
>>      10: BReturn 0
>> RDF_QUAD_POGS     9e+02 rows(s_9_4_t0.S)
>> P =  ##relation  ,  O = rdflit6325
>> time        99% fanout 0.00776053 input       902 rows
>> RDF_QUAD      0.01 rows()
>> inlined  P =  ##relation  ,  S = s_9_4_t0.S ,  O = rdflit1206
>> 
>> After code:
>>      0: id := Call __id2in (s_9_4_t0.S)
>>      5: BReturn 0
>> time     0.016% fanout         0 input         7 rows
>> Select (id)
>> }
>> 
>> 8 msec 40% cpu,       917 rnd      1810 seq         0% same seg
>> 80.8279% same pg
>> 513 disk reads, 508 read ahead,    29.745% wait
>> Compilation: 13 msec 702 reads   25.6666% read 0 messages         0% clw
>> 
>> ------------
>> 
>> Second ("slow") query plan:
>> 
>> {
>> time   4.6e-05% fanout         1 input         1 rows
>> time     0.001% fanout       902 input         1 rows
>> 
>> Precode:
>>      0: __rdflit := Call __rdflit (rdflit6325)
>>      5: __rdflit := Call __rdflit (rdflit1206)
>>      10: BReturn 0
>> RDF_QUAD_POGS     0.065 rows(s_9_4_t0.S)
>> P =  ##relation  ,  O = rdflit6325
>> time     1e+02% fanout 0.00776053 input       902 rows
>> RDF_QUAD_POGS   4.6e-14 rows()
>> P =  ##relation  ,  O = rdflit1206 S = s_9_4_t0.S
>> 
>> After code:
>>      0: id := Call __id2in (s_9_4_t0.S)
>>      5: BReturn 0
>> time     2e-05% fanout         0 input         7 rows
>> Select (id)
>> }
>> 
>> 5995 msec 99% cpu,       903 rnd 1.37377e+08 seq         0% same seg
>>      0% same pg
>> Compilation: 1 msec 0 reads         0% read 0 messages         0% clw
>> 
>> -------------
>> 
>> Plan with FILTER and STR():
>> 
>> {
>> time      0.12% fanout         1 input         1 rows
>> time       3.1% fanout       902 input         1 rows
>> 
>> Precode:
>>      0: __rdflit := Call __rdflit (rdflit6325)
>>      5: BReturn 0
>> RDF_QUAD_POGS     0.065 rows(s_5_4_t0.S)
>> P =  ##relation  ,  O = rdflit6325
>> time        47% fanout   1.01663 input       902 rows
>> RDF_QUAD      0.46 rows(s_5_4_t1.O)
>> inlined  P =  ##relation  ,  S = s_5_4_t0.S
>> time        50% fanout 0.00763359 input       917 rows
>> END Node
>> After test:
>>      0: __rdf_strsqlval := Call __rdf_strsqlval (s_5_4_t1.O,  0 )
>>      5: if (<c Some other text> = __rdf_strsqlval) then 9 else 10 unkn 10
>>      9: BReturn 1
>>      10: BReturn 0
>> 
>> After code:
>>      0: id := Call __id2in (s_5_4_t0.S)
>>      5: o := Call __ro2sq (s_5_4_t1.O)
>>      10: BReturn 0
>> time    0.0084% fanout         0 input         7 rows
>> Select (id, o)
>> }
>> 
>> 2 msec 67% cpu,      1827 rnd      2461 seq         0% same seg
>> 40.5908% same pg
>> Compilation: 1 msec 0 reads         0% read 0 messages         0% clw
>> 
> 
> ------------------------------------------------------------------------------
> Subversion Kills Productivity. Get off Subversion & Make the Move to Perforce.
> With Perforce, you get hassle-free workflows. Merge that actually works. 
> Faster operations. Version large binaries.  Built-in WAN optimization and the
> freedom to use Git, Perforce or both. Make the move to Perforce.
> http://pubads.g.doubleclick.net/gampad/clk?id=122218951&iu=/4140/ostg.clktrk
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users

------------------------------------------------------------------------------
Subversion Kills Productivity. Get off Subversion & Make the Move to Perforce.
With Perforce, you get hassle-free workflows. Merge that actually works. 
Faster operations. Version large binaries.  Built-in WAN optimization and the
freedom to use Git, Perforce or both. Make the move to Perforce.
http://pubads.g.doubleclick.net/gampad/clk?id=122218951&iu=/4140/ostg.clktrk

_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users

Re: [Virtuoso-users] Literals, STR() in FILTER and query plan

Reply via email to