Re: [Virtuoso-users] Query performance with R2RML remote RDBMS SQL Server

2016-11-07 Thread Kingsley Idehen
On 11/4/16 8:56 AM, CHARBEL EL KAED wrote:
>
> Thank you Kingsley,
>
>  
>
> Is there an approximate date for the patch?
>
We can get it to you in about a week or less. Your case will be updated
accordingly.

Kingsley
>
> Thank you,
>
> Best/Cordialement,
>
> Charbel El Kaed, PhD
> Business Architect
> Digital Services Platform
> Global Solutions
> Schneider Electric
>

Re: [Virtuoso-users] Query performance with R2RML remote RDBMS SQL Server

2016-11-07 Thread CHARBEL EL KAED
Thank you Kingsley,

Is there an approximate date for the patch?
Thank you,

Best/Cordialement,
Charbel El Kaed, PhD
Business Architect
Digital Services Platform
Global Solutions
Schneider Electric

From: Kingsley Idehen [mailto:kide...@openlinksw.com]
Sent: Thursday, November 03, 2016 9:30 PM
To: virtuoso-users@lists.sourceforge.net
Subject: Re: [Virtuoso-users] Query performance with R2RML remote RDBMS SQL 
Server

On 10/14/16 12:41 PM, CHARBEL EL KAED wrote:
Hello,

I would like to have your opinion on the following:

I installed Virtuoso Enterprise on a Windows VM with:
8 cores
28 GB RAM
56 GB SSD

I attached the Virtuoso configuration file.

On Virtuoso I have a local RDF table with 8221 triples and a remote MS SQL
Azure database with 50 million records. Each record is simply an id, param1,
and param2, with (id, param1) as a composite primary key.
I have the following query:

Select ?id ?param1 ?param2 {
   Select ?id ?param1 ?param2 FROM <http://localhost:8890/sqldb50m#>
   Where {
      [] <http://localhost:8890/schemas/sqldb50m/id> ?id;
         <http://localhost:8890/schemas/sqldb50m/param1> ?param1;
         <http://localhost:8890/schemas/sqldb50m/param2> ?param2.
      filter (?param1 < 5)
      {
         SELECT DISTINCT (strafter(str(?id1), "#") AS ?trimId) from <http://localhost:8890/BOC#>
         WHERE
         {
            ?server qt:hasId ?id1. # returns 42 ids
         }
      }
      filter( ?trimId = ?id)
   }
};

The query is expected to return 1 million records.
I started the query yesterday through iSQL, and it is still running more than
12 hours later. On the Windows VM, Task Manager shows Virtuoso using 10 GB of
RAM and less than 5% of CPU.
On the MS SQL monitoring tool, the load is constant at only 10% of the CPU and
10% of the DTU.

Any recommendations to improve the performance?

Thank you


There should be an update in your support case about this matter.
Fundamentally, there was a SPARQL optimizer bug that prevented the shared
variable effect from kicking in when executing the remote part of the query.
Here are examples to explain the gist of the matter:



## Problematic due to missing {} around dataset modifier fragments that
## contain shared variables between local and remote data sources

SELECT ?s ?p (sql:BEST_LANGMATCH (?o, "ru, en-gb;q=0.8, en;q=0.7, *;q=0.1", "")) as ?o_filtered
WHERE {
    ?s a foaf:Person .
    # Virtuoso Extension for setting Named Graph scope on a Remote SPARQL service
    SERVICE <http://dbpedia.org/sparql> from <http://dbpedia.org>
      {
         ?s ?p ?o .
         FILTER (?p != <http://dbpedia.org/property/abstract>)
      }

    OPTIONAL { ?p rdfs:label ?lbl }
  }
ORDER BY ASC (COUNT (?o))


## Revised query using {} to group dataset modifier fragments to set scope for
## shared variable identifiers

SELECT ?s ?p (sql:BEST_LANGMATCH (?o, "ru, en-gb;q=0.8, en;q=0.7, *;q=0.1", "")) as ?o_filtered
WHERE {
    {  ## shared variable block start ##
        ?s a foaf:Person .
        # Virtuoso Extension for setting Named Graph scope on a Remote SPARQL service
        SERVICE <http://dbpedia.org/sparql> from <http://dbpedia.org>
          {
             ?s ?p ?o .
             FILTER (?p != <http://dbpedia.org/property/abstract>)
          }
    }  ## shared variable block end ##
    OPTIONAL { ?p rdfs:label ?lbl }
  }
ORDER BY ASC (COUNT (?o))


## Another example that failed where I am passing values using BIND. This 
failed and will work properly when the fix is in ##

PREFIX

Re: [Virtuoso-users] Query performance with R2RML remote RDBMS SQL Server

2016-11-03 Thread Kingsley Idehen
On 10/14/16 12:41 PM, CHARBEL EL KAED wrote:
> Hello,
>
> I would like to have your opinion on the following:
>
> I installed Virtuoso Enterprise on a Windows VM with:
> 8 cores
> 28 GB RAM
> 56 GB SSD
>
> I attached the Virtuoso configuration file.
>
> On Virtuoso I have a local RDF table with 8221 triples and a remote MS
> SQL Azure database with 50 million records. Each record is simply an id,
> param1, and param2, with (id, param1) as a composite primary key.
>
> I have the following query:
>
> Select ?id ?param1 ?param2 {
>    Select ?id ?param1 ?param2 FROM <http://localhost:8890/sqldb50m#>
>    Where {
>       [] <http://localhost:8890/schemas/sqldb50m/id> ?id;
>          <http://localhost:8890/schemas/sqldb50m/param1> ?param1;
>          <http://localhost:8890/schemas/sqldb50m/param2> ?param2.
>       filter (?param1 < 5)
>       {
>          SELECT DISTINCT (strafter(str(?id1), "#") AS ?trimId) from <http://localhost:8890/BOC#>
>          WHERE
>          {
>             ?server qt:hasId ?id1. # returns 42 ids
>          }
>       }
>       filter( ?trimId = ?id)
>    }
> };
>
> The query is expected to return 1 million records.
>
> I started the query yesterday through iSQL, and it is still running more
> than 12 hours later. On the Windows VM, Task Manager shows Virtuoso using
> 10 GB of RAM and less than 5% of CPU.
>
> On the MS SQL monitoring tool, the load is constant at only 10% of the
> CPU and 10% of the DTU.
>
> Any recommendations to improve the performance?
>
> Thank you
>

There should be an update in your support case about this matter.
Fundamentally, there was a SPARQL optimizer bug that prevented the shared
variable effect from kicking in when executing the remote part of the
query. Here are examples to explain the gist of the matter:


## Problematic due to missing {} around dataset modifier fragments that
## contain shared variables between local and remote data sources

SELECT ?s ?p (sql:BEST_LANGMATCH (?o, "ru, en-gb;q=0.8, en;q=0.7, *;q=0.1", "")) as ?o_filtered
WHERE {
    ?s a foaf:Person .
    # Virtuoso Extension for setting Named Graph scope on a Remote SPARQL service
    SERVICE <http://dbpedia.org/sparql> from <http://dbpedia.org>
      {
         ?s ?p ?o .
         FILTER (?p != <http://dbpedia.org/property/abstract>)
      }

    OPTIONAL { ?p rdfs:label ?lbl }
  }
ORDER BY ASC (COUNT (?o))


## Revised query using {} to group dataset modifier fragments to set
## scope for shared variable identifiers

SELECT ?s ?p (sql:BEST_LANGMATCH (?o, "ru, en-gb;q=0.8, en;q=0.7, *;q=0.1", "")) as ?o_filtered
WHERE {
    {  ## shared variable block start ##
        ?s a foaf:Person .
        # Virtuoso Extension for setting Named Graph scope on a Remote SPARQL service
        SERVICE <http://dbpedia.org/sparql> from <http://dbpedia.org>
          {
             ?s ?p ?o .
             FILTER (?p != <http://dbpedia.org/property/abstract>)
          }
    }  ## shared variable block end ##
    OPTIONAL { ?p rdfs:label ?lbl }
  }
ORDER BY ASC (COUNT (?o))


## Another example that failed where I am passing values using BIND.
This failed and will work properly when the fix is in ##

PREFIX csv:


SELECT ?w ?p ?o2
FROM NAMED


WHERE {
GRAPH

  {?s csv:wikidata ?o . }
BIND (IRI(?o) AS ?w)
SERVICE  { SELECT * WHERE {?w ?p ?o2.
} LIMIT 50 }
 
}
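
The shared variable effect discussed in this thread can be pictured with a small toy model. The sketch below is only an illustration in Python, not Virtuoso's optimizer: the table generator, the function names, and the row counts (1000 ids, 10 param1 values each) are all assumptions made up for the example. It compares fetching every remote row that passes the param1 filter and joining afterwards against sending the 42 local ids along with the remote request.

```python
# Toy model only: this is NOT how Virtuoso is implemented. It just shows why
# passing local bindings into the remote part of a query reduces the work.

def remote_rows(n_ids=1000, params_per_id=10):
    """Hypothetical remote table yielding (id, param1, param2) rows."""
    for i in range(n_ids):
        for p in range(params_per_id):
            yield (str(i), p, i * p)

def join_without_bindings(local_ids):
    """Fetch every remote row with param1 < 5, then join on id locally."""
    fetched = [row for row in remote_rows() if row[1] < 5]
    wanted = set(local_ids)
    result = [row for row in fetched if row[0] in wanted]
    return result, len(fetched)

def join_with_bindings(local_ids):
    """Send the local ids along so the remote side filters on them too."""
    wanted = set(local_ids)
    fetched = [row for row in remote_rows() if row[1] < 5 and row[0] in wanted]
    return fetched, len(fetched)

local_ids = [str(i) for i in range(42)]  # 42 ids, as in the thread
slow_result, slow_transferred = join_without_bindings(local_ids)
fast_result, fast_transferred = join_with_bindings(local_ids)

# Same answer either way, but far fewer rows cross the wire with bindings.
print(slow_transferred, fast_transferred)  # prints "5000 210"
```

Both strategies return the same rows; the only difference in this model is how many rows have to be transferred before the join, which is the kind of gap that grows with a 50 million row remote table.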

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)


[Virtuoso-users] Query performance with R2RML remote RDBMS SQL Server

2016-10-14 Thread CHARBEL EL KAED
Hello,

I would like to have your opinion on the following:

I installed Virtuoso Enterprise on a Windows VM with:
8 cores
28 GB RAM
56 GB SSD

I attached the Virtuoso configuration file.

On Virtuoso I have a local RDF table with 8221 triples and a remote MS SQL
Azure database with 50 million records. Each record is simply an id, param1,
and param2, with (id, param1) as a composite primary key.
I have the following query:

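Select ?id ?param1 ?param2 {
   Select ?id ?param1 ?param2 FROM <http://localhost:8890/sqldb50m#>
   Where {
      [] <http://localhost:8890/schemas/sqldb50m/id> ?id;
         <http://localhost:8890/schemas/sqldb50m/param1> ?param1;
         <http://localhost:8890/schemas/sqldb50m/param2> ?param2.
      filter (?param1 < 5)
      {
         SELECT DISTINCT (strafter(str(?id1), "#") AS ?trimId) from <http://localhost:8890/BOC#>
         WHERE
         {
            ?server qt:hasId ?id1. # returns 42 ids
         }
      }
      filter( ?trimId = ?id)
   }
};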
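
For readers unfamiliar with the inner subquery, here is a minimal Python sketch of what strafter(str(?id1), "#") combined with DISTINCT produces. The IRIs in the list are hypothetical stand-ins for the ?id1 values in the BOC graph, and the helper only mimics the SPARQL 1.1 STRAFTER function for a non-empty separator.

```python
def strafter(text: str, separator: str) -> str:
    """Mimic SPARQL 1.1 STRAFTER for a non-empty separator: return the part
    of text after the first occurrence of separator, or "" if it is absent."""
    _, _, tail = text.partition(separator)
    return tail

# Hypothetical IRIs standing in for the ?id1 values returned by the subquery.
local_iris = [
    "http://localhost:8890/BOC#17",
    "http://localhost:8890/BOC#42",
    "http://localhost:8890/BOC#42",  # duplicate, collapsed like DISTINCT
]

# The set comprehension plays the role of SELECT DISTINCT.
trim_ids = sorted({strafter(iri, "#") for iri in local_iris})
print(trim_ids)  # prints "['17', '42']"
```

The trimmed values are plain strings, so the later ?trimId = ?id comparison presumably relies on ?id being exposed as a string as well; that is an assumption about the mapping, not something stated in the message.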

The query is expected to return 1 million records.
I started the query yesterday through iSQL, and it is still running more than
12 hours later. On the Windows VM, Task Manager shows Virtuoso using 10 GB of
RAM and less than 5% of CPU.
On the MS SQL monitoring tool, the load is constant at only 10% of the CPU and
10% of the DTU.

Any recommendations to improve the performance?

Thank you




virtuoso.ini
Description: virtuoso.ini