On Jul 7, 2014, at 11:11 PM, Arsham Mesbah <ars...@uga.edu> wrote:

> Hi Ted, 
> 
> Thanks for your kind reply. 
> 
> I am running Virtuoso Open Source Edition on all instances. 
> The instance that holds the entire data runs  
>  Virtuoso Version 7.0.0.3203-pthreads as of Sep 30 2013
>  
> The servers involved in the federated query run the following versions: 
> Version 7.1.1-dev.3209-pthreads    ( 1 instance)    
> Version 7.1.1-dev.3208-pthreads   (3 instances)

OK.  

My gut sense is that there's an issue in the SPARQL processor 
in at least one of these builds.  I can't say immediately 
which one is exhibiting the issue -- because I don't know 
which of the result sets you're getting is correct, though
I am guessing it's the one from the unfederated query.

Toward figuring out which engine version has the issue, 
I would suggest new test runs of unfederated queries against 
the existing 7.0.0.3203 and a new 7.1.1-dev.3209 (and maybe 
also a new 7.1.1-dev.3208) loaded with the full data set.

(Note that you will probably need to change out only the main 
Virtuoso binary for these tests, with a clean checkpoint and
shutdown before each executable swap.  Switching between these 
versions, there should be no DB compatibility issues.  Still, 
it's worth taking a backup copy of each DB, just to ease any 
reversion/recovery.)

It may also be worth trying the federated query against a set
of four instances all running with 7.1.1-dev.3208 (as you only
need to switch one), and/or four instances of 7.1.1-dev.3209 --
and even four instances of 7.0.0.3203.

Other possibly useful tests include runs of each subquery 
from the federated query (or at least COUNTs of these result
sets) directly on each endpoint -- that is, on the endpoint 
with the full data set and on the endpoint with the relevant 
sub-set.
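
A minimal sketch of those per-endpoint COUNT checks, assuming Python 3
and that each /sparql endpoint accepts standard SPARQL Protocol GET
requests and can return SPARQL JSON results. The names SUBQUERIES,
count_query, request_url, and fetch_count are hypothetical helpers
written for this illustration; the triple patterns are copied from the
federated query quoted below.

```python
import json
import urllib.parse
import urllib.request

PREFIXES = (
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
    "PREFIX ub: <http://cs.uga.edu#>\n"
)

# Triple patterns copied from the three parts of the federated query.
# The keys are illustrative labels, not anything Virtuoso defines.
SUBQUERIES = {
    "service_1": [
        "?X rdf:type ub:GraduateStudent .",
        "?Y rdf:type ub:University .",
        "?Z rdf:type ub:Department .",
        "?X ub:undergraduateDegreeFrom ?Y .",
    ],
    "service_2": ["?X ub:memberOf ?Z ."],
    "local": ["?Z ub:subOrganizationOf ?Y ."],
}


def count_query(patterns):
    """Wrap a list of triple patterns in a SELECT COUNT(*) query."""
    body = "\n".join("    " + p for p in patterns)
    return PREFIXES + "SELECT (COUNT(*) AS ?n)\nWHERE\n{\n" + body + "\n}"


def request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL for the given endpoint and query."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def fetch_count(endpoint, query, timeout=60):
    """Run the query and return ?n as an int (needs a reachable endpoint)."""
    req = urllib.request.Request(
        request_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        results = json.load(resp)
    return int(results["results"]["bindings"][0]["n"]["value"])
```

For example, fetch_count("http://IPADD/sparql",
count_query(SUBQUERIES["service_2"])) could be run once against the
endpoint with the full data set and once against the endpoint holding
the relevant sub-set; if the partitioning kept every matching triple,
the two numbers should agree. Note that the first subquery leaves ?Z
unjoined to ?X and ?Y, so its count may be large.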

On general principles, I'd also be interested in hearing what
happens when all instances are running the latest version --
possibly built after a source code refresh from GitHub.

Toward our ability to reproduce and test locally -- it would
be helpful if you could provide details of how you partitioned
the data, and any other test prep.  

The more we know, the better and faster we can move forward to 
full resolution of this issue.

Regards,

Ted


> All servers are set up identically, and there is no difference in their INI 
> files or hosting environments. 
> The query is exactly the same. It involves 3 instances: the instance the 
> query is run from (the local endpoint) and 2 other endpoints queried through 
> the SERVICE clauses. For security reasons I didn't want to expose IP 
> addresses, so I just replaced them with "IPADD". However, I see now how that 
> might be misleading. The first and second SERVICE clauses in the query 
> target different endpoints. I have tested each endpoint separately to make 
> sure they are functional before running the federated query.
> 
> I hope this answers all the questions so far. 
> 
> 
> Thanks again for the help. 
> 
> Kind Regards,
> 
> 
> 
> 
> On Mon, Jul 7, 2014 at 9:42 PM, Ted Thibodeau Jr <tthibod...@openlinksw.com> 
> wrote:
> Hi, Arsham --
> 
> On Jul 1, 2014, at 02:53 PM, Arsham Mesbah <ars...@uga.edu> wrote:
> 
> > I have synthetic data generated using the Lehigh University
> > Benchmark (LUBM) data generator (version 1.7), which
> > generates about 402 .owl files including the schema.
> > I sorted and distributed the triples based on the
> > predicate to 4 different servers. All of these servers
> > run Virtuoso.
> 
> First question, key to everything that follows...
> 
> Are you running Virtuoso Commercial Edition, or Open Source?
> 
> Please check and provide the complete version string(s) for all
> instances (no need to repeat when it's identical) -- that is,
> the first paragraph of output from the relevant command --
> 
>    Commercial Edition     virtuoso-t -?
>    Open Source Edition    virtuoso-iodbc-t -?
> 
> Are all servers involved in the federated query running
> the same version?
> 
> What about the one that holds the complete data set?
> 
> Are they all configured identically?  Presuming not, what are
> the differences, in INI file and/or hosting environments?
> 
> 
> 
> > I also have another server that holds the entire data set.
> > When I run the following query on the server that holds
> > the entire data, 59 triples are returned. This query is
> > query number 2 of the test queries provided with the LU
> > benchmark.
> >
> >
> > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> > PREFIX ub: <http://cs.uga.edu#>
> > SELECT ?X, ?Y, ?Z
> > FROM <http://www.cs.uga.edu#>
> > WHERE
> > {
> >     ?X rdf:type ub:GraduateStudent .
> >     ?Y rdf:type ub:University .
> >     ?Z rdf:type ub:Department .
> >     ?X ub:memberOf ?Z .
> >     ?Z ub:subOrganizationOf ?Y .
> >     ?X ub:undergraduateDegreeFrom ?Y
> > }
> >
> > I wrote a federated version of this query based on my
> > triple distribution and ran it on the distributed version.
> > However, only 1 triple is returned. I grabbed a few of the
> > triples returned from the single-server version and
> > checked whether the data needed for these triples to be
> > returned exists on the distributed version of the data. It
> > does, but it is not returned as a result. The federated
> > query is the following:
> >
> >
> > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> > PREFIX ub: <http://cs.uga.edu#>
> > SELECT *
> > WHERE
> > {
> >     SERVICE <http://IPADD/sparql> {
> >         ?X rdf:type ub:GraduateStudent .
> >         ?Y rdf:type ub:University .
> >         ?Z rdf:type ub:Department .
> >         ?X ub:undergraduateDegreeFrom ?Y
> >     }
> >     SERVICE <http://IPADD/sparql> {
> >         ?X ub:memberOf ?Z .
> >     }
> >     ?Z ub:subOrganizationOf ?Y .
> > }
> 
> Is this exactly the query you've been running?
> 
> As I read it, this query will involve 2 Virtuoso instances,
> in three phases.  Both remote subqueries go to the same
> endpoint -- <http://IPADD/sparql> -- and the third is on
> the local endpoint.
> 
> Given your description of your test, I think you *probably*
> meant for the remote portions to target *different* service
> endpoints, even if no other change is needed.
> 
> 
> 
> > Also, if I move the last triple pattern of the federated
> > query (?Z ub:subOrganizationOf ?Y .) up, I get the
> > following error:
> >
> >     Virtuoso RDFZZ Error DB.DBA.SPARQL_REXEC
> >     ('IPADD/sparql', ...) has received result with unexpected
> >     variable name 'stubvar15'
> >
> > My understanding of a federated query is that it gets all
> > the triples returned by the triple patterns and by the
> > SERVICE clauses in the query and joins them if possible.
> > Moving a triple pattern up or down should therefore not
> > affect the result of the query. Am I missing anything here?
> 
> I think we should come back to this after the points above are settled.
> 
> Regards,
> 
> Ted
> 
> 
> > --
> > Kind Regards,
> > ~A
> 
> 
> 
> 
> --
> A: Yes.                      http://www.guckes.net/faq/attribution.html
> | Q: Are you sure?
> | | A: Because it reverses the logical flow of conversation.
> | | | Q: Why is top posting frowned upon?
> 
> Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
> Senior Support & Evangelism  //        mailto:tthibod...@openlinksw.com
>                              //              http://twitter.com/TallTed
> OpenLink Software, Inc.      //              http://www.openlinksw.com/
>          10 Burlington Mall Road, Suite 265, Burlington MA 01803
>      Weblog   -- http://www.openlinksw.com/blogs/
>      LinkedIn -- http://www.linkedin.com/company/openlink-software/
>      Twitter  -- http://twitter.com/OpenLink
>      Google+  -- http://plus.google.com/100570109519069333827/
>      Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
> 
> -- 
> ~A
> 
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
