Hi Karima,

2014-11-04 21:47 GMT+01:00 Karima Rafes <[email protected]>:

> > I have fixed many of the
> > conformity issues for the development snapshot of Marmotta, maybe you can
> > update to the latest version and run the tests again?
>
> Each night, the system runs the tests with your last version in github.
>

Great, I already saw another improvement ;-)

> > Additionally, I have some more comments:
> > - some of your queries are actually illegal, particularly those querying
> > named graphs where the named graph is given as a file name instead of a
> > proper URI (e.g. exists02)
>
> Yes, it's a problem also in the other databases like 4Store. I cannot
> change the tests without consequences. I already tried...
> But you can propose to change these tests in the mailing-list (
> [email protected] ) and if nobody is against, I will fix these
> tests.
>

Of course you can change the tests without consequences, as long as you document the change and say why. If you want to do a realistic scoring for triple stores, you indeed have to, otherwise the scoring won't be accepted by those implementing the triple stores. Collect all these issues and then report them to the W3C WG (with arguments), because here the tests are clearly in violation of the specs.

I have found another such case:

http://sparqlscore.com/ajax_testResultNode.php?graph=https%3A%2F%2Fci.inria.fr%2Fgo3%2Fjob%2FMarmotta_Kiwistore_Dev%2F14%2F&node=http%3A%2F%2Fwww.w3.org%2F2009%2Fsparql%2Fdocs%2Ftests%2Fdata-sparql11%2Ffunctions%2Fmanifest%23bnode01%2FResponse

Here, the expected result for BNODE(?b2) is to create a new blank node with the value bound to b2, while the result you use for comparison creates a blank node with identifier "b2". This is clearly an error with respect to the SPARQL spec.

> > - it would be nice to also add performance metrics so we can see how fast
> > the query evaluation in the different triple stores actually is
>
> yes I can print the time for each test but often, it's not very clear
> for the novice (example virtuoso is more long the first time and quick
> after...)
> we think to build a real benchmark about the performance metrics with
> the same hardware.
> If you have an idea of good benchmark, it's the moment to send me it.
>

My suggestion is to load, in addition to your tests, a rather large set of triples into the triple store and THEN run the queries. Run each query at least 10 times, and then compute the average. It would also be best to randomize the evaluation so that a query is not called 10 times in sequence, but with other queries in between (otherwise you can easily cheat with a cache).

> > - some of the queries use quite sophisticated features (e.g. OWL-DL
> > entailment); maybe it makes sense to score in a way that users can see how
> > the triple stores perform for basic SPARQL queries and how they perform for
> > more sophisticated features?
>
> There is a problem because the possibility to switch between the
> features is not in the SPARQL protocol. If I want to do it, I have to
> develop a specific code for each database or I have to install many
> instances of each database with different options.
> May be in Sparql 1.2... it will be a good idea to add an option to
> switch between the features.
>

I was not referring to switching features, just to grouping them in your presentation, as you already do for some kinds of tests. You could e.g. add groups like "Property Paths", "OPTIONAL", ... Additionally, it would be cool to see WHY a test failed, because often the cause is just a minor issue like the wrong datatype.

> > another thing I noticed is with respect to aggregation functions.
> > Marmotta fails on many of them because it tries to be more precise here,
> > e.g.:
> > - for AVG Marmotta always returns xsd:double, because this is the
> > precision it computes; your tests expect xsd:decimal here
> > - for COUNT, FLOOR, CEIL, Marmotta always returns xsd:integer, because
> > these functions are actually supposed to return xsd:integer; your tests
> > expect the original datatype here (e.g.
> > xsd:decimal) which for me does not really make sense
> > As far as I can see the SPARQL specification does not provide any details
> > here, do you have any further reference why the testsuite expects
> > different datatypes here?
>
> Sometimes there is in the manifest.ttl the link to the test's reference.
> Remark : we can read in the doc of tests : "the current tests ... may
> be incomplete with respect to the current state of the standards."
> http://www.w3.org/2009/sparql/docs/tests/
> You can propose to change these tests in the mailing-list (
> [email protected] ).
> I will have some problems if I change the tests without consensus... I
> am not legitimate.
>

Of course it is legitimate, and I can promise you there won't be any problems. Just collect and document all such issues that you encounter with several triple stores and send them a collection. The W3C working group is not all-knowing and can also produce mistakes ;-)

> Greetings,
>
> Karima
>

Greetings,

Sebastian
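
P.S. A minimal sketch of the benchmarking procedure suggested above: repeat each query several times, shuffle the runs so that repetitions of one query are spread among the others, and average the timings per query. The names `benchmark`, `queries` and `run_query` are hypothetical; `run_query` stands for whatever function sends a SPARQL query to the store under test, and loading the large extra dataset is assumed to have happened beforehand. Note that a plain shuffle only makes back-to-back repeats unlikely, it does not rule them out.

```python
import random
import statistics
import time
from collections import defaultdict


def benchmark(queries, run_query, repetitions=10, seed=None):
    """Return the mean wall-clock time per query name.

    queries     -- dict mapping a test name to its SPARQL query string
    run_query   -- callable that executes one query string (assumption:
                   it blocks until the store has returned the full result)
    repetitions -- how often each query is executed
    seed        -- optional seed to make the run order reproducible
    """
    rng = random.Random(seed)

    # One schedule entry per (query, repetition), then shuffle so that
    # runs of the same query are interleaved with other queries.
    schedule = [name for name in queries for _ in range(repetitions)]
    rng.shuffle(schedule)

    timings = defaultdict(list)
    for name in schedule:
        start = time.perf_counter()
        run_query(queries[name])
        timings[name].append(time.perf_counter() - start)

    # Average the measured times per query.
    return {name: statistics.mean(times) for name, times in timings.items()}
```

With a real store, `run_query` could wrap an HTTP request to the SPARQL endpoint; reporting the median or the spread next to the mean would also show warm-up effects such as the slow first run mentioned for Virtuoso.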
