Hi all,

Slightly off-topic, but given the ongoing ESWC 2023 conference, I want to share two papers that some of you might find interesting:

1. Join Ordering of SPARQL Property Path Queries

SPARQL property path queries provide a succinct way to write complex navigational queries over RDF knowledge graphs. However, their evaluation remains difficult as they may involve the execution of transitive closures. As a result, many property path queries just timeout when executed on public online RDF knowledge graphs. One solution to speed up their execution is to find optimal join orders. Although the join ordering problem has been extensively studied for traditional SPARQL queries, the presence of property path patterns biases existing approaches. In this paper we focus on C2RPQ_UF queries (conjunctive SPARQL property path queries with UNION and FILTER), and we present a query optimizer that is able to capture the cost of C2RPQ_UF queries using an appropriate cost model and a sampling-based cardinality estimator. On the latest Wikidata Query Benchmark, we empirically demonstrate that our approach finds significantly better join orders than Virtuoso and BlazeGraph.

Paper: https://2023.eswc-conferences.org/wp-content/uploads/2023/05/paper_Aimonier-Davat_2023_Join.pdf

Not directly related to Jena, but interesting anyway.


2. Evaluation of a Representative Selection of SPARQL Query Engines using Wikidata

In this paper, we present an evaluation of the performance of five representative RDF triplestores, including GraphDB, Jena Fuseki, Neptune, RDFox, and Stardog, and one experimental SPARQL query engine, QLever. We compare importing time, loading time, and exporting time using a complete version of the knowledge graph Wikidata, and we also evaluate query performances using 328 queries defined by Wikidata users. To put this evaluation into context with respect to previous evaluations, we also analyze the query performances of these systems using a prominent synthetic benchmark: SP2Bench. We observed that most of the systems we considered for the evaluation were able to complete the execution of almost all the queries defined by Wikidata users before the timeout we established. We noticed, however, that the time needed by most systems to import and export Wikidata might be longer than required in some industrial and academic projects, where information is represented, enriched, and stored using different representation means.

Paper: https://2023.eswc-conferences.org/wp-content/uploads/2023/05/paper_Lam_2023_Evaluation.pdf

The second paper used Jena TDB2 (v4.4.0) for the benchmark.


Cheers,

Lorenz

