Author: sschaffert
Date: Fri Sep 19 10:50:11 2014
New Revision: 1626176
URL: http://svn.apache.org/r1626176
Log:
updated documentation for SPARQL (MARMOTTA-537)
Modified:
marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm
Modified: marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm
URL: http://svn.apache.org/viewvc/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm?rev=1626176&r1=1626175&r2=1626176&view=diff
==============================================================================
--- marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm (original)
+++ marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm Fri Sep 19 10:50:11 2014
@@ -7,39 +7,17 @@
The KiWi SPARQL module offers optimized [SPARQL 1.1][1] query support for
typical cases by translating parts of a SPARQL query directly into SQL.
-Currently, the following SPARQL constructs are translated:
+As of Marmotta 3.3, most SPARQL constructs are directly translated. Also, result
+iterators of an optimized query operate directly on database cursors, so they
+will be very efficient in case only a few results will be retrieved.
+
+Note that translating SPARQL into SQL is a challenging task, and probably the
+most complex part of code contained in Apache Marmotta. Even though the syntax
+seems similar, the semantics of both languages have small differences. In some
+border cases, we therefore deliberately deviate from the SPARQL standard. These
+are documented below. If you want full standard compliance, you can still use SPARQL
+without the native support, at the expense of query performance (slower by a factor of up to 1000).
-* **JOIN** between statement patterns: in case a part of a SPARQL query is a
- join between statement patterns, this part will be optimized by translating the
- whole JOIN into a single SQL query involving all patterns
-* **FILTER** for a statement pattern or join of statement patterns: in this
- case, the filter conditions are translated into SQL WHERE conditions on the
- nodes occurring in the patterns; most SPARQL constructs are supported (including
- regular expressions), including (starting with Marmotta 3.2) most XPath
- functions defined in the SPARQL specification
-* **full-text search** (Marmotta 3.2 and above): adds additional full-text
- search functions to SPARQL that can be used in the FILTER part of a query (see
- below)
-
-Also, result iterators of an optimized query operate directly on database
-cursors, so they will be very efficient in case only a few results will be
-retrieved.
-
-Note that KiWi SPARQL does not translate the complete query to SQL. Instead, it
-walks through the abstract syntax tree of a query and optimizes those parts
-where it can reliably do so and where it makes sense. This allows us to make
-efficient use of the performance of the underlying database while at the same
-time retaining the flexibility of full SPARQL 1.1. Specifically, the following
-popular constructs are currently *not* completely translated:
-
-* **OPTIONAL** (left join): the SPARQL OPTIONAL has a slighly different
- semantics than its SQL left join counterpart, so OPTIONAL can at the moment
- not be optimized (but those parts of a query that are normal joins will still
- be optimized)
-* **DISTINCT**, **ORDER BY**, **GROUP BY**: since KiWi SPARQL currently only
- optimizes the query part and not the projection, expressions operating on the
- query result like those mentioned will not be translated and instead evaluated
- in-memory;
[1]: http://www.w3.org/TR/sparql11-query/
@@ -58,17 +36,21 @@ dependency to your Maven project:
</dependency>
-Full-Text Search (3.2 and above)
---------------------------------
+Marmotta Extensions
+-------------------
Starting with the development version of Apache Marmotta 3.2, there is also
-full-text search support in SPARQL queries. Full-text search works over the
-literal values of nodes and differs from normal literal queries or regexp
-filters in that it applies language-specific lingustic processing (e.g. stemming
-and stop-word elimination). The KiWi SPARQL module comes with its own namespace
-for SPARQL extensions:
+extended function support in SPARQL. The KiWi SPARQL module comes with its own
+namespace for SPARQL extensions:
+
+PREFIX mm: <http://marmotta.apache.org/vocabulary/sparql-functions#>
+
+### Full-Text Search (3.2 and above)
+
+Full-text search works over the literal values of nodes and differs from normal literal
+queries or regexp filters in that it applies language-specific linguistic processing (e.g.
+stemming and stop-word elimination).
- PREFIX mm: <http://marmotta.apache.org/vocabulary/sparql-functions#>
Full-text search currently offers two SPARQL functions that can be used in the
FILTER part of a query and return boolean values (found or not found):
@@ -108,6 +90,13 @@ dc:description:
FILTER( mm:fulltext-search(str(?desc), "software", lang(?desc)) )
}
+### Aggregation Functions
+
+Beyond the standard SPARQL functions, Apache Marmotta also supports the following additional aggregation
+functions when using the PostgreSQL backend:
+
+* `mm:stddev(number)` (statistics) returns the standard deviation of its arguments
+* `mm:variance(number)` (statistics) returns the variance of its arguments
@@ -123,13 +112,48 @@ SPARQL performance, try to follow the fo
* **avoid DISTINCT, ORDER BY, GROUP BY**: filtering out duplicates is a
performance killer, as it requires to first load all results into memory; if
you do not strictly need it, do not use it
-* **avoid OPTIONAL**: optional queries are currently not optimized, as the
- semantics of OPTIONAL in SPARQL slightly differs from the semantics of an SQL
- left join
-* **avoid subselects**: a join with a subselect currently cannot be optimized,
- because KiWi SPARQL does not work on the results of a SPARQL query, only on the
- conditions
+* **use LIMIT**: limiting the number of results helps the underlying SQL query planner
+ to create better query plans, so your query will perform better
* **use FILTER**: conditions in the FILTER part of a query will be translated
into WHERE conditions in SQL; the more precise your filter conditions are, the
better your query will perform
+Differences from SPARQL Standard
+--------------------------------
+
+The KiWi native SPARQL implementation differs from the semantics of the SPARQL standard
+in the following known cases. Implementing these cases according to the standard would produce significantly
+more complex SQL queries for solving border cases with non-intuitive semantics, so we decided not to
+support them.
+
+### OPTIONAL and significant order
+
+There is a special border case where according to the SPARQL standard the position of
+the OPTIONAL part gives different semantics. Consider the following two SPARQL queries:
+
+ SELECT * WHERE {
+ ?s :p1 ?o1 .
+ OPTIONAL { ?s :p2 ?o2 } .
+ ?s :p3 ?o2
+ }
+
+vs
+
+ SELECT * WHERE {
+ ?s :p1 ?o1 .
+ ?s :p3 ?o2 .
+ OPTIONAL { ?s :p2 ?o2 }
+ }
+
+According to the SPARQL standard, the first query only yields results when the values for :p2 and :p3 are the same,
+while the second query essentially ignores the OPTIONAL. Since SQL has declarative semantics where the order of
+statements does not matter, we do not support this case. We always implement the semantics of the second
+SPARQL query.
+
+
+### MINUS, EXISTS and inner FILTER variable scope
+
+According to the SPARQL standard, variables occurring in the left and right arguments of a MINUS are scoped
+to their individual arguments. Since we translate this case into a NOT EXISTS, our implementation does not support
+this case, which can yield unexpected results for certain queries. These cases should be solved by proper variable
+renaming. All other differences between MINUS and NOT EXISTS are implemented according to the standard.
\ No newline at end of file
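
The OPTIONAL ordering case described in the diff above can be sketched with a toy model of SPARQL solution mappings. This is a minimal Python illustration, not Marmotta or KiWi code: the helper names (`compatible`, `join`, `left_join`, `bgp`) and the sample triples are assumptions made up for the example, and it only models the two query shapes shown in the documentation.

```python
# Toy model of SPARQL solution mappings (dicts from variable name to value).
# Everything here is a hypothetical illustration, not part of Marmotta.

def compatible(a, b):
    """Two mappings are compatible if every shared variable has the same value."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def join(left, right):
    """Join: merge every compatible pair of mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def left_join(left, right):
    """LeftJoin (OPTIONAL): keep a left mapping even if nothing on the right matches."""
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out

def bgp(triples, pred, s_var, o_var):
    """Match the single pattern `?s_var pred ?o_var` against the triples."""
    return [{s_var: s, o_var: o} for (s, p, o) in triples if p == pred]

triples = [
    ("a", "p1", 1), ("a", "p2", 2), ("a", "p3", 3),  # :p2 and :p3 values differ
    ("b", "p1", 1), ("b", "p2", 5), ("b", "p3", 5),  # :p2 and :p3 values agree
]

p1 = bgp(triples, "p1", "s", "o1")
p2 = bgp(triples, "p2", "s", "o2")
p3 = bgp(triples, "p3", "s", "o2")

# First query: OPTIONAL sits between the two required patterns.
first = join(left_join(p1, p2), p3)
# Second query: OPTIONAL comes last, after ?o2 is already bound by :p3.
second = left_join(join(p1, p3), p2)

print(sorted(m["s"] for m in first))
print(sorted(m["s"] for m in second))
```

In this model the first form drops subject `a`, because its :p2 and :p3 values disagree, while the second form keeps both subjects. The second result is the behaviour the documentation says KiWi always implements.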