Property paths - upcoming changes.

Andy Seaborne Fri, 09 Mar 2012 12:51:38 -0800

The SPARQL-WG is seriously considering making some changes to property paths for the arbitrary length operators * (zero or more) and + (one or more). The results of some queries may change.


There is no formal decision yet, so it is not definite that it will happen.

The changes affect cardinality. Currently, * and + match all possible paths, so if there are multiple ways to get from A to B, there will be one row in the results for each possible path.

Where paths include common elements, duplicates occur. In highly connected graphs, there can be a lot of duplicates. For example, in a clique of 6 nodes, :p* has 326 solutions, while it has 6 if unique. It gets worse for larger N.

(A clique is a graph in which every node is connected to every other - it's the most extreme form of highly connected).
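
The 326 figure can be reproduced with a small sketch. It assumes that "counting" means one solution per simple path (no node visited twice) from a fixed start node, with the zero-length path included. That reading is an assumption made for illustration, and the function name is hypothetical.

```python
from itertools import permutations

def count_simple_paths_from(start, nodes):
    """Count simple paths beginning at `start` in a clique over `nodes`.

    In a clique, any ordering of distinct other nodes is a valid path,
    so paths of k steps correspond to k-permutations of the other nodes.
    The zero-length path (k = 0) is included, as for `*`.
    """
    others = [n for n in nodes if n != start]
    return sum(
        1
        for k in range(len(others) + 1)
        for _ in permutations(others, k)
    )

nodes = list(range(6))
counting = count_simple_paths_from(0, nodes)
distinct = len(set(nodes))  # every node is reachable, including the start
print(counting, distinct)  # 326 6
```

Under this reading the count is 1 + 5 + 20 + 60 + 120 + 120 = 326, and it grows factorially with N while the distinct count grows linearly.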

In a FOAF graph, you usually want to know if A and B are connected, not how many times (and if you did want that you'd probably want the length as well and SPARQL 1.1 doesn't give you that).


rdfs:subClassOf* is an example:

{ :thing rdf:type/rdfs:subClassOf* ?class } is the class and all superclasses of :thing. An RDFS schema isn't a tree - it can have acyclic shapes in it as well, where a class is reachable by more than one route (directed cycles really would not make sense!). The app probably wants the classes once.
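
As a rough illustration, here is a sketch of a diamond-shaped hierarchy in which the top class is reachable by two routes. The class names and the SUBCLASS_OF table are hypothetical, and the function only mimics counting semantics by returning one result per path.

```python
# Hypothetical schema: D is a subclass of B and of C; both are subclasses of A.
SUBCLASS_OF = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}

def superclasses_counting(cls):
    """Return one entry per path, so a class reached twice appears twice."""
    results = [cls]  # the zero-length path matches the class itself
    for parent in SUBCLASS_OF[cls]:
        results.extend(superclasses_counting(parent))
    return results

counted = superclasses_counting("D")
unique = sorted(set(counted))
print(counted)  # ['D', 'B', 'A', 'C', 'A']
print(unique)   # ['A', 'B', 'C', 'D']
```

A appears twice in the counted results because it is reached via B and via C; the distinct form gives each class once.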


But sometimes duplicates do matter.

See for example:
http://people.apache.org/~andy/property-paths.html

which is adding up a purchase order to get the total cost. Just because items have the same price (same literal or same structured node in the graph) doesn't mean they can be considered the same.
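
A minimal sketch of why this matters, using made-up order lines (not the data from the page above): collapsing equal prices before summing gives the wrong total.

```python
# Hypothetical order: two different items happen to have the same price.
order_lines = [("widget", 5), ("gadget", 5), ("gizmo", 12)]

prices = [price for _item, price in order_lines]
total_counting = sum(prices)       # keeps both 5s
total_distinct = sum(set(prices))  # collapses the two 5s into one
print(total_counting, total_distinct)  # 22 17
```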

The current plan is to have two sets of operators, one counting (duplicates), one distinct matches.


They are:
  counting: {*} and {+}
  distinct: * and +

this way round being the common usage, and anyway {...} can generate duplicates in its other forms.

But that is a change in semantics to * and + from the current SPARQL 1.1, where * and + are counting.

For some (many?) uses of * and + this makes no difference. For example, accessing lists:


   { ?list rdf:rest*/rdf:first ?member }
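
To see why lists are unaffected, here is a sketch over a hypothetical three-element list stored as cons cells. Each cell has exactly one rdf:rest, so there is only one path from the head to any cell, and counting and distinct matching of rdf:rest* reach the same cells the same number of times.

```python
# Hypothetical RDF list (a b c): cell -> (rdf:first value, rdf:rest cell)
CELLS = {"c1": ("a", "c2"), "c2": ("b", "c3"), "c3": ("c", "nil")}

def cells_via_rest_star(start):
    """Follow rdf:rest zero or more times, one result per path."""
    cells = [start]
    while cells[-1] in CELLS:
        cells.append(CELLS[cells[-1]][1])
    return cells

cells = cells_via_rest_star("c1")
# Only real cells have an rdf:first; "nil" drops out at this step.
members = [CELLS[c][0] for c in cells if c in CELLS]
print(cells)    # ['c1', 'c2', 'c3', 'nil']
print(members)  # ['a', 'b', 'c']
```

No cell is repeated in the counted results, so removing duplicates changes nothing.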

Also, SPARQL 1.1 would have a general path operator DISTINCT(..path..) to turn duplicates into distinct results for that path segment.


path* == DISTINCT(path{*})
path+ == DISTINCT(path{+})

If you have any comments, please make them soon.

The development ARQ (2.9.1-SNAPSHOT) will have these changes in it very soon.


        Andy
