== Property Tables

Property tables are a technique for speeding up queries by providing additional ways of accessing the data, other than the triple table.

They can be used for:

+ data that is reasonably regular
+ caching partial query evaluations ahead of time
+ efficient inference for subclass/subproperty relationships.

A property table is a table in which each column denotes a variable
in part of a SPARQL pattern. Typically there is one column for the subject and one or more columns for properties of that subject, but there are other possibilities.

A row in a property table matches a SPARQL basic graph pattern.

Example:

Suppose a dataset includes information about people, and that each person always has a first name, a last name and a formal form of address.

A property table might be:

subject URI                   first name   last name   formal name

(http://example/person#afs,   "Fred",      "Smith",    "Frederick Smith")

and matches the SPARQL pattern

{ ?person foaf:familyName ?fName ;
          foaf:givenName  ?gName ;
          ex:formalName   ?formal
}

but the table can also be used to efficiently answer partial occurrences of that pattern, and patterns where some terms are fixed:

  [] foaf:familyName ?fName ;
     foaf:givenName  ?gName ;
     ex:formalName   "Frederick Smith" .

This is a simple example with only 3 properties. In the real world, one resource may have tens of properties, so reducing the number of database accesses may be significant and may improve caching.
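As a rough illustration of the idea (not Jena TDB code), the Python sketch below builds an in-memory property table from a list of triples and answers the pattern with one term fixed. The function names `build_property_table` and `match`, the use of prefixed names as plain strings, and the rule that a subject gets a row only if it has exactly one value for every column are all assumptions made for this example.

```python
from collections import defaultdict

# Column predicates, written as prefixed-name strings for illustration only.
COLUMNS = ("foaf:givenName", "foaf:familyName", "ex:formalName")

def build_property_table(triples, columns=COLUMNS):
    """Group triples by subject; keep subjects with exactly one value per column."""
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        if p in columns:
            by_subject[s].setdefault(p, []).append(o)
    table = {}
    for s, props in by_subject.items():
        # Assumption: irregular subjects are left to the ordinary triple table.
        if all(len(props.get(c, [])) == 1 for c in columns):
            table[s] = tuple(props[c][0] for c in columns)
    return table

def match(table, fixed=None, columns=COLUMNS):
    """Yield (subject, row) pairs whose row agrees with every fixed column value."""
    fixed = fixed or {}
    index = {c: i for i, c in enumerate(columns)}
    for s, row in table.items():
        if all(row[index[c]] == v for c, v in fixed.items()):
            yield s, row

triples = [
    ("http://example/person#afs", "foaf:givenName", "Fred"),
    ("http://example/person#afs", "foaf:familyName", "Smith"),
    ("http://example/person#afs", "ex:formalName", "Frederick Smith"),
    ("http://example/person#afs", "ex:other", "not a table column"),
]
table = build_property_table(triples)
# One table scan replaces three separate triple-table lookups per person.
rows = list(match(table, {"ex:formalName": "Frederick Smith"}))
```

A real implementation would presumably index the table by column value rather than scan it; the scan here just keeps the sketch short.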

The pattern matched doesn't have to be "same subject". It might be a more complex query, such as:

    SELECT (count(*) AS ?c) { ?s ?p ?o } GROUP BY ?s

with a table of (?s, ?c)
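A minimal sketch of such a precomputed (?s, ?c) table, assuming the triples are available as Python tuples; `build_count_table` is a hypothetical helper name, not part of any existing API:

```python
from collections import Counter

def build_count_table(triples):
    """Precompute the number of triples per subject, i.e. the (?s, ?c) rows."""
    return Counter(s for s, _p, _o in triples)

triples = [
    ("ex:a", "ex:p", "1"),
    ("ex:a", "ex:q", "2"),
    ("ex:b", "ex:p", "3"),
]
count_table = build_count_table(triples)
# Answering the GROUP BY query is now a read of the table, not a full scan.
result = sorted(count_table.items())
```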

"Property table" is just the conventional name for this approach, because the first systems of this kind handled only basic graph patterns for RDQL.

A query compiler could spot parts of a query pattern that can be answered from a precomputed additional table, instead of accessing the conventional triple table many times.
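One way such spotting might look, as a sketch only: group the triple patterns of a basic graph pattern by subject, and route a group to the property table when enough of its predicates are table columns. The function `plan`, the `min_covered` threshold and the tuple representation of patterns are hypothetical. The rewrite is only sound under the assumption that the table holds every triple for its column predicates, which is the "reasonably regular data" case.

```python
def plan(bgp, table_columns, min_covered=2):
    """Split a basic graph pattern into property-table accesses and leftovers.

    bgp is a list of (subject, predicate, object) patterns; variables are
    strings starting with '?'. Returns (table_accesses, leftovers).
    """
    by_subject = {}
    for s, p, o in bgp:
        by_subject.setdefault(s, []).append((p, o))

    table_accesses, leftovers = [], []
    for s, pairs in by_subject.items():
        hits = [(p, o) for p, o in pairs if p in table_columns]
        preds = [p for p, _o in hits]
        # Use the table only if it replaces several lookups and no column
        # predicate is repeated for this subject.
        usable = len(hits) >= min_covered and len(set(preds)) == len(preds)
        if usable:
            table_accesses.append((s, dict(hits)))
            leftovers.extend((s, p, o) for p, o in pairs if p not in table_columns)
        else:
            leftovers.extend((s, p, o) for p, o in pairs)
    return table_accesses, leftovers

columns = {"foaf:givenName", "foaf:familyName", "ex:formalName"}
bgp = [
    ("?person", "foaf:familyName", "?fName"),
    ("?person", "foaf:givenName", "?gName"),
    ("?person", "ex:age", "?age"),
]
table_accesses, leftovers = plan(bgp, columns)
```

Here the two name patterns become a single table access, and the `ex:age` pattern is still sent to the triple table.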

This project would apply this idea to Jena TDB.

It involves:
1/ spotting query patterns
2/ building the property table
3/ maintaining the table as data changes

A focus on primarily read-only data for publication means that (3) can be a process that runs in the background at regular intervals.
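A sketch of that kind of periodic maintenance, under the assumption that the whole table is rebuilt from the current data and then swapped in, so readers keep using the old table until the new one is ready. The class `RefreshedTable` and its methods are hypothetical; scheduling `refresh()` at regular intervals is left to whatever background mechanism the system uses.

```python
import threading
from collections import Counter

class RefreshedTable:
    """Holds a precomputed table and replaces it wholesale on refresh()."""

    def __init__(self, load_triples, build):
        self._load_triples = load_triples  # callable returning current triples
        self._build = build                # callable building a table from triples
        self._lock = threading.Lock()
        self._table = build(load_triples())

    def snapshot(self):
        """Return the table currently visible to queries."""
        with self._lock:
            return self._table

    def refresh(self):
        """Rebuild outside the lock, then swap the reference in one step."""
        new_table = self._build(self._load_triples())
        with self._lock:
            self._table = new_table

def count_by_subject(triples):
    return Counter(s for s, _p, _o in triples)

data = [("ex:a", "ex:p", "1")]
tracked = RefreshedTable(lambda: list(data), count_by_subject)

data.append(("ex:a", "ex:q", "2"))        # the underlying data changes
stale = tracked.snapshot()["ex:a"]        # table not yet refreshed
tracked.refresh()                         # what the background job would call
fresh = tracked.snapshot()["ex:a"]
```

The trade-off is that queries answered from the table can be out of date between refreshes, which is acceptable only when the data is primarily read-only.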
