On 08/02/11 13:08, Benson Margulies wrote:
So, this is TDB-independent. Is the idea here that I'd use, say,
fuseki and concoct some sort of an assembly to glue it together?
Yes, it's TDB independent and can be combined with TDB in the same way
LARQ is.
An assembler can include general initialization code using ja:loadClass,
e.g.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
Andy
On Sun, Jan 30, 2011 at 12:44 PM, Andy Seaborne wrote:
On 28/01/11 15:50, Benson Margulies wrote:
At the day job, one of our lead technologies is a device that can
decide that 'Barak Obama' and 'Barack Obama' are probably the same
thing, or even that 歐巴馬 is another spelling. Is there an extension
model for SPARQL queries? In this case, it wouldn't really work to
just live in the FILTER, since the fundamental selection would be
something like:
?s something:hasName "Barak Obama"
and we want to tamper with how the literal string gets compared. We
have one API that says "how similar are these strings" and another
more complex model in which we build an index that rapidly returns all
the strings that are within some distance of a query. We could, of
course, build our own index by mining TDB, make our own query, and
then get busy SPARQL-ing starting from a set of URIs thus derived,
but I just wondered about a more integrated approach.
Benson,
ARQ provides "property functions", where a property is matched by calling
custom code rather than by the storage-level matching:
http://openjena.org/ARQ/extension.html#propertyFunctions
One example is free-text matching, using Lucene:
http://openjena.org/ARQ/lucene-arq.html
A property function can provide the access to another index such as your
example of similar literals. You could index either literal to literal by
similarity or literal to the resource it relates to. The similarity lookup can
return multiple possible matches (one of the reasons for extending via
properties is that it gives a framework for multiple matches, unlike FILTERs).
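
As a rough illustration of the "literal to resource" index described above, here is a minimal Java sketch. It is not ARQ or Jena code: the class SimilarNameIndex, its methods, and the example.org URIs are all hypothetical, and plain Levenshtein distance stands in for whatever similarity API you actually have. The point is only that one query string can come back with several resources, which is the shape of result a property function could bind to ?s.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an external similarity index; not part of ARQ or Jena.
class SimilarNameIndex {
    // Indexed literal -> URI of the resource it names (one resource per literal, for brevity).
    private final Map<String, String> entries = new LinkedHashMap<>();

    void add(String literal, String resourceUri) {
        entries.put(literal, resourceUri);
    }

    // Return every resource whose indexed literal is within maxDistance edits of the query.
    // A linear scan is used here; a real index would avoid comparing against every entry.
    Set<String> lookup(String query, int maxDistance) {
        Set<String> matches = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (editDistance(query, e.getKey()) <= maxDistance) {
                matches.add(e.getValue());
            }
        }
        return matches;
    }

    // Plain Levenshtein distance, assumed here in place of the real similarity measure.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

A property function wrapping something like this would call lookup() with the literal from the query pattern and produce one solution per returned URI, so the rest of the SPARQL pattern carries on from each candidate ?s.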
(Property functions do not currently work in all property path situations:
it is not clear what they mean in {0} and *, nor how they interact with the
backtracking search.)
Andy