So, this is TDB-independent. Is the idea here that I'd use, say, Fuseki and concoct some sort of assembly to glue it together?
On Sun, Jan 30, 2011 at 12:44 PM, Andy Seaborne <[email protected]> wrote:
>
>
> On 28/01/11 15:50, Benson Margulies wrote:
>>
>> At the day job, one of our lead technologies is a device that can
>> decide that 'Barak Obama' and 'Barack Obama' are probably the same
>> thing, or even that 歐巴馬 is another spelling. Is there an extension
>> model for SPARQL queries? In this case, it wouldn't really work to
>> just live in the FILTER, since the fundamental selection would be
>> something like:
>>
>>
>> ?s something:hasName "Barak Obama"
>>
>> and we want to tamper with how the literal string gets compared. We
>> have one API that says "how similar are these strings" and another
>> more complex model in which we build an index that rapidly returns all
>> the strings that are within some distance of a query. We could, of
>> course, build our own index by mining TDB, make our own query, and
>> then get busy SPARQL-ing starting from a set of URI's thus derived,
>> but I just wondered about a more integrated approach.
>
> Benson,
>
> ARQ provides "property functions" where a property is matched by calling
> custom code, not the storage-level matching
>
> http://openjena.org/ARQ/extension.html#propertyFunctions
>
> One example is free-text matching, using Lucene:
>
> http://openjena.org/ARQ/lucene-arq.html
>
> A property function can provide the access to another index such as your
> example of similar literals. You could either index literal to literal by
> similarity or literal to resource it relates to. The similarity can return
> multiple possible matches (one of the reasons for extending via properties
> is that it gives a framework multiple matches unlike FILTERs).
>
> (Property functions do not work in all property paths situations currently -
> not clear what it means in {0} and *, nor the interaction with the
> backtracking search)
>
> Andy
>
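
To make the "literal to resource" option concrete, here is a rough sketch of the kind of external index a property function could consult. It is plain Java with no Jena dependency. The class name SimilarNameIndex, its methods, the example.org URIs, and the use of Levenshtein distance as the similarity measure are all assumptions made for illustration; the real similarity API discussed above is not shown, and neither is the wiring into ARQ as a property function.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for an external similarity index; not part of ARQ or TDB. */
public class SimilarNameIndex {
    // Maps each indexed name literal to the URI of the resource it names.
    private final Map<String, String> nameToResource = new LinkedHashMap<>();

    public void add(String name, String resourceUri) {
        nameToResource.put(name, resourceUri);
    }

    /** Levenshtein edit distance, computed with two rolling rows. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns every resource with an indexed name within maxDistance of the
     * query. Zero, one, or many results are possible, which is the
     * multiple-match behaviour a FILTER cannot provide.
     */
    public List<String> lookup(String query, int maxDistance) {
        Set<String> hits = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : nameToResource.entrySet()) {
            if (editDistance(query, e.getKey()) <= maxDistance) {
                hits.add(e.getValue());
            }
        }
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        SimilarNameIndex index = new SimilarNameIndex();
        index.add("Barack Obama", "http://example.org/person/obama");
        index.add("Joe Biden", "http://example.org/person/biden");
        // The misspelled query still finds the resource named "Barack Obama".
        System.out.println(index.lookup("Barak Obama", 2));
    }
}
```

A linear scan like this only illustrates the shape of the lookup; the index described earlier in the thread, which rapidly returns all strings within some distance of a query, would replace the loop. In a property function, each URI returned would become one binding for the subject variable.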
