Hi Viktor,
In my opinion, we should open a discussion about how to take advantage
of this. We have occasionally discussed whether it would be useful to
include the concept of a "Knowledge Base" in Stanbol. Although at first
it may sound similar to an EntityHub site, IMO a Knowledge Base should
be a store plus an API that reflects the original structure of your
dataset, lets you exploit it, and lets you integrate external
resources. For example, for GSoC 2013, Antonio used a single-node Neo4j
database to store the relations between Freebase entities in order to
support graph-based disambiguation engines. It would be nice if we
could define a generic API for such a component; the Titan one could be
another implementation, one that also allows the data to be distributed.
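Just to make the idea concrete, here is a rough sketch of what such a
generic API might look like. Every name in it (KnowledgeBase,
addRelation, related, neighbours, InMemoryKnowledgeBase) is made up for
illustration and is not part of Stanbol; a Neo4j or Titan backend would
simply be another implementation of the same interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical API: none of these names exist in Stanbol today.
interface KnowledgeBase {
    /** Stores a directed, labelled relation between two entities. */
    void addRelation(String subject, String relation, String object);

    /** Returns the entities linked from the given entity by one relation. */
    Set<String> related(String entity, String relation);

    /** Returns all entities directly linked from the given entity. */
    Set<String> neighbours(String entity);
}

// Toy single-node, in-memory implementation. A graph database backend
// (single node or distributed) would implement the same interface.
class InMemoryKnowledgeBase implements KnowledgeBase {
    // subject -> relation -> objects
    private final Map<String, Map<String, Set<String>>> outgoing = new HashMap<>();

    @Override
    public void addRelation(String subject, String relation, String object) {
        outgoing.computeIfAbsent(subject, k -> new HashMap<>())
                .computeIfAbsent(relation, k -> new LinkedHashSet<>())
                .add(object);
    }

    @Override
    public Set<String> related(String entity, String relation) {
        // Return a copy so callers cannot modify the internal state.
        return new LinkedHashSet<>(
                outgoing.getOrDefault(entity, Collections.emptyMap())
                        .getOrDefault(relation, Collections.emptySet()));
    }

    @Override
    public Set<String> neighbours(String entity) {
        // Union of the objects of every outgoing relation.
        Set<String> result = new LinkedHashSet<>();
        for (Set<String> objects
                : outgoing.getOrDefault(entity, Collections.emptyMap()).values()) {
            result.addAll(objects);
        }
        return result;
    }
}
```

A graph-based disambiguation engine would then depend only on the
interface, so the backend could be swapped without touching the engine.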
Apart from this, regarding your email:
On 03/12/13 12:03, Viktor Gal wrote:
> Hi all,
> in the last couple of days i was working on enabling batch loading of a big RDF
> dataset into Titan graph DB using Faunus.
> Since Titan DB supports Sail API via Blueprint's GraphSail implementation, one
> can use Titan as an RDF storage and use Sail API to support SPARQL queries.
> now the question is whether we should implement a new Yard based on Titan, i.e.
> TitanYard in order to support Titan as RDF storage?
Well, right now, if you want to use your data locally in some way, you
need to create a Yard. As you know (because you have been dealing with
the Freebase indexing tool :-) ), the triples are first imported into a
Jena TDB based RDF store in order to perform some pre-processing, such
as LDPath filtering. If you want your implementation to allow
pre-processing as well, you might need to implement, for example,
LDPath queries over your backend. If you are not worried about
pre-processing, one option may be to implement a Titan-based Yard
directly.
Anyway, let's wait for the experts' opinion ;-).
Cheers,
Rafa
> i'll see if i can push my patch i was mentioning above into Faunus, but if they
> are not interested we can still have that within Stanbol.
> cheers,
> viktor