On Mon, Mar 11, 2013 at 1:53 PM, Reto Bachmann-Gmür <[email protected]> wrote:
> On Mon, Mar 11, 2013 at 9:49 AM, Rupert Westenthaler <[email protected]> wrote:
>
>> On Mon, Mar 11, 2013 at 9:03 AM, Reto Bachmann-Gmür <[email protected]> wrote:
>> > Hi,
>> >
>> > I missed this registration mechanism for the Stanbol endpoint. Why can't
>> > all triple collections in TcManager be queried?
>> >
>>
>> This was a design decision. There was never the intention to publish
>> all (Clerezza) TripleCollections but rather to allow Stanbol
>> components to provide a SPARQL service over their RDF data.
>>
>
> By now we have some triple collections in TcManager that will typically
> not be queried (system graph and recipe graph); not sure which one you
> wanted to hide away back then.

In OSGi you never know what other modules are using Clerezza to store
data. The current design ensures that only graphs that are explicitly
configured are exposed. This assumption seemed to be a good one to get
started.

>
> What seems a bit strange with the current approach is that persistent
> triple collections are on the whiteboard twice: once added there by
> TcManager and once added there for registering them with the SPARQL
> endpoint.
>

Yes, using the whiteboard is not an optimal solution, but it is OK for
the current scope of the /sparql endpoint.

>
>> With the adoption of the 2-layered storage solution for the Contenthub
>> (and later also the Entityhub) we might need to rethink the /sparql
>> endpoint to also support the querying of RDF graphs not managed by
>> Clerezza.
>
>
> I don't understand the link to the 2-layered Contenthub
>

Because then an Entityhub Site can consist of a "Store" and "Semantic
Index(es)". Currently the Entityhub typically cannot provide SPARQL,
as the SolrYard does not support it, but then you can use a TripleStore
as the "Store" and Solr for the "Semantic Index". That means that the
Store could register itself with the SPARQL endpoint, so users will be
able to run SPARQL queries against all Entityhub Sites, which would make
a SPARQL endpoint much more interesting.

>
>> I have already started implementing a native Jena
>> TDB store for the Entityhub in the "contenthub-two-layered-structure"
>>
>
> Why this?

Performance. With Clerezza, semantic indexing has to convert
"Jena TDB > Jena > Clerezza Graph > Solr InputDocument", while directly
using the Jena TDB APIs allows implementing
"Jena TDB > Solr InputDocument". If you index 8 million DBpedia concepts
or the ~120 million entities of Musicbrainz this makes a big difference.

best
Rupert

>
> Cheers,
> Reto

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
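
To make the opt-in idea discussed above a bit more concrete, here is a
minimal sketch in plain Java. It is not the Stanbol or Clerezza code: the
class QueryableGraphRegistry and all of its method names are hypothetical,
and it uses neither OSGi nor the whiteboard. It only illustrates the rule
that a graph becomes visible to the endpoint when a component registers it
explicitly, and stays hidden otherwise.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical opt-in registry (not a Stanbol or Clerezza class).
 * A graph is visible to the endpoint only after an explicit register() call.
 */
final class QueryableGraphRegistry<G> {

    // Graph name -> graph object; nothing is ever added here implicitly.
    private final Map<String, G> exposed = new ConcurrentHashMap<>();

    /** Called by a component that wants its graph to be queryable. */
    public void register(String name, G graph) {
        exposed.put(name, graph);
    }

    /** Called when the component withdraws its graph. */
    public void unregister(String name) {
        exposed.remove(name);
    }

    /** Returns the graph only if it was explicitly registered. */
    public Optional<G> lookup(String name) {
        return Optional.ofNullable(exposed.get(name));
    }

    /** Sorted names of all graphs the endpoint is allowed to query. */
    public Set<String> exposedNames() {
        return new TreeSet<>(exposed.keySet());
    }
}
```

A component would call register() for its own graph, while something like
the system graph is simply never registered, so lookup() returns an empty
Optional for it. A store that is not managed by Clerezza could register
itself in the same way, as long as the endpoint knows how to query the
type G.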
