Maybe a materialized view [1] would be the solution here?  It could cache
the results of these types of queries without requiring a whole separate
index.  Since ontology information is probably rarely updated, it seems
like a good fit.
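
To make the idea a bit more concrete, here is a rough sketch in Python. It
is not Jena or TDB code: UsageView, on_add and on_delete are made-up names,
and it assumes that something in the update path calls them exactly once for
each triple that is really added to or removed from the store. The view just
keeps per-property and per-class counts, so the two DISTINCT listings (and
the counts) become lookups instead of full scans.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class UsageView:
    """Hypothetical materialized view of property and class usage counts."""

    def __init__(self):
        self.property_counts = Counter()  # predicate -> number of triples
        self.class_counts = Counter()     # class -> number of rdf:type triples

    def on_add(self, s, p, o):
        # Assumption: called only for triples not already in the store.
        self.property_counts[p] += 1
        if p == RDF_TYPE:
            self.class_counts[o] += 1

    def on_delete(self, s, p, o):
        # Assumption: called only for triples that were actually removed.
        self._decrement(self.property_counts, p)
        if p == RDF_TYPE:
            self._decrement(self.class_counts, o)

    @staticmethod
    def _decrement(counts, key):
        counts[key] -= 1
        if counts[key] <= 0:
            del counts[key]  # drop the key so the distinct listing stays exact


# Usage: feed the view the same changes that are applied to the store.
view = UsageView()
view.on_add("ex:a", RDF_TYPE, "ex:Person")
view.on_add("ex:a", "ex:name", "Alice")
view.on_add("ex:b", RDF_TYPE, "ex:Person")
view.on_delete("ex:b", RDF_TYPE, "ex:Person")

distinct_properties = sorted(view.property_counts)
distinct_classes = sorted(view.class_counts)
```

The hard part is still the one Paolo raises below: hooking this into every
update path so the view cannot drift from the real indexes.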

-Stephen

[1] http://en.wikipedia.org/wiki/Materialized_view


> -----Original Message-----
> From: Paolo Castagna [mailto:[email protected]]
> Sent: Wednesday, October 05, 2011 7:16 AM
> To: [email protected]
> Subject: Re: [jena-dev] Building RDF Schema information from TDB Dataset [ ARQ, TDB ]
> 
> Dave Reynolds wrote:
> > On Wed, 2011-10-05 at 11:22 +0100, Paolo Castagna wrote:
> >> Dave Reynolds wrote:
> >>> If you just want to list the properties and classes that are used then
> >>> you can do things like:
> >>>
> >>>   SELECT DISTINCT ?p WHERE {?s ?p ?o.}
> >>>
> >>>   SELECT DISTINCT ?cls WHERE {?i a ?cls.}
> >> Any idea to speed up these two queries (for large TDB datasets) is welcome! :-)
> >
> > I nearly put a comment in that response that those can be very expensive
> > queries :)
> 
> :-)
> 
> The fact is that often people do not put vocabularies|ontologies in their
> data (rightly so). They also want to have a list of properties and classes
> actually used in a dataset, and they want counts (i.e. how many times a
> property or a class has been actually used in a dataset).
> 
> The list of properties (with counts) can be derived from the stats.opt file
> (if present).
> 
> Maybe people wanting to have a super fast list of distinct
> properties|classes actually used in a dataset should have a custom index
> just for that. However, one would need to intercept all update operations
> and keep those indexes in sync with the TDB indexes. I have no idea what
> the best way to do that is.
> 
> Are there better ideas?
> 
> Paolo
> 
> > Dave
> >
> >
