Hi all, here is an update on this.

I had a first look at the branch [1] today, mainly to check whether
this functionality could also be used with the Entityhub. For that
reason I changed the Store and SemanticIndex to use generics. This
allows implementing them not only for ContentItem (as needed by the
Contenthub) but also for Representation (as used by the Entityhub).

The result looks promising, so Suat and I discussed splitting the
generic interfaces/implementations from the Contenthub-specific ones
(see STANBOL-701). The commonly shared interfaces will reside under
"commons.semanticindex" and include:

* Store interface (maybe reduced to read-only and renamed to
  IndexingSource)
* StoreManager: management interface that allows looking up multiple
  Store (or IndexingSource) instances
* SemanticIndex and SemanticIndexManager (STANBOL-499)
* I would like to have a way for a Store (or IndexingSource) to notify
  others (mainly SemanticIndex) about changed Entities. Currently they
  need to be asked periodically about changes (see the Store#changes()
  method). I am thinking of something like an ItemNotifier, maybe using
  the OSGi Event mechanism. Semantic Indexes could then use the
  Store#changes() method to get up to date when they are activated, and
  then use the notifier to keep in sync while they are active.

As soon as this is available I will further evaluate how to use it
with the Entityhub.

best
Rupert

[1] http://svn.apache.org/repos/asf/incubator/stanbol/branches/contenthub-two-layered-structure/contenthub/

On Fri, Jul 20, 2012 at 5:24 PM, Rupert Westenthaler
<[email protected]> wrote:
> Hi Fabian, all
>
> Yes this is still developed within its own branch that we started
> during the Hackathon in Saarbrücken. But you are completely right, this
> development was - until now - not visible enough to the community.
> Especially because this design of splitting up
>
> 1st level storage that keeps the data and a
> 2nd level storage that allows to build special indexes of the data
>
> is something that is not only interesting for the Contenthub, but
> might also be adopted by the Entityhub. Especially when we want to
> have the functionality of the Entityhub indexing tool available within
> the Stanbol Environment. This would require a storage for the
> Entity data (could even be a remote Service) and a 2nd storage that
> holds the indexed data.
>
> Such a design could even give us more flexibility to build special
> indexes - e.g. adding surface forms as alternate labels, collecting
> mentions for MLT queries, following related, broader or other
> relations to build semantic contexts ... capabilities like that would
> be key for adding things like Entity-Disambiguation to Apache Stanbol.
> Especially if you want to use it with user managed vocabularies -
> without re-indexing the whole vocabulary after changes.
>
> best
> Rupert
>
>
> On Fri, Jul 20, 2012 at 1:11 PM, Fabian Christ
> <[email protected]> wrote:
>> Hi,
>>
>> 2012/7/20 Suat Gonul <[email protected]>:
>>> Again let me remind you that, this work is carried on under the
>>> "contenthub-two-layered-structure" branch. Sorry for the bulk update,
>>
>> I am sorry Suat! - I did not recognize that you are still working in
>> your own branch. I was thinking that there was a big change to the
>> trunk without any notification before that. In this case - everything
>> is fine ;)
>>
>> Best,
>> - Fabian
>>
>> --
>> Fabian
>> http://twitter.com/fctwitt
>
>
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen

--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
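
P.S. To make the "catch up via Store#changes() on activation, then
follow notifications" idea more concrete, here is a rough Java sketch.
It is only an illustration under assumptions and not the code in the
branch: IndexingSource is the rename proposed above, while
ChangeListener, InMemorySource and SimpleIndex are hypothetical names,
and the changes(long) signature is a simplified stand-in for the real
Store#changes() method. A plain listener is used here where a real
implementation might use the OSGi Event mechanism.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Small demo: one item exists before activation, one arrives afterwards. */
class SemanticIndexSketch {
    public static void main(String[] args) {
        InMemorySource<String> source = new InMemorySource<>();
        source.put("urn:a", "first");            // changed before the index exists
        SimpleIndex<String> index = new SimpleIndex<>(source);
        index.activate();                         // catches up through changes()
        source.put("urn:b", "second");           // arrives through the notification
        System.out.println(index.indexed().keySet()); // prints [urn:a, urn:b]
    }
}

/** Hypothetical callback standing in for the notifier idea. */
interface ChangeListener {
    void itemChanged(String id, long revision);
}

/** Hypothetical read-only 1st level storage, generic over the item type. */
interface IndexingSource<Item> {
    Item get(String id);
    long revision();
    /** Ids of items changed after the given revision (assumed signature). */
    List<String> changes(long sinceRevision);
    void addListener(ChangeListener listener);
}

/** In-memory source used only for this sketch. */
class InMemorySource<Item> implements IndexingSource<Item> {
    private final Map<String, Item> items = new LinkedHashMap<>();
    private final Map<String, Long> changedAt = new LinkedHashMap<>();
    private final List<ChangeListener> listeners = new ArrayList<>();
    private long revision = 0;

    /** Stores an item, bumps the revision and notifies all listeners. */
    public void put(String id, Item item) {
        revision++;
        items.put(id, item);
        changedAt.put(id, revision);
        for (ChangeListener l : listeners) {
            l.itemChanged(id, revision);
        }
    }

    public Item get(String id) { return items.get(id); }

    public long revision() { return revision; }

    public List<String> changes(long sinceRevision) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Long> e : changedAt.entrySet()) {
            if (e.getValue() > sinceRevision) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public void addListener(ChangeListener listener) { listeners.add(listener); }
}

/** 2nd level index: pulls missed changes once, then stays in sync by push. */
class SimpleIndex<Item> implements ChangeListener {
    private final IndexingSource<Item> source;
    private final Map<String, Item> indexed = new LinkedHashMap<>();
    private long lastRevision = 0;

    SimpleIndex(IndexingSource<Item> source) { this.source = source; }

    void activate() {
        for (String id : source.changes(lastRevision)) {
            indexed.put(id, source.get(id));
        }
        lastRevision = source.revision();
        source.addListener(this);
    }

    public void itemChanged(String id, long revision) {
        indexed.put(id, source.get(id));
        lastRevision = revision;
    }

    Map<String, Item> indexed() { return indexed; }
}
```

Because the interfaces are generic over the item type, the same
SimpleIndex could in principle be instantiated for ContentItem or for
Representation. Deletions, batching and thread safety are left out of
this sketch on purpose.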
