Hi Olivier,

I will try to upload the precomputed indexes to my server. I already tried this two weeks ago, but encountered some problems with files greater than 2 GByte. Tomorrow I will split them into several files and see if that works.
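The splitting step is just standard coreutils; a small sketch (file names here are placeholders, the real archives will differ, and the dummy payload stands in for a multi-GByte index):

```shell
# Work in a scratch directory so no real files are touched
cd "$(mktemp -d)"

# Create a small dummy archive as a stand-in for a large index archive
head -c 1048576 /dev/zero > index.tar.gz        # 1 MiB dummy payload

# Split into 512 KiB chunks: index.tar.gz.part-aa, index.tar.gz.part-ab, ...
split -b 512k index.tar.gz index.tar.gz.part-

# Recipients rejoin the parts before extracting; the glob sorts the
# suffixes (aa, ab, ...) back into the original order
cat index.tar.gz.part-* > index-rejoined.tar.gz
cmp index.tar.gz index-rejoined.tar.gz && echo "rejoined OK"
```

For real downloads, publishing a checksum of the original archive lets users verify the rejoined file the same way.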
Currently there are three precomputed Solr indexes available:
 - dbPedia
 - geonames
 - dblp (http://dblp.uni-trier.de/): created today based on a request
   by Andreas Gruber

Solr Server configuration:

All precomputed indexes use the SolrYard implementation. [1] describes
how to set up the Solr server. Note that it is no longer required to
set up your own Solr server, because the SolrYard now also supports an
embedded Solr server. To activate this, just configure a file path
instead of an http URL as the "Solr Server URL". The archives with the
precomputed indexes will contain a Solr core. Such cores need to be
configured within solr.xml as described in [3].

Entityhub configuration:

To activate the usage of caches for a Referenced Site, two additional
steps are needed:

(1) Create the Cache instance. There are only two options to set:
 - " ": here you need to provide the ID of the Yard used by this cache
 - " ": this can be used to specify what data are stored in the cache.
   These mappings are only applied when entities loaded from the remote
   source are cached locally, so for precomputed full caches this can
   be left empty.

(2) Configure the Referenced Site (see also [2]). Two parameters need
to be changed to tell the Referenced Site to use a configured cache:
 - "Cache Strategy": to use a precomputed cache, set this to "ALL"
 - "Cache ID": the ID of the Cache (the same as the ID of the Yard)

Note that when the "Cache Strategy" is set to ALL, there is no need to
configure a "Dereferencer Impl" or a "Searcher Impl". However, if
configured, they are used as a fallback if the Cache is not active or
throws an error.

Predefined Entityhub configuration:

To make this easier I will provide a predefined configuration for the
Entityhub. It can be copy/pasted into the config directory within the
sling folder. This will not provide plug-and-play functionality, but is
rather intended as a starting point. Users will need to adapt some
properties (e.g. the "Solr Server URL").
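For illustration, registering an unpacked index as a core in solr.xml (the legacy multi-core format, see [3]) could look roughly like this; the core name and instanceDir are hypothetical and depend on where you unpack the archive:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- hypothetical core for the dbPedia index; instanceDir points to
         the directory extracted from the downloaded archive -->
    <core name="dbpedia" instanceDir="dbpedia" />
  </cores>
</solr>
```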
They will also need to deactivate or delete the Yards, Caches and Referenced Sites they do not want.

I will provide the links to the indexes as soon as I have uploaded them to my server.

best
Rupert Westenthaler

[1] http://wiki.iks-project.eu/index.php/SolrYardConfiguration
[2] http://wiki.iks-project.eu/index.php/ReferencedSiteConfiguration
[3] http://wiki.apache.org/solr/CoreAdmin#Configuration

On Tue, Feb 15, 2011 at 5:43 PM, Olivier Grisel <[email protected]> wrote:
> Hi Rupert,
>
> I would like to upgrade https://stanbol.demo.nuxeo.org to include
> the entityhub service in full offline / standalone mode (with a local
> Solr index of DBpedia, for instance). I would also like to be able to
> upgrade the Nuxeo / Stanbol connector to use the entityhub API
> (instead of the direct DBpedia dereferencer currently implemented).
>
> Could you please tell me how to set this up (along with where to get
> a copy of the precomputed Solr index you are using for your tests)?
> Ideally it would be best if you could update the README file in the
> entityhub folder with that information.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel

--
| Rupert Westenthaler      [email protected]
| Bodenlehenstraße 11      ++43-699-11108907
| A-5500 Bischofshofen
