Hi Olivier

I will try to upload the precomputed indexes to my server. I already
tried this two weeks ago, but ran into problems with files larger
than 2 GByte.
Tomorrow I will split them into several files and see if that works.

Currently there are three precomputed Solr indexes available:
 - DBpedia
 - geonames
 - dblp (http://dblp.uni-trier.de/): created today at the request
of Andreas Gruber


Solr Server configuration:
All precomputed indexes use the SolrYard implementation. [1]
describes how to set up the SolrServer. Note that it is no longer
required to set up your own SolrServer, because the SolrYard now also
supports an embedded Solr server. To activate it, just configure a file
path instead of an HTTP URL for the "Solr Server URL".

The archives with the precomputed indexes will each contain a Solr
Core. Such cores need to be configured within the solr.xml as
described in [3].
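
As a rough sketch of what such a solr.xml could look like, assuming the
multi-core format described in [3]: the core names and instanceDir
values below are placeholders I made up for illustration, the real ones
depend on how the archives are unpacked.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- One <core> entry per unpacked index archive.            -->
    <!-- name and instanceDir are assumed values, adapt them to  -->
    <!-- the directory names found in the downloaded archives.   -->
    <core name="dbpedia" instanceDir="dbpedia" />
    <core name="geonames" instanceDir="geonames" />
    <core name="dblp" instanceDir="dblp" />
  </cores>
</solr>
```

Each instanceDir is resolved relative to the Solr home directory, so
the unpacked core folders would sit next to the solr.xml file.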


Entityhub Configuration:

To activate the use of caches for a Referenced Site, two additional
steps are required:

(1) Create the Cache Instance:
There are only two options to set:
" ": Here you need to provide the ID of the Yard used by this cache
" ": This can be used to specify which data are stored in the cache.
These mappings are only used if Entities loaded from the remote source
are cached locally. So for precomputed full caches this can be empty.

(2) Referenced Site (see also [2]):
Two parameters need to be changed to tell the Referenced Site to use a
configured cache:
"Cache Strategy": To use a precomputed cache set this to "ALL"
"Cache ID": This is the ID of the Cache (the same as the ID of the Yard)

Note that when the "Cache Strategy" is set to "ALL", there is no
need to configure a "Dereferencer Impl" or a "Searcher Impl". However,
if they are configured, they are used as a fallback if the Cache is not
active or throws an error.


Predefined Entityhub configuration:

To make it easier I will provide a predefined configuration for the
entityhub. This can be copied into the config directory within the
sling folder.
This will not provide plug and play functionality, but is rather
intended to act as a starting point.
Users will need to
 - adapt some properties (e.g. the Solr Server URL).
 - deactivate/delete Yards, Caches and Referenced Sites they do not want.


I will provide the links to the indexes as soon as I have been able to
upload them to my server.

best
Rupert Westenthaler


[1] http://wiki.iks-project.eu/index.php/SolrYardConfiguration
[2] http://wiki.iks-project.eu/index.php/ReferencedSiteConfiguration
[3] http://wiki.apache.org/solr/CoreAdmin#Configuration

On Tue, Feb 15, 2011 at 5:43 PM, Olivier Grisel
<[email protected]> wrote:
> Hi Rupert,
>
> I would like to upgrade the https://stanbol.demo.nuxeo.org to include
> the entityhub service in full offline / standalone mode (with a local
> solr index of DBpedia for instance). I would also like to be able to
> upgrade the Nuxeo / Stanbol connector to use the entityhub API
> (instead of the direct DBpedia dereferencer currently implemented).
>
> Could you please tell me how to set this up (along with where to get a
> copy of the precomputed solr index you are using for your tests)?
> Ideally it would be best if you could update the README file in the
> entityhub folder with that information.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
