[
https://issues.apache.org/jira/browse/STANBOL-141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018321#comment-13018321
]
Rupert Westenthaler commented on STANBOL-141:
---------------------------------------------
File names must use the following syntax:
<indexName>.solrindex[.<archiveFormat>]
indexName ... is used as the name of the Solr core that holds the provided data
archiveFormat ... can be used to specify the format of the archive.
Supported archiveFormats
- "" (missing), ".zip", ".jar": A zip archive is assumed; this is the default.
- ".gz": It is assumed that the data are within a tar archive. One can also use
".tar.gz".
- ".bz2": It is assumed that the data are within a tar archive. One can also
use ".tar.bz2".
- ".ref", ".properties": Index data reference. These are properties files that
must contain at least a value for the "Index-Archive" key. The DataFileProvider
service is used to locate the file named by the value of this key. All
properties are passed to the DataFileProviderService as additional comments.
Examples of possible Solr Archive file names
dbpedia.solrindex
geonames.solrindex.ref
customers.solrindex.tar.gz
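The naming rules above can be illustrated with a small sketch. This is not the
Stanbol implementation, only an illustration of the convention; the function
name, the result type and the labels used for the archive kinds are all
assumptions made for this example.

```python
from typing import NamedTuple, Optional

MARKER = ".solrindex"

# Suffix after ".solrindex" -> kind of archive, following the list of
# supported archiveFormats. The kind labels are hypothetical names.
ARCHIVE_KINDS = {
    "": "zip",            # missing format: zip is the default
    ".zip": "zip",
    ".jar": "zip",
    ".gz": "tar+gzip",
    ".tar.gz": "tar+gzip",
    ".bz2": "tar+bzip2",
    ".tar.bz2": "tar+bzip2",
    ".ref": "reference",
    ".properties": "reference",
}


class IndexArchiveName(NamedTuple):
    index_name: str  # becomes the name of the Solr core
    kind: str        # one of the values of ARCHIVE_KINDS


def parse_index_archive_name(file_name: str) -> Optional[IndexArchiveName]:
    """Split '<indexName>.solrindex[.<archiveFormat>]' into its parts.

    Returns None if the name does not follow the convention.
    """
    pos = file_name.rfind(MARKER)
    if pos <= 0:
        # marker missing, or the index name would be empty
        return None
    suffix = file_name[pos + len(MARKER):].lower()
    kind = ARCHIVE_KINDS.get(suffix)
    if kind is None:
        return None
    return IndexArchiveName(file_name[:pos], kind)
```

With this sketch, parse_index_archive_name("customers.solrindex.tar.gz") yields
the index name "customers" and the kind "tar+gzip", while a name without the
".solrindex" marker yields None.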
Notes
- IndexArchives are typically installed together with the SolrYard, Cache and
ReferencedSite configuration. Currently the preferred way to do this is by
using special bundles as described by STANBOL-140.
- The Sling Installer Framework requires a copy of each installed file within a
private folder (usually "${sling-home}/installer"). This copy is used to check
for changes in case the installed file is updated. Managing a copy of a big
Solr index (possibly several GByte in size) is not ideal. Therefore it is
strongly recommended to use index data references (".solrindex.ref" files) in
such cases. The DataFileProviderService does not have this limitation. In
addition, the file containing the index data can be deleted after the
successful initialization of the index.
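To illustrate what an index data reference might look like, here is a minimal
sketch that reads the content of a ".solrindex.ref" file and checks for the
required "Index-Archive" key. The parser is deliberately simplified (no line
continuations or escape sequences as supported by Java properties files), the
function names are made up for this example, and the "Description" key is a
hypothetical additional property.

```python
REQUIRED_KEY = "Index-Archive"  # key name as given in the comment above


def parse_properties(text: str) -> dict:
    """Very small subset of the Java properties format: 'key=value' or
    'key: value' lines, with '#' and '!' starting comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        # split at the first '=' or ':' found in the line
        positions = [i for i in (line.find("="), line.find(":")) if i >= 0]
        if not positions:
            props[line] = ""
            continue
        idx = min(positions)
        props[line[:idx].strip()] = line[idx + 1:].strip()
    return props


def read_index_reference(text: str) -> dict:
    """Return all properties of an index reference, or raise ValueError
    if the required 'Index-Archive' key is missing or empty."""
    props = parse_properties(text)
    if not props.get(REQUIRED_KEY):
        raise ValueError("index reference must define '%s'" % REQUIRED_KEY)
    return props


# Hypothetical content of a file named 'geonames.solrindex.ref'
EXAMPLE_REF = """\
# reference to an index archive provided separately
Index-Archive=geonames.solrindex.zip
Description: GeoNames places
"""
```

Calling read_index_reference(EXAMPLE_REF) returns both properties; the value of
"Index-Archive" is the file name that would then have to be located, and the
remaining properties correspond to the additional information mentioned above.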
> Support for installing Solr Indexes from Archives
> -------------------------------------------------
>
> Key: STANBOL-141
> URL: https://issues.apache.org/jira/browse/STANBOL-141
> Project: Stanbol
> Issue Type: New Feature
> Components: Entity Hub
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
>
> This assumes that precomputed indexes are provided as archives. The Solr
> index provided by such an archive needs to be copied and configured as a core
> of the EmbeddedSolrServer managed by the SolrYard.
> This functionality is a prerequisite for STANBOL-92 and STANBOL-93.
> The first implementation will be based on the Apache Sling Installer. This has
> the advantage that different sources (default configuration of the launcher,
> a directory, a JCR) are provided out of the box. STANBOL-140 will also allow
> archives with additional configurations to be used as a source.
> Additional information can be found in this email thread [2]
> [2] http://markmail.org/thread/w2s7h3rcnkc2zlzm