[ 
https://issues.apache.org/jira/browse/STANBOL-141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018321#comment-13018321
 ] 

Rupert Westenthaler commented on STANBOL-141:
---------------------------------------------

Files need to use the following naming syntax:
  <indexName>.solrindex[.<archiveFormat>]

indexName ... is used as the name of the Solr core created for the provided data
archiveFormat ... can be used to specify the format of the archive.
Supported archiveFormats:
 - "" (missing), ".zip", ".jar": A zip archive is assumed by default.
 - ".gz": It is assumed that the data are within a tar archive. One can also 
use ".tar.gz".
 - ".bz2": It is assumed that the data are within a tar archive. One can also 
use ".tar.bz2".
 - ".ref", ".properties": Index data reference. These are properties files that 
must contain at least a value for the "Index-Archive" key. The DataFileProvider 
service is used to locate the file by using the value of this key. All 
properties are passed to the DataFileProvider service as additional comments.
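
As an illustration of the ".ref" format described above, here is a minimal 
sketch of how such a properties file could be read. The class IndexReference, 
its methods and the sample values are hypothetical and not part of Stanbol; 
only the "Index-Archive" key is taken from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not Stanbol's API: reads an index data reference.
final class IndexReference {

    // The only key that the description above names as required.
    static final String INDEX_ARCHIVE_KEY = "Index-Archive";

    final String archiveName;            // value of the "Index-Archive" key
    final Map<String, String> comments;  // all properties, to be passed on as comments

    private IndexReference(String archiveName, Map<String, String> comments) {
        this.archiveName = archiveName;
        this.comments = comments;
    }

    // Parses the content of a ".ref" / ".properties" file given as a string.
    static IndexReference parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String archive = props.getProperty(INDEX_ARCHIVE_KEY);
        if (archive == null || archive.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "Missing required key '" + INDEX_ARCHIVE_KEY + "'");
        }
        Map<String, String> comments = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            comments.put(name, props.getProperty(name));
        }
        return new IndexReference(archive.trim(), comments);
    }
}
```

For example, a file containing the two lines 
"Index-Archive=geonames.solrindex.zip" and "Name=GeoNames index" would yield 
the archive name "geonames.solrindex.zip" and two comment entries. The archive 
name and the "Name" key are made-up sample values.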

Examples of possible Solr Archive file names
  dbpedia.solrindex
  geonames.solrindex.ref
  customers.solrindex.tar.gz
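
The naming syntax and the list of supported formats can be sketched in code as 
follows. This is only an illustration under the rules stated above; the class 
SolrIndexFileName and the enum ArchiveFormat are hypothetical names and not 
the actual Stanbol implementation.

```java
import java.util.Optional;

// Hypothetical helper, not Stanbol's API: splits
// <indexName>.solrindex[.<archiveFormat>] into its parts.
final class SolrIndexFileName {

    enum ArchiveFormat { ZIP, TAR_GZ, TAR_BZ2, REFERENCE }

    private static final String MARKER = ".solrindex";

    final String indexName;      // used as the name of the Solr core
    final ArchiveFormat format;  // derived from the optional suffix

    private SolrIndexFileName(String indexName, ArchiveFormat format) {
        this.indexName = indexName;
        this.format = format;
    }

    // Returns an empty Optional if the name does not follow the syntax.
    static Optional<SolrIndexFileName> parse(String fileName) {
        int marker = fileName.indexOf(MARKER);
        if (marker < 1) {
            return Optional.empty(); // no marker, or empty index name
        }
        String suffix = fileName.substring(marker + MARKER.length());
        ArchiveFormat format = formatOf(suffix);
        if (format == null) {
            return Optional.empty(); // unsupported archive format
        }
        return Optional.of(
            new SolrIndexFileName(fileName.substring(0, marker), format));
    }

    private static ArchiveFormat formatOf(String suffix) {
        switch (suffix) {
            case "":          // missing format: zip is the default
            case ".zip":
            case ".jar":
                return ArchiveFormat.ZIP;
            case ".gz":
            case ".tar.gz":
                return ArchiveFormat.TAR_GZ;
            case ".bz2":
            case ".tar.bz2":
                return ArchiveFormat.TAR_BZ2;
            case ".ref":
            case ".properties":
                return ArchiveFormat.REFERENCE;
            default:
                return null;
        }
    }
}
```

With this sketch, "customers.solrindex.tar.gz" is split into the index name 
"customers" and the format TAR_GZ, while a name such as "customers.txt" is 
rejected.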

Notes 
 - IndexArchives are typically installed together with the SolrYard, Cache and 
ReferencedSite configuration. Currently the preferred way to do this is by 
using special bundles as described by STANBOL-140.

 - The Sling Installer Framework requires a copy of each installed file 
within a private folder (usually "${sling-home}/installer"). This copy is used 
to check for changes in case the installed file is updated. Managing a copy of 
a big Solr index (possibly several GByte in size) is not ideal. Therefore it is 
strongly recommended to use index data reference (".solrindex.ref") files in 
such cases. The DataFileProviderService does not have this limitation. In 
addition, the file containing the index data can be deleted after the 
successful initialization of the index.

> Support for installing Solr Indexes from Archives
> -------------------------------------------------
>
>                 Key: STANBOL-141
>                 URL: https://issues.apache.org/jira/browse/STANBOL-141
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Entity Hub
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> This assumes that precomputed indexes are provided as archives. The Solr 
> index provided by such an archive needs to be copied and configured as a core 
> of the EmbeddedSolrServer managed by the SolrYard.
> This functionality is a prerequisite for STANBOL-92 and STANBOL-93.
> The first implementation will be based on the Apache Sling Installer. This 
> has the advantage that different sources (default configuration of the 
> launcher, a directory, a JCR) are supported out of the box. STANBOL-140 will 
> also allow archives with additional configurations to be used as a source.
> Additional information can be found in this email thread [2] 
> [2] http://markmail.org/thread/w2s7h3rcnkc2zlzm 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira