[ 
https://issues.apache.org/jira/browse/STANBOL-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rupert Westenthaler updated STANBOL-1092:
-----------------------------------------

    Description: 
Right now the default configuration of the SolrYard is

* commitWithin is deactivated - meaning that every change does trigger a commit
* Auto commit is deactivated
* Transaction Log is enabled
* Soft Commit is deactivated

This has several potential issues

* The default configuration is very slow for updates - as it makes a hard 
commit on each call that changes the index (e.g. loading Entities to a 
ManagedSite)
* If commitWithin is enabled there will be no hard commits, as commitWithin was 
changed to soft commits with Solr 4.0. Because of that, changes to the index 
are never persisted and the transaction log will grow forever.


With this issue the default configuration will be changed as follows:

* commitWithin will be enabled by default (change in the DEFAULT value for the 
configuration property). The (default) duration will be kept at 10sec (a 
fallback in case users remove the soft auto commit from the solrconfig.xml)
    * the Entityhub will still use immediate commits on every change (keep the 
old default). The default configuration of the Entityhub will need to be 
adapted accordingly.
* (hard) auto commit will be set to 1min. This ensures that data are written to 
disk at least every minute and transaction logs will not grow indefinitely. 
* soft auto commit will be set to 1sec. This means that an added/updated Entity 
will be available to searches at the latest 1sec after adding it
* transaction log will be activated, as this is required by the used 
solr.NRTCachingDirectoryFactory directoryFactory.
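
As a rough sketch, the settings listed above could look like this in the 
updateHandler section of a core's solrconfig.xml. The element names are the 
standard Solr 4 ones; the openSearcher=false setting and the 
${solr.ulog.dir:} placeholder are assumptions for illustration and are not 
taken from this issue. commitWithin is not shown because it is a SolrYard 
configuration property rather than part of solrconfig.xml.

```xml
<!-- Illustrative excerpt only; not the actual Stanbol default config -->
<updateHandler class="solr.DirectUpdateHandler2">

  <!-- transaction log activated -->
  <updateLog>
    <!-- assumption: log directory left to the Solr default -->
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- (hard) auto commit: write changes to disk at least every minute -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <!-- assumption: visibility is handled by the soft commit below -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- soft auto commit: changes become searchable within 1sec -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>

</updateHandler>
```

Both maxTime values are in milliseconds.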


  was:
Right now the default configuration of the SolrYard is

* commitWithin is deactivated - meaning that every change does trigger a commit
* Auto commit is deactivated
* Transaction Log is enabled
* Soft Commit is deactivated

This has several potential issues

* The default configuration is very slow for updates (e.g. loading Entities to 
a ManagedSite)
* If commitWithin is enabled there will be no commits, which will cause the 
Transaction Log to grow up to the size of the index


With this issue the default configuration will be changed as follows:

* commitWithin will be enabled by default (change in the DEFAULT value for the 
configuration property). The (default) duration will be kept at 10sec
* (hard) auto commit will be set to 20sec. This ensures that data are written 
to disk at least every 20sec
* soft auto commit will be set to 1sec. This means that an added/updated Entity 
will be available to searches at the latest 1sec after adding it
* transaction log will be deactivated. Users that do need real time search will 
need to configure this manually. Please also note the Solr Documentation for how 
to configure this feature properly (e.g. it is recommended to have the 
transaction log on a different disk than the index)


    
> Improve UpdateHandler configuration of the SolrYard and enable commitWithin 
> by default
> --------------------------------------------------------------------------------------
>
>                 Key: STANBOL-1092
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1092
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Entityhub
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Right now the default configuration of the SolrYard is
> * commitWithin is deactivated - meaning that every change does trigger a 
> commit
> * Auto commit is deactivated
> * Transaction Log is enabled
> * Soft Commit is deactivated
> This has several potential issues
> * The default configuration is very slow for updates - as it makes a hard 
> commit on each call that changes the index (e.g. loading Entities to a 
> ManagedSite)
> * If commitWithin is enabled there will be no hard commits, as commitWithin 
> was changed to soft commits with Solr 4.0. Because of that, changes to the 
> index are never persisted and the transaction log will grow forever.
> With this issue the default configuration will be changed as follows:
> * commitWithin will be enabled by default (change in the DEFAULT value for the 
> configuration property). The (default) duration will be kept at 10sec (a 
> fallback in case users remove the soft auto commit from the solrconfig.xml)
>     * the Entityhub will still use immediate commits on every change (keep 
> the old default). The default configuration of the Entityhub will need to be 
> adapted accordingly.
> * (hard) auto commit will be set to 1min. This ensures that data are written 
> to disk at least every minute and transaction logs will not grow 
> indefinitely. 
> * soft auto commit will be set to 1sec. This means that an added/updated 
> Entity will be available to searches at the latest 1sec after adding it
> * transaction log will be activated, as this is required by the used 
> solr.NRTCachingDirectoryFactory directoryFactory.

