[ 
https://issues.apache.org/jira/browse/STANBOL-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rupert Westenthaler updated STANBOL-1089:
-----------------------------------------

    Summary: Provide Topic Engine SolrConfiguration that uses n-grams  (was: 
Provide Topic Engine SolrConfiguration that do us n-grams)
    
> Provide Topic Engine SolrConfiguration that uses n-grams
> --------------------------------------------------------
>
>                 Key: STANBOL-1089
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1089
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancement Engines
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Now that the Topic Classification Engine supports different SolrCore 
> configurations, we should provide a configuration that uses n-grams for 
> topic classification.
> While this will not scale to very large classification schemes, it should 
> provide improvements for small and medium-sized models.
> Indexing of n-grams will be based on the Solr ShingleFilterFactory [1].
> The SolrCore configuration will be provided under the name 
> 'shingle-topic-model.solrindex.zip' by the TopicClassificationEngine bundle 
> to the DataFileProvider. This means that users will need to set the 
> 'org.apache.stanbol.enhancer.engine.topic.solrCoreConfig' property of 
> the TopicClassificationEngine to this name. This property was added by 
> STANBOL-1087.
> [1] 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
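
To illustrate what word n-grams ("shingles") look like, here is a minimal Python sketch. It is not Solr or Stanbol code: the function name `shingles` and its parameters are hypothetical, chosen to loosely mirror the ShingleFilterFactory options minShingleSize, maxShingleSize and outputUnigrams (whose Solr defaults are 2, 2 and true). It only approximates the token order and ignores position increments and filler tokens.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True, sep=" "):
    """Return word n-grams (shingles) built from a list of tokens.

    Hypothetical helper for illustration only; the parameter names loosely
    mirror Solr's ShingleFilterFactory options but this is not its code.
    """
    out = []
    for i in range(len(tokens)):
        # Optionally keep the original single token at this position.
        if output_unigrams:
            out.append(tokens[i])
        # Emit every shingle of the allowed sizes that starts at position i.
        for n in range(min_size, max_size + 1):
            if i + n <= len(tokens):
                out.append(sep.join(tokens[i:i + n]))
    return out


if __name__ == "__main__":
    print(shingles("apache stanbol topic engine".split()))
```

With the defaults this yields each single token plus each pair of adjacent tokens, so multi-word phrases become indexable terms. That also suggests why the approach may not scale to very large classification schemes: the number of distinct indexed terms grows quickly with the shingle size.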

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
