That is correct. Solr is a search engine, not a text analysis engine.
There are a few open source text analysis systems: Weka, OpenNLP,
UIMA.

Someone is working on integrating UIMA with Solr:
https://issues.apache.org/jira/browse/SOLR-2129

But you should generally assume you will have a batch processing pass
over the data before indexing it.
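
As a rough illustration of what such a pre-indexing pass might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the tiny TAXONOMY dictionary, the "taxonomy" field name, and the depth-prefixed token encoding are hypothetical, and plain keyword lookup stands in for whatever real classifier (UIMA, OpenNLP, etc.) you end up using. It does not talk to Solr; it only shows how documents could be tagged before they are sent to the indexer.

```python
import re

# Hypothetical taxonomy: trigger term -> path from the root to the node.
# A real taxonomy would be loaded from wherever your organization keeps it.
TAXONOMY = {
    "solr": ["Software", "Search", "Solr"],
    "lucene": ["Software", "Search", "Lucene"],
    "weka": ["Software", "Machine Learning", "Weka"],
}


def facet_tokens(path):
    """Encode a taxonomy path as one depth-prefixed token per level."""
    return [
        f"{depth}/{'/'.join(path[:depth + 1])}"
        for depth in range(len(path))
    ]


def tag_document(doc):
    """Return a copy of doc with a multi-valued 'taxonomy' field added.

    The field name is an assumption; it would need a matching
    multi-valued string field in your own schema.
    """
    words = set(re.findall(r"[a-z0-9]+", doc["text"].lower()))
    tokens = set()
    for term, path in TAXONOMY.items():
        if term in words:
            tokens.update(facet_tokens(path))
    tagged = dict(doc)
    tagged["taxonomy"] = sorted(tokens)
    return tagged


# Batch pass: tag every document before handing the batch to the indexer.
docs = [
    {"id": "1", "text": "Solr is built on Lucene."},
    {"id": "2", "text": "Nothing relevant here."},
]
batch = [tag_document(d) for d in docs]
```

If the tokens are indexed as a plain string field, faceting on that field with a facet.prefix such as "1/Software/" should return only the second-level nodes under Software, which is one common way to approximate hierarchical facets. Treat that as a design suggestion rather than a tested recipe.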

On Mon, Dec 6, 2010 at 12:04 PM, webdev1977 <webdev1...@gmail.com> wrote:
>
> Thanks for the quick response!
>
> I was thinking more about the idea of having both structured and unstructured
> data coming into a system to be indexed/searched.  I would like these
> documents to be processed by some sort of entity/keyword/semantic
> processing.  I have a well defined taxonomy for my organization (it is quite
> large) and at the moment we use RetrievalWare to give keyword/classification
> suggestions.  This does NOT work well though, and RetrievalWare is pretty
> much useless to us.
>
> I want a way to do this process either at index time or search time.  All
> documents should be processed against this taxonomy.  I do not want the user
> to be able to nominate keywords, it must happen automatically.   I am
> assuming it is only natural for these keywords/taxonomy entities to show up
> as hierarchical facets?
>
> From what I can tell, there is no way to tell Solr.. here is my taxonomy..
> classify my documents and give me back facets and facet counts..
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Taxonomy-and-Faceting-tp2028442p2029636.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goks...@gmail.com
