On 27 Jan 2009, at 17:23, Neal Richter wrote:


Is it really necessary to use Solr for it? Things go much faster with the
Lucene low-level API, and much faster still if you load the classification
corpus into RAM.

Good points.  At the moment I'd rather have a daemon with a service
API, as well as the filtering/tokenization capabilities Solr has
built in.  I'll probably attempt to get the corpus's index in memory
via a large memory allocation.
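For the "corpus index in memory" approach, one common route in the Lucene of that era (2.x) is to copy the on-disk index into a RAMDirectory and search that. A rough sketch, assuming a Lucene 2.x-style API and a placeholder index path:

```java
import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

// Sketch: copy the classification corpus's on-disk index wholesale into
// the JVM heap, then search it there. "/path/to/corpus-index" is a
// placeholder, not a path from the thread.
public class RamCorpus {
    public static void main(String[] args) throws IOException {
        RAMDirectory ramDir =
            new RAMDirectory(FSDirectory.getDirectory("/path/to/corpus-index"));
        IndexSearcher searcher = new IndexSearcher(ramDir);
        // ... run classification queries against `searcher` here ...
        searcher.close();
    }
}
```

This trades startup time and heap for query latency; the JVM needs enough -Xmx headroom to hold the whole index.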

If it doesn't scale, then I'll either go to the Lucene API or implement a
custom inverted index via memcached.
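A custom inverted index of the kind mentioned above can be sketched in a few lines of plain Java: a map from term to a sorted posting list of document ids. This is only an illustrative sketch; a memcached-backed version would store each posting list under its term as the cache key, and that networking layer (plus any real tokenization) is omitted here.

```java
import java.util.*;

// Minimal in-memory inverted index: term -> sorted list of doc ids.
// Tokenization is naive whitespace splitting, purely for illustration.
public class InvertedIndex {
    private final Map<String, List<Integer>> postings =
        new HashMap<String, List<Integer>>();

    // Add a document's text; assumes docs are added in increasing id order.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            List<Integer> list = postings.get(term);
            if (list == null) {
                list = new ArrayList<Integer>();
                postings.put(term, list);
            }
            // One posting per (term, doc), kept sorted by insertion order.
            if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                list.add(docId);
            }
        }
    }

    // Return the posting list for a term (empty if the term is unknown).
    public List<Integer> lookup(String term) {
        List<Integer> list = postings.get(term.toLowerCase());
        return list == null ? Collections.<Integer>emptyList() : list;
    }
}
```

In a memcached deployment the `postings` map would be replaced by get/set calls keyed on the term, trading heap residency for shared access across daemons.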

One other note: at the moment it's not going to be a deeply
hierarchical taxonomy, much less a full indexing of an RDF/OWL
schema; there are some gotchas for that.

If your corpus is small enough, you may want to take a look at
lucene/contrib/instantiated. It was made for just this sort of thing.


    karl
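The contrib/instantiated suggestion can be sketched as follows: load an existing index into InstantiatedIndex's all-objects-in-RAM representation and search it through an InstantiatedIndexReader. This assumes the Lucene 2.x-era contrib API; the index path is a placeholder.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.instantiated.InstantiatedIndex;
import org.apache.lucene.store.instantiated.InstantiatedIndexReader;

// Sketch: materialize a small corpus index into contrib/instantiated's
// in-heap object graph. "/path/to/corpus-index" is a placeholder.
public class InstantiatedCorpus {
    public static void main(String[] args) throws IOException {
        IndexReader onDisk = IndexReader.open("/path/to/corpus-index");
        InstantiatedIndex ii = new InstantiatedIndex(onDisk);
        onDisk.close();  // the InstantiatedIndex is now self-contained
        IndexSearcher searcher =
            new IndexSearcher(new InstantiatedIndexReader(ii));
        // ... queries read postings directly from heap objects, with no
        //     decoding of the on-disk index format on the query path ...
        searcher.close();
    }
}
```

Unlike a RAMDirectory, which still decodes the standard index format on every read, InstantiatedIndex keeps terms and postings as plain Java objects, which is why it only pays off for small corpora.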

