Hi,

You can use the OpenNLP maximum entropy library to build your own named-entity extraction.
I used it in one of the projects I did with Solr.

It is easy to integrate most NLP libraries with Solr. In our case, though, the named-entity extraction was embedded in our crawler, which populated a field called entities in the database; we then ingested that into Solr as just another field.
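
A rough sketch of that ingest step, assuming the crawler rows have already been read from the database. The row layout, the sample values, and the "id" and "text" field names are made up for illustration; only the entities field name comes from the setup described above. It builds a Solr XML <add> message in which entities is sent as a multi-valued field:

```python
import xml.etree.ElementTree as ET


def build_solr_add(rows):
    """Build a Solr XML <add> message from crawler rows.

    Each row is a dict with 'id', 'text' and 'entities' (a list of strings).
    This row layout is hypothetical, not the schema of the original project.
    """
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name in ("id", "text"):
            field = ET.SubElement(doc, "field", name=name)
            field.text = row[name]
        # Solr XML expresses a multi-valued field as one <field> element per value.
        for entity in row["entities"]:
            field = ET.SubElement(doc, "field", name="entities")
            field.text = entity
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample row standing in for a database record.
rows = [
    {
        "id": "doc-1",
        "text": "GATE ships with ANNIE for named entity recognition.",
        "entities": ["GATE", "ANNIE"],
    }
]
payload = build_solr_add(rows)
print(payload)
```

The resulting string would be posted to the Solr update handler, and the entities field would need to be declared as multi-valued in the schema.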

--Thanks and Regards
Vaijanath N. Rao

Julien Nioche wrote:
Hi,

Open Source NLP platforms like GATE (http://gate.ac.uk) or Apache UIMA are
typically used for these types of tasks. GATE in particular comes with an
application called ANNIE which does Named Entity Recognition. OpenCalais
does that as well and should be easy to embed, but unlike UIMA- or GATE-based
applications it can't be tuned to do more specific things.

Depending on the architecture you have in mind, it could be worth
investigating Nutch and adding the NER as a custom plugin; since NLP is often
a CPU-intensive task, you could leverage the scalability of Hadoop in Nutch.
There is a patch which allows delegating the indexing to Solr. As someone
else already said, these named entities could then be used as facets.
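
To illustrate the faceting idea, here is a small sketch that builds a Solr facet query URL. The base URL and the field name "entities" are assumptions for the example; facet and facet.field are standard Solr request parameters:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed location of a local Solr instance; adjust to your own setup.
BASE_URL = "http://localhost:8983/solr/select"


def facet_query_url(field, query="*:*"):
    """Return a URL asking Solr for facet counts on the given field."""
    params = {
        "q": query,            # match all documents by default
        "rows": 0,             # only facet counts are wanted, not documents
        "facet": "true",       # turn faceting on
        "facet.field": field,  # field holding the named entities (assumed name)
    }
    return BASE_URL + "?" + urlencode(params)


url = facet_query_url("entities")
print(url)
```

The response to such a request lists each distinct entity value together with the number of matching documents.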

HTH

Julien
