Re: [lucy-user] Dictionary based NER with Lucy

Aleksandar Radovanovic Fri, 12 Oct 2012 06:28:01 -0700

Thank you Nick. Could you possibly give me some more specific guidelines?

At the moment, all indexed words are "flat", with no semantics, which is
great for general purposes. However, if one focuses on, say, biomedical
literature, one would like to distinguish which words represent gene
names, drug names, etc. A user would then be able to compose a search
like "[drug_dictionary_ID] AND headache" to get documents that mention
any drug name together with headache. Also, one could group documents by
dictionary, e.g. documents related to genetics (high frequency of
gene/protein names), to diseases (mostly disease names), etc.
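
To make the grouping idea concrete, here is a rough sketch in plain
Python (it does not use Lucy at all). The dictionary contents, the IDs
"GENE", "DRUG" and "DISEASE", and the function names are all made up for
illustration, and it only handles single-word names:

```python
import re
from collections import Counter

# Hypothetical toy dictionaries; real ones would come from curated
# biomedical resources and would also contain multi-word names.
DICTIONARIES = {
    "GENE": {"brca1", "tp53"},
    "DRUG": {"aspirin", "ibuprofen"},
    "DISEASE": {"migraine", "headache"},
}


def dictionary_counts(text):
    """Count how many tokens of `text` fall into each dictionary."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for token in tokens:
        for dict_id, terms in DICTIONARIES.items():
            if token in terms:
                counts[dict_id] += 1
    return counts


def dominant_dictionary(text):
    """Return the dictionary ID with the most hits, or None if no hits."""
    counts = dictionary_counts(text)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Documents could then be grouped by the value of dominant_dictionary(),
or by the full count vector if a document belongs to several groups.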

This could open up possibilities for applying machine learning, pattern
analysis or automatic hypothesis generation using not only words but
their semantics as well, all without relying on unreliable "natural
language processing" algorithms.

Any ideas?

Alex

On 10/12/12 3:01 PM, Nick Wellnhofer wrote:
>
> If I understand your problem description correctly, you could simply
> create another full-text field containing the dictionary IDs related
> to a document separated by whitespace. Then you can search only the
> dictionary field.
>
> Nick
>
>
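
As far as I understand the suggestion above, it would amount to
something like the following sketch. This is plain Python standing in
for the index, not Lucy's API; the field names "content" and
"dictionaries", the IDs "DICT_DRUG" and "DICT_GENE", and the helper
functions are assumptions made for illustration only:

```python
import re

# Hypothetical dictionaries keyed by the ID that gets indexed.
DICTIONARIES = {
    "DICT_DRUG": {"aspirin", "ibuprofen"},
    "DICT_GENE": {"brca1", "tp53"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def dictionary_field(content):
    """Build the whitespace-separated dictionary-ID field for a document."""
    tokens = set(tokenize(content))
    ids = sorted(d for d, terms in DICTIONARIES.items() if tokens & terms)
    return " ".join(ids)


def make_doc(content):
    """Create a document with the original text plus the extra field."""
    return {"content": content, "dictionaries": dictionary_field(content)}


def search(docs, dict_id, word):
    """Emulate '[dict_id] AND word': match the ID in the dictionary
    field and the word in the content field."""
    return [
        doc for doc in docs
        if dict_id in doc["dictionaries"].split()
        and word.lower() in tokenize(doc["content"])
    ]
```

With documents built by make_doc(), search(docs, "DICT_DRUG",
"headache") returns only those that mention both a drug name and
"headache". In a real index the two fields would be declared in the
schema, and the query would be restricted to the dictionary field for
the ID part.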
