Thank you, Nick. Could you give me some more specific guidelines? At the moment, all indexed words are "flat", with no semantics, which is great for general purposes. However, if one focuses on, say, biomedical literature, one would like to distinguish which words represent gene names, drug names, etc. A user would be able to compose a search like "[drug_dictionary_ID] AND headache" to retrieve documents that mention drug names in connection with headache. Also, one could group documents by dictionary, e.g. documents related to genetics (high frequency of gene/protein names), to diseases (mostly disease names), etc.
This could open up possibilities for applying machine learning, pattern analysis, or automatic hypothesis generation using not only words but their semantics as well, all without relying on unreliable "natural language processing" algorithms.

Any ideas?

Alex

On 10/12/12 3:01 PM, Nick Wellnhofer wrote:
>
> If I understand your problem description correctly, you could simply
> create another full-text field containing the dictionary IDs related
> to a document separated by whitespace. Then you can search only the
> dictionary field.
>
> Nick
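
To make the suggestion above concrete, here is a minimal sketch of the two-field idea in plain Python. It does not use the library's API; the dictionary contents, the field names "content" and "dict", and the class TwoFieldIndex are all hypothetical and only illustrate how a second field of dictionary IDs could be combined with an ordinary word query.

```python
# Hypothetical dictionaries: dictionary ID -> set of lowercase terms.
DICTIONARIES = {
    "DRUG": {"aspirin", "ibuprofen"},
    "GENE": {"brca1", "tp53"},
}

PUNCTUATION = ".,;:()"


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    tokens = (raw.strip(PUNCTUATION).lower() for raw in text.split())
    return [t for t in tokens if t]


def dictionary_ids(tokens):
    """Return sorted IDs of every dictionary with a term among the tokens."""
    seen = set(tokens)
    return sorted(d for d, terms in DICTIONARIES.items() if terms & seen)


class TwoFieldIndex:
    """Toy inverted index with a 'content' field and a 'dict' field."""

    def __init__(self):
        self.postings = {"content": {}, "dict": {}}

    def add(self, doc_id, text):
        tokens = tokenize(text)
        # Ordinary full-text field: one posting per word.
        for token in tokens:
            self.postings["content"].setdefault(token, set()).add(doc_id)
        # Second field: the dictionary IDs that apply to this document.
        for dict_id in dictionary_ids(tokens):
            self.postings["dict"].setdefault(dict_id, set()).add(doc_id)

    def search_and(self, dict_id, word):
        """Documents matching dict:<dict_id> AND content:<word>."""
        in_dict = self.postings["dict"].get(dict_id, set())
        has_word = self.postings["content"].get(word.lower(), set())
        return in_dict & has_word


index = TwoFieldIndex()
index.add(1, "Aspirin relieved the headache.")
index.add(2, "TP53 mutations and headache were unrelated.")
index.add(3, "Ibuprofen reduced fever.")

# Roughly the "[drug_dictionary_ID] AND headache" query from the message.
print(index.search_and("DRUG", "headache"))
```

In this sketch the dictionary field only records that a document mentions some term from a dictionary, not which term or how often. Grouping documents by the frequency of gene or disease names would need per-document counts in addition to the IDs.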
