[EMAIL PROTECTED] writes:
> 
> Viewed as an information retrieval problem (not the best way to view it,
> but this is just an initial approach), one could then (1) create a taxonomy
> of drugs and a taxonomy of conditions, and (2) implement a concept-oriented
> (taxonomy-oriented) search of the corpus for something like:
>                     {beta_blocker} AND {cardiac_condition}
> where '{beat_blocker}' expands via the taxonomy to the set of terms (words,
> sequences of words, etc.) that "fall under" that "concept" in the drug
> taxonomy and similarly for '{cardiac_condition}' under the condition
> taxonomy.
> 
I'm not sure that I understood everything, but there's one question coming
into my mind:
Why can't you just invert your taxonomy, that is expand each drug/condition 
in the indexed documents to the list of the drug/condition classes it belongs 
to? Instead of expanding the query.

> Verity didn't have these scaleability problems.  The one suggestion I have
> seen so far in the responses that seems relevant to the problem is the
> suggestion to transform the taxonomy (taxonomies) into an Analyzer.

An analyzer in lucene is just the instance used in preparing a document
for indexing and a query term for searching. It does tokenisation of input
and any additional preprocessing you implement.
So your specific analyzer might take a drug as input and output all
relevant terms. That's how you could implement your aproach in lucene.
I cannot comment on combining thousands of terms in a query though.

Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to