On Thu, 27 Mar 2003 [EMAIL PROTECTED] wrote:

> Caveat:  I have not yet installed Lucene or begun to experiment with it.
> I have scanned the FAQ, but don't see anything that addresses this
> question.  Pardon the somewhat slow buildup to the question below, but I
> want to set the context.
>
> I am developing an application for 'text mining' adverse event reports in
> the pharmaceutical industry.  The querying will be driven by
> 'dictionaries', 'thesauri', 'taxonomies' or 'ontologies' (pick your
> favorite) of drug names, compounds, and medical conditions.  These thesauri
> are quite large.  For example, our drug name thesaurus is on the order of
> 60,000+ terms.

These terms are not equivalent, so it's not clear exactly what you mean
here.

> I was planning on using Verity to accomplish my first approach to shallow
> text mining since Verity is our corporate-wide search engine technology and
> it supports a number of relevant features (including 'topic sets' for
> representing the taxonomies).  However, Verity imposes restrictions on the
> size of topic sets that currently prohibit me from using it with our large
> taxonomies.  It is not obvious that they will be able to fix this problem
> in the timeframe I need.  Thus I am turning to other alternatives, and
> Lucene appears to be one.
>
> So given that context, my question is this:  Does anyone on this list have
> experience attempting to use very large queries (potentially thousands or
> tens of thousands of terms) in Lucene?  Does anyone have any knowledge of
> design or implementation details that would inhibit the use of such
> queries?  Does anyone have any idea of what the performance would be like
> in retrieving via such queries?

I do not have experience with such queries, so I can't speak to that
question directly.  However, I don't understand what the purpose of such a
query would be in the first place.  What are the documents that you are
indexing, and what information need are you trying to address?

Regards,

Joshua O'Madadhain

 [EMAIL PROTECTED] Per Obscurius...www.ics.uci.edu/~jmadden
  Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.



