On Thu, 27 Mar 2003 [EMAIL PROTECTED] wrote:

> Caveat: I have not yet installed Lucene or begun to experiment with
> it. I have scanned the FAQ, but don't see anything that addresses this
> question. Pardon the somewhat slow buildup to the question below, but I
> want to set the context.
>
> I am developing an application for 'text mining' adverse event reports in
> the pharmaceutical industry. The querying will be driven by
> 'dictionaries', 'thesauri', 'taxonomies' or 'ontologies' (pick your
> favorite) of drug names, compounds, and medical conditions. These thesauri
> are quite large. For example, our drug name thesaurus is on the order of
> 60,000+ terms.

These terms are not equivalent, so it's not clear exactly what you mean
here.

> I was planning on using Verity to accomplish my first approach to shallow
> text mining since Verity is our corporate-wide search engine technology and
> it supports a number of relevant features (including 'topic sets' for
> representing the taxonomies). However, Verity imposes restrictions on the
> size of topic sets that currently prohibit me from using it with our large
> taxonomies. It is not obvious that they will be able to fix this problem
> in the timeframe I need. Thus I am turning to other alternatives, and
> Lucene appears to be one.
>
> So given that context, my question is this: Does anyone on this list have
> experience attempting to use very large queries (potentially thousands or
> tens of thousands of terms) in Lucene? Does anyone have any knowledge of
> design or implementation details that would inhibit the use of such
> queries? Does anyone have any idea of what the performance would be like
> in retrieving via such queries?

I do not have experience with such queries, so I can't speak to that
question directly. However, I don't understand what the purpose of such a
query would be in the first place. What are the documents that you are
indexing, and what information need are you trying to address?

Regards,

Joshua O'Madadhain

[EMAIL PROTECTED] Per Obscurius...www.ics.uci.edu/~jmadden
Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.
