Yes and yes. Users range from Information Professionals to "naive" end
users.
If there's a string like "N-(t-Butyl)-N-(3,5-dinitrobenzoyl)-nitroxyl", users
can be expected to search for "dinitro", "3,5-dinitro", "nitrobenz", etc.
Each of the phrases you list here could be extracted by a domain-smart tokenizer, I would think.
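For what it's worth, here is a rough sketch of one generic alternative to a domain-smart tokenizer: a character trigram index with a final substring check. It is plain Python rather than Lucene code, and the names NgramIndex, ngrams, add, and search are made up for illustration. The second compound in the usage lines is just a filler entry to show that non-matching names are excluded.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return the set of lowercase character n-grams in text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class NgramIndex:
    """Hypothetical in-memory index for partial (substring) matching."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> set of document ids
        self.docs = []                    # document id -> original string

    def add(self, name):
        doc_id = len(self.docs)
        self.docs.append(name)
        for gram in ngrams(name, self.n):
            self.postings[gram].add(doc_id)
        return doc_id

    def search(self, fragment):
        grams = ngrams(fragment, self.n)
        if grams:
            # Candidates must contain every n-gram of the fragment.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            # Fragment shorter than n: fall back to scanning everything.
            candidates = set(range(len(self.docs)))
        # Sharing all n-grams does not guarantee a contiguous match,
        # so verify each candidate with a real substring test.
        needle = fragment.lower()
        return [self.docs[i] for i in sorted(candidates)
                if needle in self.docs[i].lower()]


index = NgramIndex()
index.add("N-(t-Butyl)-N-(3,5-dinitrobenzoyl)-nitroxyl")
index.add("benzoic acid")
hits = index.search("nitrobenz")
```

The same idea should carry over to amino acid or DNA sequences, since it treats the text as a plain character string. The trade-off is index size: every string contributes one posting per distinct trigram.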
There are also sequences of amino acids or DNA that users might want to match partially.
I find your use-cases extremely interesting. Please do keep us posted if you come up with something clever in this respect.
That's what I have been pondering all morning, and I'm going to give
it a try. As far as I have tested, Lucene carries out these queries far
quicker than your average relational DB text retrieval tool :-)
Cool. Lucene rocks!
Erik
