Hi folks,

Just a little advertising message for those who are interested in semantic expansions:
http://kant.lingway.com/DemoUN is a demo of a multilingual IR system based on Lucene.
Please take a look at it - feedback is welcome!

Julien

----- Original Message -----
From: "Peter Carlson" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Wednesday, May 15, 2002 7:06 AM
Subject: Re: Adding a TermExpansionQuery

> Hi Eric,
>
> Thanks for the feedback. My intention was to abstract the source, but one of
> my questions was, does Lucene set a configuration file which will use this
> "Thesaurus" query, or will that have to be setup manually by the developer.
>
> Currently, Lucene does not provide a configuration file.
>
> As far as if the information is in the index directory. I was thinking this
> might be a nice place for this information to exist, then it doesn't add any
> other overhead to the system (i.e. No configuration file) and might be
> easier to support multiple sources since the index has already been
> abstracted. If you wanted to share the "Thesaurus" across many different
> indices you could "copy" or "merge" that index component into the data
> source. This could even be part of the build process for a file system.
>
> --Peter
>
> On 5/15/02 6:45 AM, "Eric D. Friedman" <[EMAIL PROTECTED]> wrote:
>
> > Whichever storage mechanism you choose, you should be sure to abstract its
> > interface so that people can make other choices. With that out of the way,
> > it doesn't matter too much whether you pick a properties file or an XML
> > file.
> >
> > That said, I wouldn't expect to find this data stored in the index
> > directory, since it's not part of the index and since users may want to
> > share the data across several indices. I would also lean toward the
> > XML file (for a file solution, that is -- an RDBMS should be supported
> > too), since that lends itself more naturally to describing one-to-many
> > relations than a properties file does.
> >
> > Personal opinion: "Thesaurus" is a more descriptive term than
> > "TermExpansion." To me, term expansion suggests some kind of text
> > globbing, whereas a thesaurus is a reference (a "lookup table") that
> > provides *semantic* expansions of the kind you describe. Oracle's
> > intermedia indexing engine has thesaurus features similar to what you
> > describe and calls them by that name.
> >
>
> --
> To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
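
For anyone following the thread, here is a rough sketch of what "abstracting the interface" of the thesaurus store could look like. It is only an illustration under stated assumptions: `Thesaurus`, `InMemoryThesaurus` and `TermExpander` are hypothetical names made up for this example, not Lucene classes, and no Lucene API is used. A properties-file, XML or RDBMS-backed store would simply be another implementation of the same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface (not part of Lucene): the only thing a query
// needs to know about the thesaurus, regardless of where it is stored.
interface Thesaurus {
    /** Returns the terms related to the given term, or an empty list if none are known. */
    List<String> lookup(String term);
}

// One possible backing store: a plain in-memory map. An XML file or an
// RDBMS table could sit behind the same interface.
final class InMemoryThesaurus implements Thesaurus {
    // One-to-many relation: a term maps to a list of related terms.
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    void add(String term, String... related) {
        entries.computeIfAbsent(term, k -> new ArrayList<>())
               .addAll(Arrays.asList(related));
    }

    @Override
    public List<String> lookup(String term) {
        List<String> related = entries.getOrDefault(term, Collections.<String>emptyList());
        return Collections.unmodifiableList(related);
    }
}

// Hypothetical helper: turns one query term into the list of terms to search for.
final class TermExpander {
    private final Thesaurus thesaurus;

    TermExpander(Thesaurus thesaurus) {
        this.thesaurus = thesaurus;
    }

    /** Returns the original term first, then its thesaurus entries, without duplicates. */
    List<String> expand(String term) {
        List<String> expanded = new ArrayList<>();
        expanded.add(term);
        for (String related : thesaurus.lookup(term)) {
            if (!expanded.contains(related)) {
                expanded.add(related);
            }
        }
        return expanded;
    }
}
```

Because `TermExpander` depends only on the `Thesaurus` interface, the same thesaurus instance could be shared across several indices, which is the concern Eric raises about keeping the data in the index directory.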
