On Fri, Dec 5, 2008 at 8:20 PM, Greg Holmberg <[EMAIL PROTECTED]> wrote:
> I seem to remember that IBM's CAS Consumer for indexing into their semantic
> search engine had to solve the same problem. I think it was configurable in
> a file, if I remember correctly.
>
> Perhaps one of the IBM folks could describe what was done there?
>
Yes, that's right. There's a separate file that contains the configuration
rules for the indexer. This is described in the UIMA documentation:

http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.application.integrating_text_analysis_and_search

However, the search engine that is used for this (available on IBM
alphaWorks) is able to index annotations over spans of text, which AFAIK
Lucene is not.

-Adam
