-------------- Original message ----------------------
From: "Roberto Franchini" <[EMAIL PROTECTED]>
> So we need a very highly configurable component, able to map only
> certain declared features and applying the right analyzer and so on.
> Many ways are possible:
> - completely programmatic: the indexer is abstract and should be
>   extended to implement the right mapping for a specialized typeSystem
>   and pipeline
> - configurable: mapping rules are defined in a descriptor file; the
>   JENA component followed this way
> - mix of the two: some mapping is configured, others are implemented

I seem to remember that IBM's CAS Consumer for indexing into their semantic search engine had to solve the same problem. I think it was configured through a file. Perhaps one of the IBM folks could describe what was done there?

A separate question: what kinds of annotations can be indexed into Lucene? In other words, what functionality are we shooting for? For example, can I index named entities? In my case, named entities look like the attached UML class diagram. I would like to query for documents that contain certain entities or types of entities. For example, find documents that contain an entity with name=IBM, type=Company.

Greg Holmberg
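
P.S. To make the query above concrete, here is a rough sketch of the lookup I have in mind. It is plain Java, not Lucene or UIMA code: the class EntityIndexSketch and its methods add, findByEntity and findByType are hypothetical names used only for illustration, and the in-memory maps merely stand in for whatever the real index would do.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory index (hypothetical, not Lucene): maps entities to document ids. */
class EntityIndexSketch {
    // key "type|name" -> sorted ids of documents containing that exact entity
    private final Map<String, Set<String>> byEntity = new HashMap<>();
    // key "type" -> sorted ids of documents containing any entity of that type
    private final Map<String, Set<String>> byType = new HashMap<>();

    /** Record that the document docId contains an entity with this type and name. */
    public void add(String docId, String type, String name) {
        byEntity.computeIfAbsent(type + "|" + name, k -> new TreeSet<>()).add(docId);
        byType.computeIfAbsent(type, k -> new TreeSet<>()).add(docId);
    }

    /** Documents containing an entity with exactly this type and name. */
    public Set<String> findByEntity(String type, String name) {
        return byEntity.getOrDefault(type + "|" + name, Collections.emptySet());
    }

    /** Documents containing at least one entity of this type. */
    public Set<String> findByType(String type) {
        return byType.getOrDefault(type, Collections.emptySet());
    }

    public static void main(String[] args) {
        EntityIndexSketch index = new EntityIndexSketch();
        index.add("doc1", "Company", "IBM");
        index.add("doc2", "Company", "IBM");
        index.add("doc2", "Person", "Watson");
        index.add("doc3", "Company", "Acme");

        System.out.println(index.findByEntity("Company", "IBM")); // [doc1, doc2]
        System.out.println(index.findByType("Company"));          // [doc1, doc2, doc3]
    }
}
```

The question for the Lucene indexer is whether the mapping (programmatic or configured) can carry both the type and the name of an annotation into the index so that both kinds of lookup are possible.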
