On 9/13/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
> You may wish to look at the semantic search
> package on the IBM alphaworks site.

I assume that's what Scott's using. The CAS Consumer requires a SourceDocumentInformation object as input, which contains the URL of the document. That's what gets added to the index so that when a search is done we can send back the URLs of the documents that match. Unfortunately this code was never open sourced and I myself don't have it.

I think your best bet is to create the SourceDocumentInformation object, and then if you want you can delete it afterwards. When integrating components from different sources it is common to have to put in a "type system mapper" (in this case a very simple one) that translates the FSs in the CAS to a format the downstream component can accept.

Ideally there would be a standard UIMA type system for some things, such as the location of the source document, but that's not the case at the moment.

-Adam
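
For anyone reading this in the archive, here is a rough sketch of what such a "type system mapper" could look like. It deliberately does not use the UIMA API: ToyFs and ToyCas are toy stand-ins for a feature structure and a CAS, the upstream type com.example.DocumentMeta and its "location" feature are hypothetical, and the feature name "uri" on SourceDocumentInformation is an assumption. In a real pipeline the same logic would live in an annotator that runs just before the CAS Consumer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a feature structure (FS): a type name plus feature values. Not the UIMA API. */
final class ToyFs {
    final String type;
    final Map<String, Object> features = new HashMap<>();

    ToyFs(String type) {
        this.type = type;
    }
}

/** Toy stand-in for a CAS: only a flat list of indexed FSs. Not the UIMA API. */
final class ToyCas {
    final List<ToyFs> index = new ArrayList<>();
}

final class SourceDocumentMapper {
    // Hypothetical type and feature produced by the upstream component.
    static final String UPSTREAM_TYPE = "com.example.DocumentMeta";
    static final String UPSTREAM_FEATURE = "location";

    // Type the CAS Consumer expects; the feature name "uri" is an assumption.
    static final String DOWNSTREAM_TYPE = "org.apache.uima.examples.SourceDocumentInformation";
    static final String DOWNSTREAM_FEATURE = "uri";

    /** Copies the document URL into a new downstream FS and indexes it. Returns null if no URL is found. */
    static ToyFs addSourceDocumentInformation(ToyCas cas) {
        Object url = null;
        for (ToyFs fs : cas.index) {
            if (UPSTREAM_TYPE.equals(fs.type) && fs.features.get(UPSTREAM_FEATURE) != null) {
                url = fs.features.get(UPSTREAM_FEATURE);
                break;
            }
        }
        if (url == null) {
            return null;
        }
        ToyFs sdi = new ToyFs(DOWNSTREAM_TYPE);
        sdi.features.put(DOWNSTREAM_FEATURE, url);
        cas.index.add(sdi);
        return sdi;
    }

    /** Removes the mapped FS again once the downstream component has run. */
    static boolean removeSourceDocumentInformation(ToyCas cas, ToyFs sdi) {
        return cas.index.remove(sdi);
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        ToyFs meta = new ToyFs(UPSTREAM_TYPE);
        meta.features.put(UPSTREAM_FEATURE, "http://example.org/doc1.txt");
        cas.index.add(meta);

        ToyFs sdi = addSourceDocumentInformation(cas);
        System.out.println(sdi.features.get(DOWNSTREAM_FEATURE));
        // ... the CAS Consumer would run here ...
        removeSourceDocumentInformation(cas, sdi);
    }
}
```

The add and remove steps are separate methods so that the delete stays optional: skip the second call if later components also need the URL.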
