I am using the semantic search package mentioned. I was considering writing a wrapper that extends SemanticSearchCasIndexer and adds the SourceDocumentInformation annotation before calling super.processCas, then removes it again afterwards, but for now I have written a simple "type system mapper". I might have a go at the wrapper later.
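For reference, the wrapper idea could look roughly like the sketch below. It is only an illustration of the add/delegate/remove pattern, not the real thing: SimpleCas, SourceDocInfo, CasProcessor and SourceInfoWrapper are hypothetical stand-ins so the example compiles without UIMA or the semantic search package, and it uses delegation where a real version would presumably extend SemanticSearchCasIndexer and call super.processCas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a CAS: just a list of feature structures.
class SimpleCas {
    final List<Object> featureStructures = new ArrayList<>();
}

// Hypothetical stand-in for SourceDocumentInformation (only the URI is modelled).
class SourceDocInfo {
    final String uri;

    SourceDocInfo(String uri) {
        this.uri = uri;
    }
}

// Hypothetical stand-in for the indexer's processing entry point.
interface CasProcessor {
    void processCas(SimpleCas cas);
}

// Adds the source document info, delegates to the wrapped processor,
// then removes the info again so the CAS is left as it was found.
class SourceInfoWrapper implements CasProcessor {
    private final CasProcessor delegate;
    private final Function<SimpleCas, String> uriLookup;

    SourceInfoWrapper(CasProcessor delegate, Function<SimpleCas, String> uriLookup) {
        this.delegate = delegate;
        this.uriLookup = uriLookup;
    }

    @Override
    public void processCas(SimpleCas cas) {
        SourceDocInfo info = new SourceDocInfo(uriLookup.apply(cas));
        cas.featureStructures.add(info);
        try {
            delegate.processCas(cas);
        } finally {
            // Runs even if the delegate throws, so the temporary entry never leaks.
            cas.featureStructures.remove(info);
        }
    }
}
```

The try/finally is the only design choice worth noting: it guarantees the temporary object is removed even when indexing fails, which keeps the CAS clean for any components that run afterwards.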
Cheers,
Scott.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Adam Lally
Sent: Thursday, 13 September 2007 10:52 PM
To: [email protected]
Subject: Re: SIAPI Indexing

On 9/13/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
> You may wish to look at the semantic search package on the IBM
> alphaworks site.

I assume that's what Scott's using. The CAS Consumer requires a SourceDocumentInformation object as input, which contains the URL of the document. That's what gets added to the index so that when a search is done we can send back the URLs of the documents that match.

Unfortunately this code was never open sourced and I myself don't have it. I think your best bet is to create the SourceDocumentInformation object, and then if you want you can delete it afterwards.

When integrating components from different sources it is common to have to put in a "type system mapper" (in this case a very simple one) that translates the FSs in the CAS to a format the downstream component can accept. Ideally there would be a standard UIMA type system for some things, such as the location of the source document, but that's not the case at the moment.

-Adam
