I am using the semantic search package mentioned below. I was considering
writing a wrapper that extends SemanticSearchCasIndexer, adds the
SourceDocumentInformation annotation before calling super.processCas and
removes it afterwards, but for now I have written a simple "type system
mapper". I might have a go at the wrapper later.

Cheers,
Scott.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Adam Lally
Sent: Thursday, 13 September 2007 10:52 PM
To: [EMAIL PROTECTED]
Subject: Re: SIAPI Indexing

On 9/13/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
> You may wish to look at the semantic search package on the IBM 
> alphaworks site.

I assume that's what Scott's using.  The CAS Consumer requires a
SourceDocumentInformation object as input, which contains the URL of the
document.  That's what gets added to the index so that when a search is
done we can send back the URLs of the documents that match.

Unfortunately this code was never open sourced and I myself don't have
it.  I think your best bet is to create the SourceDocumentInformation
object, and then if you want you can delete it afterwards.  When
integrating components from different sources it is common to have to
put in a "type system mapper" (in this case a very simple one) that
translates the FSs in the CAS to a format the downstream component can
accept.  Ideally there would be a standard UIMA type system for some
things, such as the location of the source document, but that's not the
case at the moment.
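
As a rough sketch of the add/delegate/remove pattern described above: the
code below is only an illustration and does not use the real UIMA or
semantic search classes. FakeCas, UpstreamDocMeta, FakeSourceDocInfo,
FakeIndexer and WrappingIndexer are all hypothetical stand-ins invented
for this example, and the real processCas signature may well differ.

```java
import java.util.ArrayList;
import java.util.List;

public class SourceDocInfoWrapperSketch {

    /** Stand-in for a CAS: a flat list of feature structures (hypothetical). */
    public static class FakeCas {
        public final List<Object> index = new ArrayList<>();
    }

    /** Stand-in for the upstream component's document metadata type (hypothetical). */
    public static class UpstreamDocMeta {
        public final String path;
        public UpstreamDocMeta(String path) { this.path = path; }
    }

    /** Stand-in for SourceDocumentInformation: carries only the document URI. */
    public static class FakeSourceDocInfo {
        public final String uri;
        public FakeSourceDocInfo(String uri) { this.uri = uri; }
    }

    /** Stand-in for the closed-source indexer: it insists on a FakeSourceDocInfo. */
    public static class FakeIndexer {
        public final List<String> indexedUris = new ArrayList<>();

        public void processCas(FakeCas cas) {
            for (Object fs : cas.index) {
                if (fs instanceof FakeSourceDocInfo) {
                    indexedUris.add(((FakeSourceDocInfo) fs).uri);
                    return;
                }
            }
            throw new IllegalStateException("no source document information in CAS");
        }
    }

    /** The wrapper: map the upstream type to the expected one, delegate, clean up. */
    public static class WrappingIndexer extends FakeIndexer {
        @Override
        public void processCas(FakeCas cas) {
            FakeSourceDocInfo info = null;
            for (Object fs : cas.index) {
                if (fs instanceof UpstreamDocMeta) {
                    // The "type system mapper" step: translate one type to the other.
                    info = new FakeSourceDocInfo("file:" + ((UpstreamDocMeta) fs).path);
                    break;
                }
            }
            if (info == null) {
                throw new IllegalStateException("no UpstreamDocMeta in CAS");
            }
            cas.index.add(info);            // add before calling the indexer
            try {
                super.processCas(cas);
            } finally {
                cas.index.remove(info);     // remove afterwards, even on failure
            }
        }
    }

    public static void main(String[] args) {
        FakeCas cas = new FakeCas();
        cas.index.add(new UpstreamDocMeta("/docs/report.txt"));

        WrappingIndexer indexer = new WrappingIndexer();
        indexer.processCas(cas);

        System.out.println("indexed: " + indexer.indexedUris);
        System.out.println("feature structures left in CAS: " + cas.index.size());
    }
}
```

In real UIMA code the add and remove steps would presumably be
addToIndexes() and removeFromIndexes() on the SourceDocumentInformation
instance, but I have not tried that against the semantic search indexer.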

-Adam
