Thank you, Thilo. I will investigate this idea.

Regards,
Rad

2009/6/25 Thilo Goetz <[email protected]>

> Radwen ANIBA wrote:
> > Hi everyone,
> >
> > Following some of the UIMA example applications lets us understand how
> > every component in the UIMA framework works. That's great. But one
> > question that a developer may ask is how to use the CAS to compare
> > analyzed documents.
> >
> > The CAS is common to every document, and when analyzing one of them we
> > have access to the CAS for writing or updating.
> > Let's imagine we have 3 documents to analyze. We write metadata about
> > each of them to the CAS, but to take the analysis further it could be
> > very interesting to compare these documents using the CAS, either all
> > together or pairwise.
> >
> > To illustrate what I'm saying, let's imagine we are looking for email
> > addresses inside three big documents using UIMA regexp capabilities.
> > A result may look like this:
> >
> > Document 1 : Number of unique emails 9 | Number of emails in common with
> > Document 2 : 10 | Number of emails in common with Document 3 : 6
> > Document 2 : Number of unique emails 5 | Number of emails in common with
> > Document 1 : 20 | Number of emails in common with Document 3 : 1
> > Document 3 : Number of unique emails 4 | Number of emails in common with
> > Document 1 : 15 | Number of emails in common with Document 2 : 3
> >
> > This is a simple pairwise cross-comparison of documents using the CAS.
> > My question is how to achieve that.
> > Do we need to create an additional type system for the common
> > information? Do we have to do it dynamically, on the fly?
> >
> > Thanks
> >
> > Rad
> >
>
> Hi Rad,
>
> using the CAS to do this will get expensive very quickly. You will
> not want to keep every document in its own CAS because of the memory
> overhead. I would probably write the information you're interested
> in to an external datastore (e.g., a DB such as Derby) and do the
> comparison there.
>
> --Thilo
>
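
For reference, here is a rough sketch of what Thilo's suggestion (collect the
results outside the CAS and compare them there) might look like. It is only an
illustration under a few assumptions: the class EmailOverlap and its methods
are hypothetical names, not part of UIMA, and an in-memory map stands in for
the external datastore (Derby or otherwise). In a real pipeline, record()
would presumably be called once per email annotation found in each document.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a UIMA class): keeps per-document results outside the CAS.
class EmailOverlap {
    // Stand-in for the external datastore: document id -> distinct addresses found in it.
    private final Map<String, Set<String>> emailsByDoc = new LinkedHashMap<>();

    // Assumed to be called once per email address found in a document.
    void record(String docId, String email) {
        emailsByDoc.computeIfAbsent(docId, k -> new HashSet<>()).add(email);
    }

    // Number of distinct addresses that occur in both documents.
    int commonCount(String docA, String docB) {
        Set<String> common = new HashSet<>(
                emailsByDoc.getOrDefault(docA, Collections.<String>emptySet()));
        common.retainAll(emailsByDoc.getOrDefault(docB, Collections.<String>emptySet()));
        return common.size();
    }

    // Number of distinct addresses that occur in docId and in no other recorded document.
    int uniqueCount(String docId) {
        Set<String> unique = new HashSet<>(
                emailsByDoc.getOrDefault(docId, Collections.<String>emptySet()));
        for (Map.Entry<String, Set<String>> entry : emailsByDoc.entrySet()) {
            if (!entry.getKey().equals(docId)) {
                unique.removeAll(entry.getValue());
            }
        }
        return unique.size();
    }

    public static void main(String[] args) {
        EmailOverlap store = new EmailOverlap();
        // Made-up sample data, for illustration only.
        store.record("doc1", "a@example.org");
        store.record("doc1", "b@example.org");
        store.record("doc2", "b@example.org");
        store.record("doc2", "c@example.org");
        System.out.println("doc1 unique: " + store.uniqueCount("doc1"));
        System.out.println("doc1/doc2 common: " + store.commonCount("doc1", "doc2"));
    }
}
```

With a real database the same two questions would become queries over a
(document, email) table, so nothing about the documents has to stay in memory
between CAS instances.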
