On 20.05.2013 23:01, GATE User wrote:
> Thanks Richard and Peter:
>
> What I want to be able to do is, when the xml is returned, a program should
> then be able to find the "corrected" message and use that for future
> operations. Will using views allow this? Is it simply easier to just make a
> new CAS? Thanks again.
>

Can you please provide more information? I assume that "xml" refers to the
document text you are processing. I think it will not make a big difference
whether you use an additional view or a completely new CAS if the processing
does not happen in one pipeline. You could add a new annotation that indicates
that the covered text has changed, in order to make the analysis engine
sensitive to modifications.

If I have understood you correctly, then the approach Richard described (or
the Ruta implementation) should solve your problem.

Best,

Peter

>
>
> ________________________________
> From: Richard Eckart de Castilho <[email protected]>
> To: [email protected]; GATE User <[email protected]>
> Sent: Monday, May 20, 2013 4:42 AM
> Subject: Re: Changing the original text based on annotations
>
>
> Hi,
>
> UIMA doesn't allow text to be changed, but you can create a new view with
> new text.
>
> When I needed that, I implemented a set of annotations to mark text as "to
> be inserted/deleted/changed", e.g. based on the results of a spell checker.
> Then I ran an annotator which interpreted all these annotations and created
> a new view with the updated text. Subsequent annotators would then work on
> the new view.
>
> What I have done on this in the past is published as
>
> Eckart De Castilho, Richard, and Iryna Gurevych.
> "DKPro-UGD: A Flexible Data-Cleansing Approach to Processing User-Generated
> Discourse." [1]
>
> The latest version of the components described there is available in DKPro
> Core [2].
>
> Cheers,
>
> -- Richard
>
> [1] http://www.ukp.tu-darmstadt.de/publications/details/?no_cache=1&tx_bibtex_pi1%5Bpub_id%5D=TUD-CS-2009-0078
> [2] http://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.castransformation-asl
>
> On 20.05.2013 at 04:21, GATE User <[email protected]> wrote:
>
>> 1) How do I change the original message based on annotations in UIMA? For
>> example, let's say I have the string:
>> 201301012345
>>
>> That contains both the date and time. I want to have an annotator that will
>> find such things in the text and add a space between them so it becomes:
>> 20130101 2345
>>
>> What's the easiest way to modify the text in this instance?
>>
>> Also, let's say I have the sentence:
>>
>> See Spot run far down Main Street.
>>
>> and I have an annotator that finds and labels Main Street as a street
>> name. Now I want to make an annotator that, if it finds a street name
>> annotation, changes that street name into something else, like River Blvd.
>> So the above sentence would be:
>>
>> See Spot run far down River Blvd.
>>
>> What's the easiest way to do this? Will I, afterwards, have to resend the
>> CAS through the pipeline again or is there an easy way to update all
>> annotations that would be affected by the change since River Blvd is shorter
>> than Main Street?
>>
>> Thanks in advance.
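
For illustration, here is a minimal sketch of the idea Richard describes in the thread: mark spans as "to be inserted/deleted/changed", then interpret those marks in one pass to build the text for a new view. It is plain Python with no UIMA dependency, and the names `Edit`, `apply_edits` and `map_offset` are hypothetical; they are not part of UIMA, Ruta or DKPro Core. In a real pipeline the resulting string would presumably be set as the document text of a newly created view, and the original view would stay untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical 'change' annotation: replace text[begin:end] with replacement.

    begin == end is a pure insertion; an empty replacement is a deletion.
    """
    begin: int
    end: int
    replacement: str


def apply_edits(text, edits):
    """Apply non-overlapping edits; return (new_text, shifts).

    shifts is a list of (old_end, cumulative_delta) pairs, one per edit,
    which map_offset() uses to translate old offsets into new offsets.
    """
    pieces = []
    shifts = []
    cursor = 0  # position in the old text copied so far
    delta = 0   # total length change accumulated so far
    for e in sorted(edits, key=lambda e: (e.begin, e.end)):
        if not (0 <= e.begin <= e.end <= len(text)):
            raise ValueError("edit outside of text: %r" % (e,))
        if e.begin < cursor:
            raise ValueError("overlapping edits are not supported")
        pieces.append(text[cursor:e.begin])  # unchanged text before the edit
        pieces.append(e.replacement)         # the new text for the span
        cursor = e.end
        delta += len(e.replacement) - (e.end - e.begin)
        shifts.append((e.end, delta))
    pieces.append(text[cursor:])             # unchanged tail
    return "".join(pieces), shifts


def map_offset(offset, shifts):
    """Translate an old-text offset that lies outside every edited span."""
    shift = 0
    for old_end, delta in shifts:
        if offset >= old_end:
            shift = delta
        else:
            break
    return offset + shift


# Date/time example from the original question: insert a space at offset 8.
stamp, _ = apply_edits("201301012345", [Edit(8, 8, " ")])
print(stamp)  # 20130101 2345

# Street name example: the replacement is one character shorter.
sentence = "See Spot run far down Main Street."
begin = sentence.index("Main Street")
fixed, shifts = apply_edits(
    sentence, [Edit(begin, begin + len("Main Street"), "River Blvd")]
)
print(fixed)  # See Spot run far down River Blvd.
print(map_offset(sentence.index("."), shifts))  # 32
```

The `shifts` list is one possible answer to the last question in the thread: annotations that lie entirely outside the edited spans can be carried over to the new text by shifting their begin and end offsets, while annotations that overlap an edited span have no well-defined new position and are probably best recomputed by running the relevant annotators on the new view.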
