Thanks for fixing it! :-) It was driving me crazy! :-)

-----Original Message-----
From: Peter Klügl [mailto:[email protected]]
Sent: December 17, 2015 13:45
To: [email protected]
Subject: Re: UIMA RUTA - Custom BLOCK extension

Yes, it was a bug. It's fixed now and here's the test:
https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/BlockTest.java

Thanks for pointing it out :-)

Best,

Peter

On 17.12.2015 at 21:47, Peter Klügl wrote:
> Hi,
>
> the first one looks like a bug. The block should behave as expected.
> I'll take a look at it.
>
> About the array-block-extension: Yes, send me the code.
>
> I am already working towards explicit references to annotations (right
> now you reference them with type expressions) and support of Arrays.
> Then, your use case will be directly supported without any extension, like
>
> EntityRole.featureName->{LOG(Document.ct);};
>
> or
>
> a:EntityRole.featureName{-> LOG(a.ct)};
>
> Best,
>
> Peter
>
> On 16.12.2015 at 02:02, Miguel Alvarez wrote:
>> Thanks again!
>>
>> I noticed some strange behaviour with the BLOCK statement while
>> creating this new extension, and I am not sure it is correct. I
>> thought these two statements should be equivalent:
>>
>> EntityRole{EntityRole.relationId == "5" ->
>>     LOG(EntityRole.relationId), LOG(EntityRole.ct)};
>>
>> BLOCK(MyTest) EntityRole{EntityRole.relationId == "5"} {
>>     LOG(EntityRole.relationId);
>>     LOG(EntityRole.ct);
>> }
>>
>> But it turns out that there are cases in which these two statements
>> are not equivalent, and the second one shows some strange behaviour.
>>
>> Suppose you have a string that has been annotated multiple times (let's
>> say five times) with the same annotation type (in this case EntityRole),
>> but for each of those annotations the feature relationId has a different
>> value (values from 1 to 5). In this case the first statement will
>> log two lines, as I was expecting: one with the relationId value of
>> "5" and the other with the covered text of the annotation.
>>
>> But in the second statement (the BLOCK) the script will end up
>> logging ten lines. The first five will contain the value of "5" for
>> the relationId, and the last five will contain the covered text for the
>> annotation.
>>
>> Had you come across this before? Why is it doing it five times even
>> though there is only one annotation where the relationId feature is
>> equal to 5? And the interesting part is the order in which it logs the
>> information: first it logs the relationId feature value five times, and
>> then it logs the covered text five times.
>>
>> Any ideas?
>>
>> Talking about developing new extensions: the other extension I tried
>> to develop, but got stuck on at some point, was another block that
>> would iterate over the annotations contained in a feature array. But
>> I am not sure how applicable this can be to the majority of users.
>> For instance, let's say we have this:
>>
>> ARRAYBLOCK(featureName) EntityRole {
>>     Document{->LOG(Document.ct)};
>> }
>>
>> In this case the annotation EntityRole will have a feature named
>> featureName that contains an array of annotations, and the block will
>> loop through those annotations in the array, also changing the scope
>> to those annotations. The only way I could find of specifying the
>> feature that contains the array was using the block id, but then I
>> keep on getting a warning saying that the type hasn't been defined.
>>
>> The problem I have is that it doesn't seem to be applying the
>> elements within the block. If you think this one can be interesting I
>> can send the code I have so far.
>>
>> Cheers,
>> Miguel
>>
>> -----Original Message-----
>> From: Peter Klügl [mailto:[email protected]]
>> Sent: December 14, 2015 9:51
>> To: [email protected]
>> Subject: Re: UIMA RUTA - Custom BLOCK extension
>>
>> Hi,
>>
>> On 14.12.2015 at 18:20, Miguel Alvarez wrote:
>>> Thanks Peter! That is exactly what I was looking for. I think my
>>> code wasn't working because of the way I was invoking the
>>> constructor, which I didn't include in my previous emails.
>>> I assume
>>> since you have included this in the code already, I don't need to do
>>> anything to contribute it, right?
>>
>> Yes... but you are welcome to come up with other great ideas ;-)
>>
>> I will take care of the documentation. I am also thinking about adding
>> the first (wrong) variant.
>>
>> Does anyone have ideas about the naming? If not, it will remain
>> DOCUMENTBLOCK or something similar.
>>
>> Best,
>>
>> Peter
>>
>>> I hope this new block is useful to many other people.
>>>
>>> Thanks,
>>> Miguel
>>>
>>> -----Original Message-----
>>> From: Peter Klügl [mailto:[email protected]]
>>> Sent: December 14, 2015 4:37
>>> To: [email protected]
>>> Subject: Re: UIMA RUTA - Custom BLOCK extension
>>>
>>> Hi,
>>>
>>> sorry, I misinterpreted your use case.
>>>
>>> Yes, you are completely right and your code looks correct.
>>>
>>> If getList() does not return the matches, then either the rule
>>> wasn't able to find any anchors at all to start matching, or the apply
>>> was called with false, meaning the matches are not stored for
>>> performance reasons. You should be able to just delegate to the
>>> RutaScriptBlock with a reset RutaStream:
>>>
>>> @Override
>>> public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>>>     CAS cas = stream.getCas();
>>>     AnnotationFS documentAnnotation = cas.getDocumentAnnotation();
>>>     RutaStream completeStream =
>>>         stream.getWindowStream(documentAnnotation,
>>>             documentAnnotation.getType());
>>>     ScriptApply result = super.apply(completeStream, crowd);
>>>     return result;
>>> }
>>>
>>> I added this to the current trunk:
>>> block impl:
>>> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core-ext/src/main/java/org/apache/uima/ruta/block/DocumentBlock.java
>>> unit test:
>>> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core-ext/src/test/java/org/apache/uima/ruta/block/DocumentBlockTest.java
>>>
>>> Does this work for you?
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 14.12.2015 at 04:14, Miguel Alvarez wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for your prompt reply.
>>>>
>>>> Let me know if I am wrong, but I don't think the code you sent would
>>>> work when the custom BLOCK extension is nested inside
>>>> another block. For instance, let's say we have these annotations in
>>>> some text:
>>>>
>>>> Annotation1 Annotation2 Annotation3 Annotation2 Annotation4
>>>> Annotation2
>>>>
>>>> BLOCK Annotation3{} {
>>>>     // Extract some information from Annotation3's features and
>>>>     // store them in variables
>>>>
>>>>     DOCUMENTBLOCK Annotation2{} {
>>>>         // Use the information extracted from Annotation3 to
>>>>         // determine if this particular Annotation2 is the one I want
>>>>     }
>>>> }
>>>>
>>>> And I actually want the custom BLOCK extension to have the right
>>>> context when within the BLOCK. So I want the DOCUMENTBLOCK extension
>>>> to look for Annotation2 in the whole document, but once you are inside
>>>> the DOCUMENTBLOCK, Annotation2 should be the new scope (as the current
>>>> BLOCK statement does right now).
>>>>
>>>> So initially this is the code I tried:
>>>>
>>>> public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>>>>     // Create a new stream of the whole document
>>>>     RutaStream docStream =
>>>>         stream.getWindowStream(stream.getCas().getDocumentAnnotation(),
>>>>             stream.getCas().getDocumentAnnotation().getType());
>>>>     BlockApply result = new BlockApply(this);
>>>>     crowd.beginVisit(this, result);
>>>>     RuleApply apply = rule.apply(docStream, crowd, true);
>>>>     for (AbstractRuleMatch<? extends AbstractRule> eachMatch :
>>>>             apply.getList()) {
>>>>         if (eachMatch.matched()) {
>>>>             List<AnnotationFS> matchedAnnotations = ((RuleMatch)
>>>>                 eachMatch).getMatchedAnnotations(null, null);
>>>>             if (matchedAnnotations == null ||
>>>>                     matchedAnnotations.isEmpty()) {
>>>>                 continue;
>>>>             }
>>>>             AnnotationFS each = matchedAnnotations.get(0);
>>>>             if (each == null) {
>>>>                 continue;
>>>>             }
>>>>             List<Type> types = ((RutaRuleElement)
>>>>                 rule.getRuleElements().get(0)).getMatcher().getTypes(
>>>>                     getParent() == null ? this : getParent(), docStream);
>>>>             for (Type eachType : types) {
>>>>                 RutaStream window = docStream.getWindowStream(each, eachType);
>>>>                 for (RutaStatement element : getElements()) {
>>>>                     if (element != null) {
>>>>                         element.apply(window, crowd);
>>>>                     }
>>>>                 }
>>>>             }
>>>>         }
>>>>     }
>>>>     crowd.endVisit(this, result);
>>>>     return result;
>>>> }
>>>>
>>>> I thought I would just get a new stream that covers the whole
>>>> document and apply the rules to that, but the call apply.getList()
>>>> would never return anything even though I don't have any conditions in
>>>> the RUTA script for the DOCUMENTBLOCK extension. And that is why I
>>>> ended up calling the method getAllOfType, because that one was working
>>>> fine, but of course, it doesn't apply the conditions.
>>>>
>>>> Any ideas why getList wouldn't return anything even though I am
>>>> passing a new stream that covers the whole document?
>>>>
>>>> If I get this to work, I have no problems contributing it to the UIMA
>>>> RUTA project.
>>>>
>>>> Cheers,
>>>>
>>>> Miguel
>>>>
>>>> From: Peter Klügl <peter.kluegl@...>
>>>> Subject: Re: UIMA RUTA - Custom BLOCK extension
>>>> Newsgroups: gmane.comp.apache.uima.devel
>>>> Date: 2015-12-13 11:51:42 GMT
>>>>
>>>> Hi,
>>>>
>>>> oh yes, this is a nice extension. I was also already planning to add
>>>> something like this, but in my use cases the explicit referencing of
>>>> each matched annotation in the global context was missing. Thus, I am
>>>> implementing the annotation issues first.
>>>>
>>>> It is possible to specify something like this right now in UIMA Ruta,
>>>> but I would not recommend it. You could either spam/remove annotations
>>>> on the complete document or you could use the recursion functionality
>>>> of BLOCKs.
>>>>
>>>> Now to the custom block:
>>>>
>>>> You need to apply the head rule of the block in order to evaluate the
>>>> conditions. The scope is changed by the usage of a new restricted
>>>> RutaStream (windowStream). In order to retain the scope, just use the
>>>> given RutaStream.
>>>>
>>>> Without having tested it, it could look something like:
>>>>
>>>> @Override
>>>> public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>>>>     BlockApply result = new BlockApply(this);
>>>>     crowd.beginVisit(this, result);
>>>>     RuleApply apply = rule.apply(stream, crowd, true);
>>>>     for (AbstractRuleMatch<? extends AbstractRule> eachMatch :
>>>>             apply.getList()) {
>>>>         if (eachMatch.matched()) {
>>>>             for (RutaStatement element : getElements()) {
>>>>                 if (element != null) {
>>>>                     element.apply(stream, crowd);
>>>>                 }
>>>>             }
>>>>         }
>>>>     }
>>>>     crowd.endVisit(this, result);
>>>>     return result;
>>>> }
>>>>
>>>> Let me know if this helps.
>>>>
>>>> Do you want to contribute the block extension?
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 12.12.2015 at 00:04, Miguel Alvarez wrote:
>>>>> Hi,
>>>>>
>>>>> I am in the process of developing a custom BLOCK extension that,
>>>>> instead of changing the scope of the block, uses the scope of the
>>>>> whole Document. With this type of BLOCK one could loop through a
>>>>> series of annotations, and for each of those annotations search in
>>>>> the whole document for something else. I guess my first question is:
>>>>> Is it even possible to do something like this without creating a
>>>>> custom BLOCK extension?
>>>>>
>>>>> I got something to work, but it doesn't seem to apply the conditions
>>>>> for the block. This is more or less the code I have so far:
>>>>>
>>>>> List<Type> types = ((RutaRuleElement)
>>>>>     rule.getRuleElements().get(0)).getMatcher().getTypes(
>>>>>         getParent() == null ? this : getParent(), stream);
>>>>> for (Type eachType : types) {
>>>>>     //System.out.println("each Type: " + eachType.getShortName());
>>>>>     for (AnnotationFS each : stream.getAllofType(eachType)) {
>>>>>         RutaStream window = stream.getWindowStream(each, eachType);
>>>>>         for (RutaStatement element : getElements()) {
>>>>>             if (element != null) {
>>>>>                 element.apply(window, crowd);
>>>>>             }
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>> I assume in order to apply the conditions I would need something like
>>>>> this:
>>>>>
>>>>> RuleApply apply = rule.apply(stream, crowd);
>>>>>
>>>>> But for some reason this doesn't work, because I guess the scope has
>>>>> already been changed and it is not able to find any of the annotations
>>>>> within the scope.
>>>>>
>>>>> Does this make any sense? Is there a better way to do this?
>>>>>
>>>>> Any help would be much appreciated.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Miguel
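
For readers following the thread, the scoping difference between a regular BLOCK and the DOCUMENTBLOCK discussed above can be illustrated with a small, self-contained Java sketch. This is only a toy model and not the Ruta implementation: ScopeSketch, Ann, Window, blockMatches and documentBlockMatches are hypothetical names made up for this illustration, and the character offsets are invented. It assumes that a window simply hides every annotation outside its offsets, which is a simplification of what a windowed RutaStream does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these classes are UIMA or Ruta types.
final class ScopeSketch {

    /** Hypothetical stand-in for an annotation: a type name plus offsets. */
    static final class Ann {
        final String type;
        final int begin;
        final int end;

        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a windowed stream over [begin, end). */
    static final class Window {
        final List<Ann> all;
        final int begin;
        final int end;

        Window(List<Ann> all, int begin, int end) {
            this.all = all;
            this.begin = begin;
            this.end = end;
        }

        /** Annotations of the given type that lie fully inside this window. */
        List<Ann> select(String type) {
            List<Ann> out = new ArrayList<>();
            for (Ann a : all) {
                if (a.type.equals(type) && a.begin >= begin && a.end <= end) {
                    out.add(a);
                }
            }
            return out;
        }

        /** New window restricted to one annotation (the per-match scope change). */
        Window narrowTo(Ann a) {
            return new Window(all, a.begin, a.end);
        }
    }

    /** BLOCK-like matching: only sees annotations inside the current window. */
    static List<Ann> blockMatches(Window current, String type) {
        return current.select(type);
    }

    /** DOCUMENTBLOCK-like matching: resets to the whole document before matching. */
    static List<Ann> documentBlockMatches(Window current, String type, int docLength) {
        return new Window(current.all, 0, docLength).select(type);
    }

    public static void main(String[] args) {
        // The annotation sequence from the thread, with invented offsets.
        List<Ann> doc = Arrays.asList(
            new Ann("Annotation1", 0, 10),
            new Ann("Annotation2", 11, 20),
            new Ann("Annotation3", 21, 30),
            new Ann("Annotation2", 31, 40),
            new Ann("Annotation4", 41, 50),
            new Ann("Annotation2", 51, 60));
        Window whole = new Window(doc, 0, 60);

        // Scope inside "BLOCK Annotation3{} { ... }".
        Window inner = whole.narrowTo(whole.select("Annotation3").get(0));

        System.out.println(blockMatches(inner, "Annotation2").size());             // 0
        System.out.println(documentBlockMatches(inner, "Annotation2", 60).size()); // 3
    }
}
```

In this model, each match returned by documentBlockMatches would then be passed to narrowTo before the inner statements run, which mirrors the requirement stated in the thread: find Annotation2 anywhere in the document, but use the matched Annotation2 as the new scope inside the block.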
