How would you write "buildwindows", given that its "action" method would be called once for each Sentence, in random order?
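To illustrate why the call order matters, here is a minimal plain-Java sketch (not the Drools accumulate API) of what a "buildwindows" function would be forced to do. Because "action" sees the Sentences one at a time and in no particular order, it can only collect them; consecutive runs can be formed only in the result step, after sorting. The names BuildWindowsSketch, action, result and the int start/end fields are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java imitation of an accumulate function; NOT the Drools interface.
final class BuildWindowsSketch {
    // Hypothetical stand-ins for the Sentence and SentenceWindow facts.
    static final class Sentence {
        final int start, end;
        Sentence(int start, int end) { this.start = start; this.end = end; }
    }

    static final class SentenceWindow {
        final int start, end;
        SentenceWindow(int start, int end) { this.start = start; this.end = end; }
    }

    private final int size;                              // sentences per window
    private final List<Sentence> seen = new ArrayList<>();

    BuildWindowsSketch(int size) { this.size = size; }

    // Called once per Sentence, in arbitrary order: all we can do is remember it.
    void action(Sentence s) { seen.add(s); }

    // Only here, with every Sentence known, can consecutive runs be formed.
    List<SentenceWindow> result() {
        List<Sentence> sorted = new ArrayList<>(seen);
        sorted.sort(Comparator.comparingInt((Sentence s) -> s.start));
        List<SentenceWindow> windows = new ArrayList<>();
        for (int i = 0; i + size <= sorted.size(); i++) {
            // Window spans from the first sentence's start to the last one's end.
            windows.add(new SentenceWindow(sorted.get(i).start,
                                           sorted.get(i + size - 1).end));
        }
        return windows;
    }
}
```

So the function ends up being a collect-then-sort step, and all the real work happens at the end, whenever any Sentence changes.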
It's very simple to write a small set of rules that construct all SentenceWindow facts of size 1 and then extend them to any desired size, depending on some parameter.

1. Given a Sentence and no Window beginning with it, create a Window of length 1.
2. Given a Window of size n < desiredSize and a Sentence immediately following it, extend the Window to one of size n+1.
3a. For any Window of desiredSize, inspect it for "closely situated ManualAnnotations".
3b. If ManualAnnotations have been associated with their containing Sentences up-front, you just need to find Windows with more than 1 ManualAnnotation, with the annotations added to the Window in the RHS of rule 2 above.

-W

2011/8/19 Bruno Freudensprung <[email protected]>

> Hi Wolfgang,
>
> Thanks for your answer.
> Sentences are not contiguous (there might be some space characters in
> between), but manual annotations cannot overlap sentences (interpret
> "overlap" in terms of Drools Fusion terminology).
> If I had an "inside" operator, do you think the following accumulate option
> could be better?
>
>     when
>         $result : ArrayList() from accumulate ( $s: Sentence(),
>                                                 buildwindows($s))
>         $w : SentenceWindows () from $result
>         a1 : ManualAnnotation (this inside $w)
>         a2 : ManualAnnotation (this != a1, this inside $w)
>     then
>         ... do something with a1 and a2 since they are "close" to each other
>     end
>
> Does anyone know something about accumulator parametrization (looking at
> the source code it does not seem to be possible, though)?
> Maybe a syntax inspired by operator parametrization could be nice:
>
>     $result : ArrayList() from accumulate ( $s: Sentence(),
>                                             buildwindows[3]($s))
>
> Best regards,
>
> Bruno.
>
> On 19/08/2011 13:55, Wolfgang Laun wrote:
>
> There are some details that one should consider before deciding on a
> particular implementation technique.
>
> - Are all Sentences contiguous, i.e., s1.end = pred( s2.start )
> - Can a ManualAnnotation start on one Sentence and end in the next or
>   any further successor?
>
> As in all problems where constraints depend on an order between facts,
> performance is going to be a problem with increasing numbers of Sentences
> and ManualAnnotations.
>
> Your accumulate plan could be a very inefficient approach. Creating O(N*N)
> pairs and then looking for an overlapping window is much worse than looking
> at each window, for instance. But it depends on the expected numbers for
> both.
>
> -W
>
> 2011/8/19 Bruno Freudensprung <[email protected]>
>
>> Hello,
>>
>> I am trying to implement rules handling "Sentence" and "ManualAnnotation"
>> objects (imagine someone highlighting words of the document). Basically,
>> "Sentence" objects have "start" and "end" positions (fields) in the text
>> of a document, and they are Comparable according to their location in the
>> document.
>>
>> I need to write rules using the notion "window of consecutive sentences".
>>
>> Basically I am not very interested in those "SentenceWindow" objects; I
>> just need them to define a kind of proximity between "ManualAnnotation"
>> objects.
>> What I eventually need in the "when" of my rule is something like:
>>
>>     when
>>         ... maybe something creating the windows
>>         a1 : ManualAnnotation ()
>>         a2 : ManualAnnotation (this != a1)
>>         SentenceWindow (this includes a1, this includes a2)
>>     then
>>         ... do something with a1 and a2 since they are "close" to each other
>>     end
>>
>> As I don't know the "internals" of Drools, I would like to have your
>> opinion about which is the best "idiom":
>>
>> - create all SentenceWindow objects and insert them in the working
>>   memory, then write rules against all the facts (SentenceWindow and
>>   ManualAnnotation)
>> - implement an accumulator that will create a list of SentenceWindow
>>   objects
>>
>> The first option could look like:
>>
>>     rule "Create sentence windows"
>>     when
>>         # find 3 consecutive sentences
>>         s1 : Sentence()
>>         s2 : Sentence(this > s1)
>>         s3 : Sentence(this > s2)
>>         not Sentence(this != s2 && > s1 && < s3)
>>     then
>>         SentenceWindow swindow = new SentenceWindow();
>>         swindow.setStart(s1.getStart());
>>         swindow.setTheend(s3.getEnd());
>>         insert(swindow);
>>     end
>>
>> ... Then use the first rule "as is".
>>
>> The accumulator option could look like (I am not really sure the syntax is
>> correct):
>>
>>     when
>>         $result : ArrayList() from accumulate ( $s: Sentence(),
>>                                                 buildwindows($s))
>>         a1 : ManualAnnotation ()
>>         a2 : ManualAnnotation (this != a1)
>>         SentenceWindows (this includes a1, this includes a2) from $result
>>     then
>>         ... do something with a1 and a2 since they are "close" to each other
>>     end
>>
>> Is it possible to decide if one way is better than the other?
>>
>> And one last question: is it possible to "parametrize" an accumulator (in
>> order to provide the number of sentences that should be put in the windows)?
>> I mean something like:
>>
>>     when
>>         $result : ArrayList() from accumulate ( $s: Sentence(),
>>                                                 buildwindows(3, $s))
>>
>> Thanks in advance for your insights,
>>
>> Best regards,
>>
>> Bruno.
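For illustration, rules 1, 2 and 3b at the top of this reply can be imitated in plain Java. This is not DRL and not how the engine evaluates rules; it is only a sketch of what the rules would compute. It assumes the Sentences are already in document order and that each Sentence carries the ManualAnnotations it contains, as in 3b. All names here (WindowRulesSketch, Window, build, closeAnnotations) are hypothetical, and annotations are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowRulesSketch {
    // Hypothetical fact: a sentence plus the annotations it contains (step 3b).
    static final class Sentence {
        final int start, end;
        final List<String> annotations;
        Sentence(int start, int end, List<String> annotations) {
            this.start = start;
            this.end = end;
            this.annotations = annotations;
        }
    }

    // Hypothetical window fact: index of its first sentence and current size.
    static final class Window {
        final int first;
        int size = 1;
        final List<String> annotations = new ArrayList<>();
        Window(int first) { this.first = first; }
    }

    // sentences must already be in document order (an assumption of this sketch).
    static List<Window> build(List<Sentence> sentences, int desiredSize) {
        List<Window> windows = new ArrayList<>();
        // Rule 1: every Sentence starts a Window of length 1.
        for (int i = 0; i < sentences.size(); i++) {
            Window w = new Window(i);
            w.annotations.addAll(sentences.get(i).annotations);
            windows.add(w);
        }
        // Rule 2: while a Window is too short and a Sentence immediately
        // follows it, extend the Window and take over that Sentence's annotations.
        for (Window w : windows) {
            while (w.size < desiredSize && w.first + w.size < sentences.size()) {
                w.annotations.addAll(sentences.get(w.first + w.size).annotations);
                w.size++;
            }
        }
        return windows;
    }

    // Rule 3b: full-size Windows holding more than one annotation.
    static List<Window> closeAnnotations(List<Window> windows, int desiredSize) {
        List<Window> hits = new ArrayList<>();
        for (Window w : windows) {
            if (w.size == desiredSize && w.annotations.size() > 1) {
                hits.add(w);
            }
        }
        return hits;
    }
}
```

Windows near the end of the document never reach desiredSize and are simply ignored by the last step, which matches rule 3a applying only to Windows of desiredSize.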
_______________________________________________
rules-users mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/rules-users
