[ 
https://issues.apache.org/jira/browse/UIMA-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Klügl resolved UIMA-2757.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.1TextMarker

Added the wildcard rule element, along with tests and documentation.
                
> TextMarker: Add wildcard rule element
> -------------------------------------
>
>                 Key: UIMA-2757
>                 URL: https://issues.apache.org/jira/browse/UIMA-2757
>             Project: UIMA
>          Issue Type: New Feature
>          Components: TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>             Fix For: 2.0.1TextMarker
>
>
> Right now, something like a wildcard or an I-don't-care rule element can be 
> implemented with ANY*?. However, those rule elements actually investigate 
> each token until the next rule element is successfully matched, which makes 
> them slow when many tokens lie in between.
> A real wildcard, which just skips everything, would be really useful (and 
> faster). This can be implemented by not iterating over the visible inference 
> annotations, but instead finding a matchable position in the index and then 
> checking whether it is visible. Since the next rule element can be quite 
> complex, it may be better to just match to the next annotation and, if that 
> one is invisible, return a failed match. This behavior needs careful testing 
> in different use cases.
> First suggestion for the syntax (** for the wildcard):
> CW **{-> MARK(Type)} PERIOD; 
> The "**" is maybe not the best solution since it looks quite like the 
> quantifier *?. Introducing an actual keyword can also be problematic since 
> there might be a type with the same name. Maybe something like
> CW #{-> MARK(Type)} PERIOD; 
> is better.
> This rule would create an annotation from the end of each capitalized word to 
> the beginning of the next period, including the whitespace. However, the 
> whitespace can be removed with the TRIM action.

