[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-25 Thread Michael Stenger (Jira)


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067076#comment-17067076
 ] 

Michael Stenger commented on UIMA-6194:
---

OK, understood. Thanks for explaining.

> Ruta: RutaLiteralMatcher throws exception for special choice of string
> --
>
> Key: UIMA-6194
> URL: https://issues.apache.org/jira/browse/UIMA-6194
> Project: UIMA
> Issue Type: Bug
> Components: Ruta
> Affects Versions: 2.8.0ruta
> Reporter: Michael Stenger
> Assignee: Peter Klügl
> Priority: Minor
> Fix For: 2.8.1ruta, 3.0.1ruta
>
>
> For certain combinations of document text and RuleElementLiteral in the 
> script, method getAnnotation of class RutaLiteralMatcher throws a 
> NullPointerException. This seems to happen whenever the string used is a 
> suffix or infix of a word in the document but does not occur on its own.
> h4. Example
> Script
>  
> {code:java}
> DECLARE testType;
> "est" {-> testType};
> "est te"{-> testType};
> {code}
> Document
>  
> {code:java}
> test test{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067047#comment-17067047
 ] 

Peter Klügl commented on UIMA-6194:
---

Yes and no. Not only complete tokens: the match may start and end only at the 
offsets of some annotation, which could be smaller than a token. It could even 
be a single character. I added some more tests to check that the literal 
string match is restricted to RutaBasic: 
{noformat}
LiteralStringMatchTest.testInRutaBasicMatch()
{noformat}
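
The restriction described above can be illustrated with a minimal, 
self-contained sketch. This is not the RutaLiteralMatcher code: the class 
BoundaryLiteralMatch, its find method and the offsets in main are hypothetical 
and only assume that a literal counts as a match when it begins at the begin 
of one basic annotation and ends at the end of one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not part of UIMA Ruta.
public class BoundaryLiteralMatch {

    /**
     * Returns {begin, end} offset pairs of every occurrence of the literal
     * that starts at a basic begin offset and stops at a basic end offset.
     */
    public static List<int[]> find(String text, int[][] basics, String literal) {
        Set<Integer> begins = new HashSet<>();
        Set<Integer> ends = new HashSet<>();
        for (int[] basic : basics) {
            begins.add(basic[0]);
            ends.add(basic[1]);
        }
        List<int[]> matches = new ArrayList<>();
        if (literal.isEmpty()) {
            return matches; // an empty literal never matches in this sketch
        }
        int from = text.indexOf(literal);
        while (from >= 0) {
            int to = from + literal.length();
            // Keep the occurrence only if both ends sit on basic boundaries.
            if (begins.contains(from) && ends.contains(to)) {
                matches.add(new int[] {from, to});
            }
            from = text.indexOf(literal, from + 1);
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "test test";
        // Assumed basics: "test", the space, "test".
        int[][] basics = { {0, 4}, {4, 5}, {5, 9} };
        for (String literal : new String[] {"est", "est te", "test", "test test"}) {
            System.out.println("\"" + literal + "\" -> "
                    + find(text, basics, literal).size() + " match(es)");
        }
    }
}
```

With these assumed offsets, "est" and "est te" occur in the text but are 
rejected because they do not start on a boundary, while "test" is accepted 
twice and "test test" once.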




[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-25 Thread Michael Stenger (Jira)


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066893#comment-17066893
 ] 

Michael Stenger commented on UIMA-6194:
---

I used the test for RutaLiteralMatcher in the repository to check how it 
responds to the examples I mentioned above. I therefore didn't use other 
analysis engines or modify the tokens in the test document. If you modify the 
test in the same way, you should see what I mean. What I take from the second 
comment is that RuleElementLiteral is supposed to work on complete tokens only, 
not on snippets like "est" instead of "test". Is that what you mean?



[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066542#comment-17066542
 ] 

Peter Klügl commented on UIMA-6194:
---

Ok, after getting some sleep and giving it some more thought: What I wrote is 
the intended behavior, but it is not the current implementation. The changes 
to the literal string matcher in the last release were implemented too quickly, 
in the belief that the test coverage is good enough. That's not true. I will 
add more tests and fix the behavior.



[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066167#comment-17066167
 ] 

Peter Klügl commented on UIMA-6194:
---

What exactly do you mean when you say that you tried them and that they pass 
the matcher? With a script or with a direct call to the matcher?
The basic annotations are RutaBasic annotations, which are automatically 
created and managed to represent a complete disjoint partitioning. So you can 
modify the matching behavior of the literal string matching, or of the 
dictionary lookup, by adding your own annotations, e.g., for decompounding.

If you have run other analysis engines beforehand, or if you used some simple 
regex rules, or if you modified offsets manually, there could be RutaBasics 
smaller than TokenSeeds.
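
As a rough sketch of that idea (not Ruta's actual seeding or RutaBasic 
management), the following assumes that the partitioning is obtained by 
cutting the document at every annotation begin and end offset. The class 
BasicPartition, its partition method and the offsets are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration, not part of UIMA Ruta.
public class BasicPartition {

    /**
     * Cuts the range [0, length) at every annotation begin and end offset
     * and returns the resulting disjoint {begin, end} segments in order.
     */
    public static List<int[]> partition(int length, int[][] annotations) {
        TreeSet<Integer> cuts = new TreeSet<>();
        cuts.add(0);
        cuts.add(length);
        for (int[] annotation : annotations) {
            cuts.add(annotation[0]);
            cuts.add(annotation[1]);
        }
        List<int[]> segments = new ArrayList<>();
        Integer previous = null;
        for (int cut : cuts) {
            if (previous != null) {
                segments.add(new int[] {previous, cut});
            }
            previous = cut;
        }
        return segments;
    }

    public static void main(String[] args) {
        // Document "test test": assumed token offsets plus one extra
        // annotation covering "est" in the first word.
        int[][] annotations = { {0, 4}, {4, 5}, {5, 9}, {1, 4} };
        for (int[] segment : partition(9, annotations)) {
            System.out.println(segment[0] + "-" + segment[1]);
        }
    }
}
```

In this sketch the extra annotation at offsets 1 to 4 splits the first word 
into two segments, so a literal such as "est" would then begin and end on 
segment boundaries.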

(This may sound strange but I think it's a cool feature)




[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-24 Thread Michael Stenger (Jira)


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065914#comment-17065914
 ] 

Michael Stenger commented on UIMA-6194:
---

I have another question on this subject: The matching behavior of 
RutaLiteralMatcher confuses me a bit. The comment in method getAnnotation of 
the class indicates that only strings ranging from the start of a basic 
annotation to the end of a basic annotation are considered for matching. In the 
respective test, class RutaLiteralMatcherTest, strings "test", "is a test", "." 
and so on should be matched, but not "est" or "s a tes". Still, if I try "est" 
or "Th", they do pass the matcher. Is that intended behavior? Thanks.



[jira] [Commented] (UIMA-6194) Ruta: RutaLiteralMatcher throws exception for special choice of string

2020-03-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/UIMA-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063367#comment-17063367
 ] 

Peter Klügl commented on UIMA-6194:
---

Hi, sorry for the delayed response. This should already have been fixed in the 
current trunk/snapshot. I will prepare a bugfix release asap.
