[ 
https://issues.apache.org/jira/browse/UIMA-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193211#comment-14193211
 ] 

Silvestre Losada commented on UIMA-4079:
----------------------------------------

Instead of changing the algorithm, it may be easier to change the data. Would
it be possible to normalize the wordlist entries when they are loaded into
memory, according to our needs, for example by removing special characters or
whitespace? If the wordlist entries are normalized this way, the match becomes
possible again.
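
A minimal sketch of the idea, not actual Ruta code: the class
WordlistNormalizer and its methods normalize and load are hypothetical names
chosen for illustration, and the rule "keep only letters and digits" is an
assumption about what "special characters" should mean. Each entry is indexed
by its normalized form at load time, so a covered text with the whitespace
already removed can still be looked up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Ruta API.
public class WordlistNormalizer {

    // Assumption: "normalizing" means dropping every character that is not
    // a Unicode letter or digit (whitespace, punctuation, and so on).
    public static String normalize(String text) {
        return text.replaceAll("[^\\p{L}\\p{N}]", "");
    }

    // Index the entries by their normalized form at load time.
    // The original entry is kept as the value so it can still be reported.
    public static Map<String, String> load(List<String> entries) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String entry : entries) {
            // Keep the first entry if two entries normalize to the same key.
            index.putIfAbsent(normalize(entry), entry);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> index = load(Arrays.asList("Bill Clinton", "New York"));
        // Covered text as described in this issue, with the whitespace removed.
        String coveredText = "BillClinton";
        System.out.println(index.get(normalize(coveredText)));
    }
}
```

The trade-off is that distinct entries can collapse to the same key (for
example "New York" and "Newyork"), so the normalization rule would have to be
chosen to fit the wordlists in use.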

> MarkTable action not able to recognize entities with two or more words
> ----------------------------------------------------------------------
>
>                 Key: UIMA-4079
>                 URL: https://issues.apache.org/jira/browse/UIMA-4079
>             Project: UIMA
>          Issue Type: Bug
>          Components: ruta
>    Affects Versions: 2.2.2ruta
>            Reporter: Silvestre Losada
>             Fix For: 2.2.2ruta
>
>
> I think this error was introduced while solving UIMA-4071. The problem is
> that the RutaStream.getVisibleCoveredText method removes whitespace from the
> covered text. For example, for "Bill Clinton" the covered text is returned as
> "BillClinton".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
