[ 
https://issues.apache.org/jira/browse/UIMA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184082#comment-13184082
 ] 

Peter Klügl commented on UIMA-2233:
-----------------------------------

Additional information about tags, in particular which (HTML) tags surround a 
token, will be removed as part of this issue, because that information is 
stored in the inference annotation but is set only by the seed lexer. I will 
create a new issue to improve HTML support again.
                
> Make the seeding configurable and independent of the rule inference
> -------------------------------------------------------------------
>
>                 Key: UIMA-2233
>                 URL: https://issues.apache.org/jira/browse/UIMA-2233
>             Project: UIMA
>          Issue Type: New Feature
>          Components: TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>
> The seeding needs to become more configurable: the user should be able to 
> choose the seeder or to select given annotation types for the initial 
> inference annotations (TextMarkerBasic). Both cases need to be configurable 
> in the analysis engine descriptor. One possible approach to a more 
> configurable seeding is to use the rule-based ICU tokenizer in place of the 
> JFlex lexer.


