Peter Klügl created UIMA-2508:
---------------------------------

             Summary: Improve lexer in default TextMarker seeding for HTML fragments
                 Key: UIMA-2508
                 URL: https://issues.apache.org/jira/browse/UIMA-2508
             Project: UIMA
          Issue Type: Improvement
          Components: TextMarker
            Reporter: Peter Klügl
            Assignee: Peter Klügl
            Priority: Minor


The default seeding creates erroneous markup annotations because the regular 
expression applied in the lexer is too simple. The pattern that identifies 
markup should be based on something like:

  \<\/?\w+(([ \t\f]+\w+([ \t\f]*=[ \t\f]*(\".*?\"|\'.*?\'|[^\'\"> \t\f]+))?)+[ \t\f]*|[ \t\f]*)\/?\>
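
The proposed expression can be tried out in isolation. The following is a minimal sketch in Python, not the TextMarker lexer itself. It assumes the expression is read as one line with a single space inside the last character class, and it drops the escapes that Python's re module does not need. The names TAG, NAIVE and find_tags are made up for this illustration, and NAIVE is only a guess at what a "too simple" rule might look like, since the current lexer rule is not quoted in this issue.

```python
import re

# Proposed tag pattern from the issue, transcribed for Python's re module.
# Assumption: the wrapped expression is a single line, and the last
# character class contains a literal space before \t\f.
TAG = re.compile(
    r"""</?\w+(([ \t\f]+\w+([ \t\f]*=[ \t\f]*(".*?"|'.*?'|[^'"> \t\f]+))?)+[ \t\f]*|[ \t\f]*)/?>"""
)

# Hypothetical over-simple rule, used only for comparison.
NAIVE = re.compile(r"<[^>]*>")


def find_tags(text, pattern=TAG):
    """Return every substring of text that the pattern accepts as markup."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = '<p class="x">1 < 2 and 3 > 2</p>'
    print(find_tags(sample))         # proposed pattern
    print(find_tags(sample, NAIVE))  # simple pattern also grabs "< 2 and 3 >"
```

On this sample the proposed pattern accepts only the opening and closing p tags, while the simple comparison rule also treats the inequality in the text as markup. The proposed pattern still accepts anything that is syntactically shaped like a tag, so text such as "a <b and c> d" would be annotated as markup as well.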

