Peter Klügl created UIMA-2508:
---------------------------------
Summary: Improve lexer in default TextMarker seeding for html fragments
Key: UIMA-2508
URL: https://issues.apache.org/jira/browse/UIMA-2508
Project: UIMA
Issue Type: Improvement
Components: TextMarker
Reporter: Peter Klügl
Assignee: Peter Klügl
Priority: Minor
The default seeding erroneously creates markup annotations because the regexp
applied in the lexer is too simple. The markup pattern should be based on
something like:

\<\/?\w+(([ \t\f]+\w+([ \t\f]*=[ \t\f]*(\".*?\"|\'.*?\'|[^\'\"> \t\f]+))?)+[ \t\f]*|[ \t\f]*)\/?\>
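To illustrate what the proposed pattern accepts and rejects, here is a minimal sketch in Python's `re` module rather than in the lexer itself. It assumes the character class in the expression reads `[^\'\"> \t\f]` with a literal space where the line wraps. The names `TAG_PATTERN` and `find_tags` are hypothetical and not part of TextMarker, and the lexer's regexp dialect may differ from Python's in details.

```python
import re

# The pattern from the issue description, written on one line.
# Assumption: the wrapped character class is [^\'\"> \t\f] (space included).
TAG_PATTERN = re.compile(
    r"""\<\/?\w+(([ \t\f]+\w+([ \t\f]*=[ \t\f]*(\".*?\"|\'.*?\'|[^\'\"> \t\f]+))?)+[ \t\f]*|[ \t\f]*)\/?\>"""
)


def find_tags(text):
    """Return every substring of text that the proposed pattern treats as markup."""
    return [m.group(0) for m in TAG_PATTERN.finditer(text)]


if __name__ == "__main__":
    # Opening tag with a quoted attribute, and a closing tag.
    print(find_tags('<a href="x.html">link</a>'))  # ['<a href="x.html">', '</a>']
    # Self-closing tag.
    print(find_tags('<br/>'))                      # ['<br/>']
    # Comparison operators in plain text are not treated as markup.
    print(find_tags('a < b and c > d'))            # []
```

The pattern requires a tag name directly after `<` (or `</`), followed only by whitespace-separated attributes, so stray `<` and `>` characters in running text no longer produce markup matches.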