Mario Juric created UIMA-6128:
---------------------------------
Summary: Allow XMI to be optionally serialized with XML 1.1 instead of only 1.0
Key: UIMA-6128
URL: https://issues.apache.org/jira/browse/UIMA-6128
Project: UIMA
Issue Type: New Feature
Components: UIMA
Reporter: Mario Juric
Some Unicode characters cannot be represented in XML 1.0, so serializing a CAS
to XMI can require normalization or cleanup first. However, requirements may
not allow all such characters to be removed from the CAS. Normalization or
cleanup can also be impossible without a full reprocess when data already
stored as compressed binaries is converted to XMI. Being able to optionally
select XML 1.1 instead of the default XML 1.0 would give some users an easier
way to bypass many of these Unicode issues.
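To make the character problem concrete, here is a minimal sketch that
classifies code points using the Char productions of the XML 1.0 and XML 1.1
specifications. The class and method names are made up for illustration and
are not part of UIMA. Note that XML 1.1 only admits most control characters
when they are written as character references (e.g. &#xC;), and U+0000 stays
illegal in both versions.

```java
public final class XmlCharCheck {

    /** True if the code point matches the Char production of XML 1.0. */
    public static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if the code point matches the Char production of XML 1.1. */
    public static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Returns the UTF-16 index of the first code point that XML 1.0
     * cannot represent, or -1 if the whole text is safe for XML 1.0.
     */
    public static int firstXml10Violation(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml10Char(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // A form feed (U+000C), as often found in text extracted from PDFs.
        String documentText = "page one\fpage two";
        int index = firstXml10Violation(documentText);
        System.out.println("first XML 1.0 violation at index " + index);
        System.out.println("allowed in XML 1.1: "
            + isXml11Char(documentText.codePointAt(index)));
    }
}
```

A check like this could be run over the document text before choosing a
serialization format, so that XML 1.1 is only used when XML 1.0 would fail.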
See also discussion on the UIMA mailing list:
https://lists.apache.org/thread.html/7f8124b7be9ea20ab21dc616243e5661a0b7668a856532031fda71e3@%3Cuser.uima.apache.org%3E
This feature request suggests introducing an additional SerialFormat, e.g.
XMI_1_1, which can be passed as the format parameter to the CasIOUtils.save
methods.
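For consumers of such files, the practical difference is the version in the
XML declaration. The sketch below does not use UIMA at all. It feeds the same
small document to the JAXP parser bundled with the JDK, once declared as 1.0
and once as 1.1, assuming that parser supports XML 1.1. The class name and the
sample element are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public final class XmlVersionParseDemo {

    /** Builds a tiny document whose text holds U+0001 as a character reference. */
    public static String sample(String version) {
        return "<?xml version=\"" + version + "\"?><text>a&#x1;b</text>";
    }

    /** Returns true if the default JAXP parser accepts the document. */
    public static boolean parses(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("declared 1.0 parses: " + parses(sample("1.0")));
        System.out.println("declared 1.1 parses: " + parses(sample("1.1")));
    }
}
```

Any tool that reads XMI written with the proposed format would need a parser
that accepts XML 1.1 in the same way.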