[
https://issues.apache.org/jira/browse/UIMA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865449#comment-16865449
]
Timo Boehme commented on UIMA-6064:
-----------------------------------
I think the problem starts with the fact that the features can be categorized
along at least 3 orthogonal dimensions:
# defining API: SAX, Java (JAXP), Apache
# target: DTD, Schema, Stylesheet
# value type: boolean vs. allowed schemes (http, file, ...)
(PS: why was {{javax.xml.XMLConstants.ACCESS_EXTERNAL_SCHEMA}} not specified?
See [https://docs.oracle.com/javase/tutorial/jaxp/properties/properties.html])
The current Apache Xerces does not support the new JAXP features (only the
parser bundled with Java does), so one gets a warning each time.
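To illustrate the point, here is a minimal sketch of how one could probe whether the parser on the classpath accepts a given property before setting it and logging a warning. It assumes that the "new JAXP features" are the JAXP 1.5 {{ACCESS_EXTERNAL_*}} properties; the class and method names are hypothetical and not part of UIMA.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical helper, not part of UIMA.
public class JaxpPropertyProbe {

    /** Returns true if the default SAX parser accepts the given property. */
    static boolean supportsProperty(String name, Object value) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.setProperty(name, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // The parser implementation does not know or allow this property.
            return false;
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    public static void main(String[] args) {
        // An empty value means "no external DTD access" for this property.
        boolean ok = supportsProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        System.out.println("ACCESS_EXTERNAL_DTD supported: " + ok);
    }
}
```

The result depends on which parser implementation {{SAXParserFactory.newInstance()}} picks up, so the probe would have to run once per factory rather than being assumed.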
Thus it might be hard to find good abstract settings to capture the different
uses - e.g. if JAXP is already set up correctly outside, there would be no need
to do it here again. Maybe this could be a possibility:
* have one switch for restrict all/restrict none
* have an environment variable for XML features to set (comma separated), e.g.
feature1:true,feature2:false,...
* have an environment variable for XML properties to set (comma separated),
e.g. property1:all,property2:file,...
For all features/properties defined via a variable, use that value instead of
the hard-coded one.
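A rough sketch of how the feature variable could be handled, only to make the proposal concrete. The class name, method names and the variable name {{UIMA_XML_FEATURES}} are made up for illustration. Since feature names are URIs that contain ':' themselves, the sketch splits each entry at the last colon.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of UIMA.
public class XmlSettingsSketch {

    /** Parses "name1:value1,name2:value2" into an ordered map. */
    static Map<String, String> parseSettings(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        if (spec == null) {
            return result;
        }
        for (String entry : spec.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            // Names are URIs with their own ':', so the value starts after the last one.
            int idx = trimmed.lastIndexOf(':');
            if (idx <= 0 || idx == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected name:value but got '" + trimmed + "'");
            }
            result.put(trimmed.substring(0, idx).trim(), trimmed.substring(idx + 1).trim());
        }
        return result;
    }

    /** Values from the variable replace the hard-coded ones for the same name. */
    static Map<String, String> merge(Map<String, String> hardCoded, Map<String, String> fromVariable) {
        Map<String, String> effective = new LinkedHashMap<>(hardCoded);
        effective.putAll(fromVariable);
        return effective;
    }

    /** Sets each feature on the factory; unsupported ones are reported and skipped. */
    static void applyFeatures(SAXParserFactory factory, Map<String, String> features) {
        for (Map.Entry<String, String> e : features.entrySet()) {
            try {
                factory.setFeature(e.getKey(), Boolean.parseBoolean(e.getValue()));
            } catch (ParserConfigurationException | SAXException ex) {
                System.err.println("Ignoring unsupported XML feature " + e.getKey());
            }
        }
    }
}
```

Usage would be something like {{applyFeatures(factory, merge(defaults, parseSettings(System.getenv("UIMA_XML_FEATURES"))))}}. Properties could be handled the same way, but a property value that is itself a comma-separated list of schemes or that contains a colon would need a different separator.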
> External DTD usage in XML descriptors disabled during build revision upgrade
> ----------------------------------------------------------------------------
>
> Key: UIMA-6064
> URL: https://issues.apache.org/jira/browse/UIMA-6064
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.10.2SDK
> Reporter: Timo Boehme
> Priority: Major
>
> Between version 2.10.1 and 2.10.2 the XMLParser configuration was changed
> (fixed, without the possibility to adjust it) to not allow for DTD and its
> loading from external file.
> This is done in XMLUtils.createSAXParserFactory() which sets the
> DISALLOW_DOCTYPE_DECL and LOAD_EXTERNAL_DTD features. Previously the
> SAXParserFactory was created without adjusting these features.
> While I understand that this was done to prevent malicious XML from doing
> nasty things, the way it was done is problematic:
> * the change happened in a revision build, no major or minor number change
> * it was not documented
> * one cannot simply change it back, e.g. via an environment variable,
> method call etc. - the only workaround is to do a problematic sub-classing of
> XMLParser_impl with additional configuration etc.
> We use the DTDs for CPE descriptors quite a lot to have the descriptor in
> modular chunks using entities etc. Thus it is important (for the time being)
> to use DTD there - and we know that the XML is not problematic.
> Because this feature (DTD) is crucial I have marked this as a BUG since such
> changes should not occur in a build upgrade or it should at least be possible
> to get the old behavior easily back.
>