Bjørn Mølgård Vester created XMLSCHEMA-65:
---------------------------------------------

             Summary: Make XmlSchema return schemas in a collection in a predictable order
                 Key: XMLSCHEMA-65
                 URL: https://issues.apache.org/jira/browse/XMLSCHEMA-65
             Project: XmlSchema
          Issue Type: Improvement
    Affects Versions: 2.3.0
            Reporter: Bjørn Mølgård Vester


When XmlSchema returns schemas in a collection, they are in an unpredictable 
order that depends on the hashcode for, among other things, the absolute path 
to them on the file system. This is a problem for reproducible builds, as the 
result will differ depending on where you store the schemas you work on.

For example, I use CXF's "wsimport" tool to generate Java classes from my 
schemas. The schemas are kept in source control and are checked out by a CI 
server into different folders depending on, among other things, the branch name.

CXF keeps schemas in an XmlSchemaCollection instance and asks for all schemas 
during type generation. It then uses these to generate an "ObjectFactory" class 
that contains elements from these schemas. As the schemas are returned in a 
different order depending on the checkout folder path, the generated source 
code will contain the elements in a different order from build to build.

This in turn defeats our build cache, which fingerprints the schema files as 
inputs and the generated code as outputs. Because the generated code changes, 
cache entries further down the build pipeline are invalidated as well, which 
leads to long build times.

While this could be fixed in CXF by sorting the schemas first, I think it would 
be beneficial to make XmlSchemaCollection itself return schemas in a predictable 
order for all clients, not just CXF. The order could be the order in which the 
schemas were added (which requires the client to add them in a predictable 
order), or the schemas could be sorted by namespace or similar. I personally 
prefer the former, as it is a bit more flexible.

Technically, the XmlSchemaCollection class keeps schemas in a HashMap where the 
key is a SchemaKey, containing a "systemId" field that includes the full file 
path.
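The effect can be sketched outside of XmlSchema with plain JDK maps. The Key 
class below is a hypothetical stand-in for SchemaKey (namespace plus systemId), 
and the checkout roots are made up. With a HashMap the iteration order follows 
the hash buckets and so may change with the path; with a LinkedHashMap it is 
always the insertion order. Whether swapping the map type is enough inside 
XmlSchemaCollection is an assumption I have not verified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class SchemaOrderSketch {

    // Hypothetical stand-in for SchemaKey; NOT the real XmlSchema class.
    static final class Key {
        final String namespace;
        final String systemId; // includes the full path, as described above

        Key(String namespace, String systemId) {
            this.namespace = namespace;
            this.systemId = systemId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(namespace, other.namespace)
                    && Objects.equals(systemId, other.systemId);
        }

        @Override
        public int hashCode() {
            // The path is part of the hash, so the bucket layout depends on it.
            return Objects.hash(namespace, systemId);
        }
    }

    // Adds the same four schemas under the given checkout root and returns
    // the order in which the map hands them back.
    static List<String> iterationOrder(boolean insertionOrdered, String checkoutRoot) {
        Map<Key, String> schemas = insertionOrdered
                ? new LinkedHashMap<Key, String>()
                : new HashMap<Key, String>();
        String[] names = {"orders", "customers", "invoices", "common"};
        for (String name : names) {
            schemas.put(new Key("urn:example:" + name, checkoutRoot + "/" + name + ".xsd"), name);
        }
        return new ArrayList<>(schemas.values());
    }

    public static void main(String[] args) {
        String rootA = "file:/ci/workspace/main";
        String rootB = "file:/ci/workspace/feature-123";
        // HashMap: the two lines may or may not match, depending on the hashes.
        System.out.println(iterationOrder(false, rootA));
        System.out.println(iterationOrder(false, rootB));
        // LinkedHashMap: both lines are in insertion order.
        System.out.println(iterationOrder(true, rootA));
        System.out.println(iterationOrder(true, rootB));
    }
}
```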



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

