https://bz.apache.org/bugzilla/show_bug.cgi?id=64931

            Bug ID: 64931
           Summary: Implement validation of changelog.xml file at build
                    time
           Product: Tomcat 10
           Version: 10.0.0-M10
          Hardware: PC
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Documentation
          Assignee: dev@tomcat.apache.org
          Reporter: knst.koli...@gmail.com
  Target Milestone: ------

I have a fix for this that I will commit shortly. I am filing an issue to
better document the problem and design decisions.


The file "webapps/docs/changelog.xml" sometimes contains structural errors that
are hard to spot by eye. It would be better to have an automated check that
catches and reports them at build time.

For example, in Apache Tomcat 10.0.0-M10 the file has two such errors: at lines
182 and 1550.

https://github.com/apache/tomcat/blob/10.0.0-M10/webapps/docs/changelog.xml#L181


There are several ways to implement the check:

(1) With XSLT, in the tomcat-docs.xsl stylesheet.

It is possible, but it would be an odd choice.

- Reporting an error can be done in XSLT 1.0 with

  <xsl:message terminate = "yes">...</xsl:message>

More recent versions of the XSLT specification support validation against an
XML Schema.

- Custom behaviour could be triggered by file name. The tomcat-docs.xsl
stylesheet declares a `<xsl:param name="filename"` parameter.
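
A minimal sketch of how such a check might look in XSLT 1.0 is shown below. It
is not taken from tomcat-docs.xsl: the rule that a <section> may contain only
<subsection> elements, the default value of the parameter, and the comparison
against the literal name 'changelog.xml' are all assumptions made for
illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Parameter name as mentioned above; the empty default is an assumption. -->
  <xsl:param name="filename" select="''"/>

  <xsl:template match="/">
    <!-- Run the check only when the changelog is being processed. -->
    <xsl:if test="$filename = 'changelog.xml'">
      <!-- Hypothetical rule: a <section> may contain only <subsection>. -->
      <xsl:for-each select="//section/*[not(self::subsection)]">
        <!-- terminate="yes" aborts the transformation at the first violation. -->
        <xsl:message terminate="yes">
          <xsl:text>Unexpected element &lt;</xsl:text>
          <xsl:value-of select="name()"/>
          <xsl:text>&gt; in section "</xsl:text>
          <xsl:value-of select="../@name"/>
          <xsl:text>"</xsl:text>
        </xsl:message>
      </xsl:for-each>
    </xsl:if>
    <!-- Normal processing of the document would continue here. -->
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Every structural rule would need its own hand-written XPath test like this one,
which is part of why this option feels like an odd fit.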


(2) With an XML Schema.

I tried this approach, but it failed.

- Validation against an XML Schema can be run with the Apache Ant task
schemavalidate.

- Running a check against the changelog file with a simple schema fails almost
immediately with an error:

  Element type "document" must be declared.

- My investigation (running with `ant -verbose` and searching through source
code) found that this message is generated when performing a validation against
a DTD.

(MSG_ELEMENT_NOT_DECLARED, org.apache.xerces.impl.dtd.XMLDTDValidator, in
Apache Xerces 2.12.0)

- I tried running with `<schemavalidate disableDTD="true"`, but that does not
help: validation then fails at the `<!DOCTYPE document` declaration at the top
of the changelog.xml file.

- I did not find any other setting or parser feature that could selectively
turn off validation against a DTD.
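
For reference, the attempt described above can be reproduced with a build file
roughly like the following. This is only a sketch: the project and target names
and the schema file name changelog.xsd are placeholders, not names from the
Tomcat build. As far as I understand, disableDTD makes the parser reject any
DOCTYPE declaration instead of ignoring it, which would explain the failure at
the `<!DOCTYPE document` line.

```xml
<project name="changelog-schema-check" default="validate-changelog-schema">

  <target name="validate-changelog-schema">
    <!-- changelog.xsd is a hypothetical schema without a target namespace. -->
    <schemavalidate file="webapps/docs/changelog.xml"
                    noNamespaceFile="changelog.xsd"
                    disableDTD="true"/>
  </target>

</project>
```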


(3) With a DTD.

I went with this approach, and it worked.

Validation against a DTD can be performed with the Apache Ant task xmlvalidate.
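
A minimal sketch of such a target is shown below. The project and target names
are placeholders and this is not necessarily what the actual commit contains.
The attributes are spelled out for clarity; to my knowledge failonerror="true"
and lenient="false" are also the defaults of the task.

```xml
<project name="changelog-dtd-check" default="validate-changelog">

  <target name="validate-changelog">
    <!-- lenient="false" requests full DTD validation, not just a
         well-formedness check; failonerror="true" breaks the build. -->
    <xmlvalidate file="webapps/docs/changelog.xml"
                 failonerror="true"
                 lenient="false"
                 warn="true"/>
  </target>

</project>
```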

- Notes:

1. I defined the DTD inline in the changelog.xml file itself.

It could be moved to an external file, but there is no actual need as I am not
going to validate other files.

2. Every XML element used in the changelog.xml and project.xml files must be
declared in the DTD. All of its attributes must be declared as well.

(The project.xml file is included into the changelog as an external entity.)

Thus far the only HTML markup elements that are actually used in the Tomcat 10
changelog are <code> and <a>, but we may want to add others in the future.

A useful tutorial on DTDs:
https://www.w3schools.com/xml/xml_dtd_intro.asp

A simple generic way to declare an element is

 <!ELEMENT elementname ANY>

A simple generic way to declare an attribute of an element is

 <!ATTLIST elementname attributename CDATA #IMPLIED>
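
To illustrate how an inline DTD fits together, here is a small self-contained
document with an internal DTD subset. It is a sketch only: the element names
(section, subsection, changelog, add, fix, update) and the content models are
assumptions for illustration and are not copied from the real changelog.xml. It
also omits the external entity for project.xml. Note that ANY accepts any
declared element in any order, so it is the stricter content models, as below,
that actually catch misplaced elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
  <!-- Strict content models: a misplaced element is a validation error. -->
  <!ELEMENT document (body)>
  <!ELEMENT body (section*)>
  <!ELEMENT section (subsection*)>
  <!ATTLIST section name CDATA #REQUIRED>
  <!ELEMENT subsection (changelog)>
  <!ATTLIST subsection name CDATA #REQUIRED>
  <!ELEMENT changelog (add | fix | update)*>

  <!-- Entries may mix text with the two HTML elements currently in use. -->
  <!ELEMENT add (#PCDATA | code | a)*>
  <!ELEMENT fix (#PCDATA | code | a)*>
  <!ELEMENT update (#PCDATA | code | a)*>
  <!ELEMENT code (#PCDATA)>
  <!ELEMENT a (#PCDATA)>
  <!ATTLIST a href CDATA #REQUIRED>
]>
<document>
  <body>
    <section name="Example 1.0">
      <subsection name="Core">
        <changelog>
          <fix>Corrected the handling of <code>Foo</code>.</fix>
        </changelog>
      </subsection>
    </section>
  </body>
</document>
```

With this DTD, placing a <fix> directly inside <subsection>, or using an
undeclared element such as <b>, is reported by a validating parser.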

