Hi,
I've tried to run a validation check on my Windows system without cygwin, jade and nsgmls, and found some strange things about the PHPDOC build system. I'll be glad if somebody could read this and answer my questions. =)
Well, I am trying to decipher what you try to tell us, but sometimes it is quite hard...
xmllint needs to be run several times.
(A) The first run should be used to detect wrong entities, i.e. those without a trailing ';'. These errors should be fixed on the fly with a PHP script.
(B) The second run should be used to detect valid missing entities.
(C) The third run should be used to detect valid missing ids.
Multiple runs of xmllint seem to be necessary. As long as wrong entities (without a trailing ;) exist, the error output cannot be used to create the valid missing entities. Also, it seems there is no way to convince xmllint to report errors for missing entities and missing IDREFs in the same run.
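A rough sketch of what the fix-up in step (A) could look like. It is written in Python purely for illustration (the script discussed here is PHP), and the function name and the regular expression are my own assumptions, not the actual build code. It appends the missing ';' to references like &url.google and leaves correctly terminated references and numeric character references alone.

```python
import re

# An entity reference that is not closed by ';'. The name must end in a word
# character, so a sentence-ending dot is never swallowed into the name.
# The lookahead rejects any position where the name could still continue
# or where the ';' is already present.
UNTERMINATED_ENTITY = re.compile(r"&([A-Za-z_](?:[\w.\-]*\w)?)(?![\w.\-;])")


def fix_unterminated_entities(text):
    """Append the missing ';' to entity references such as '&url.google'."""
    return UNTERMINATED_ENTITY.sub(r"&\1;", text)


if __name__ == "__main__":
    print(fix_unterminated_entities("See &url.google for details, &amp; more."))
```

Note that this is deliberately naive: a bare '&' followed by a letter in running text (for example "AT&T") would also be turned into an entity reference, so a real script would need to be more careful.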
1. Is the problem with the missing ; still in place? Xmllint does the job pretty well, but is it really necessary to include an additional PHP layer for processing xmllint messages? Every editor's highlighting engine can point out a missing ; in an XML entity, so I doubt there are many such errors.
Not everybody uses a highlighting editor. Also, assuming that everyone will do just fine is not a good argument. We have seen recently that some very experienced doc people committed code which made 'make test' shout. That something is possible does not mean it will work out. The missing entity and id handling is in place to let the manual build / display with livedocs even if there are some small errors. This is so that the automatic manual generation is not stopped from updating, especially in these times when manual updates are only done monthly or so...
Even if we switch to using "make test_xml" in place of "make test", there is a chance that people will not run it. So regardless of the fact that it would find the ;-less entities perfectly, it does not help by itself.
(B) The second run should be used to detect valid missing entities. (C) The third run should be used to detect valid missing ids.
2. What is "valid missing entity"? 3. What is "valid missing id"?
Point (A) was about finding entities without the ';' ending. These entities invalidate the XML document. After the XML document is fixed, only valid stuff remains, hence the word "valid" in these names. They are a bit misleading, sure :)
As I said before, I don't have nsgmls, so my missing-entities.ent is empty. But validation process goes just fine with cmd> xmllint.exe --noent --noout --valid manual.xml
It might be close to perfect in the English tree, but remember we have a lot of translations, some of them having quite inexperienced team members.
except for one thing with Zend API:
element link: validity error : IDREF attribute linkend references an unknown ID "zend"
But it can be fixed by replacing <link linkend="zend"> with a &zend; entity, whose content will depend on whether ZENDAPI is available, i.e.
  if ZENDAPI
    <!ENTITY zend '<link linkend="zend">ZendAPI</link>'>
  else
    <!ENTITY zend '<ulink url="http://www.zend.com/zend/api.php">ZendAPI</ulink>'>
  endif
Is there only one place linking to the Zend part?
From my point of view there should not be any "missing valid" entries.
There should not be, but there could be. For example, I might add a link with &url.google; but forget to add this entity to the global entity file. With the current setup I will not break the build with this, only that single link. If we remove the automatic missing entity/id creation, it becomes much easier (again) to break the build. This process is used to prevent the readers from suffering from errors made by authors/translators.
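To make the idea concrete, here is a minimal sketch of generating placeholder declarations for entities that are referenced but not defined. It is in Python for illustration only; the function name, the "???" placeholder text and the way the defined names are passed in are all assumptions of mine, not what the build scripts actually do.

```python
import re

# A correctly terminated general entity reference, e.g. '&url.google;'.
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")

# The five entities every XML parser knows without a declaration.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}


def missing_entity_declarations(document, defined):
    """Return placeholder <!ENTITY> lines for referenced but undefined names.

    `defined` is the set of entity names already declared elsewhere
    (hypothetically, collected from the global entity files).
    """
    referenced = set(ENTITY_REF.findall(document))
    missing = sorted(referenced - set(defined) - PREDEFINED)
    # "???" is an arbitrary placeholder so the broken link stays visible.
    return ['<!ENTITY {} "???">'.format(name) for name in missing]


if __name__ == "__main__":
    text = "Search with &url.google; or read about &zend; &amp; friends."
    for line in missing_entity_declarations(text, {"zend"}):
        print(line)
```

With declarations like these written to missing-entities.ent, a forgotten entity only damages the one place where it is used, and the rest of the manual still builds.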
More likely, these "missing valid" entries will turn into interfaces for linking external documentation, so PHP hacks aren't a good fit.
No, these are not always things linking to external docs. Most of the time they are misspelled entity/id names, entity references for which the actual entity definitions were forgotten, or existing documentation becoming broken due to id/entity name changes elsewhere.
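For the id side, the affected names can be pulled out of the xmllint output. The sketch below (Python, for illustration; the function name is hypothetical) relies only on the message format quoted earlier in this mail, 'IDREF attribute linkend references an unknown ID "zend"', and assumes other xmllint versions print the same wording.

```python
import re

# Matches the tail of the validity error quoted above.
UNKNOWN_ID = re.compile(r'references an unknown ID "([^"]+)"')


def unknown_ids(xmllint_output):
    """Collect each unknown ID once, in order of first appearance."""
    seen = []
    for line in xmllint_output.splitlines():
        match = UNKNOWN_ID.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    sample = (
        'element link: validity error : IDREF attribute linkend '
        'references an unknown ID "zend"\n'
    )
    print(unknown_ids(sample))
```

The resulting list shows which ids need either a fix in the source or a placeholder target.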
Goba
