Steven Noels wrote:
Stefano Mazzocchi wrote:

I'm more and more considering sitemap validation harmful.

why:

1) the sitemap logic is too hard to be validated by any validation language (it requires java runtime capabilities)

2) it reduces the effort put into clean and meaningful error messages in the treeprocessor


'Interesting' perspective, to say the least.

Some thoughts:

1) http://outerthought.net/downloads/sitemap.pdf and http://outerthought.net/downloads/sitemap_a4_poster.pdf

cat /usr/local/apache/logs/access_log | grep sitemap.pdf | wc -l -> 1825 downloads in 3 months (dec-jan-feb). Add some 2500 in the 4 months preceding that period, and another 2500 for the poster version, and that brings us to 6825 downloads in 7 months, or 975 downloads / month for Bruno's sitemap poster.

... which means there's a _vested_ interest in trying to understand the sitemap, and people are even willing to look at some graphical depiction of it in order to understand.

2) In our experience, when we confront people with the sitemap, they are bewildered until we give them a copy of Pollo with the sitemap grammar loaded into it and some very basic customization (http://pollo.sourceforge.net/sitemap1.png). I assume the same happens when people see Sunbow. Needless to say, having 3 different grammars for the sitemap (XSD, RNG and a Pollo-specific grammar) is a major PITA, so some rationalization is more than appropriate.

3) Some days ago, when investigating http://marc.theaimsgroup.com/?t=104643526200004&r=1&w=2, I encountered a way to 'address' a matched group of a matcher pattern when nesting matchers which I had never heard of, and I have already forgotten it ATM. :-( I can say for myself that I make a reasonable effort to keep up with new-things-Cocoon, but it was something I clearly missed. I'm pretty sure it is only 'documented in code' or on the mailing list somewhere.

Did I say that I consider having a sitemap schema descriptor harmful?


No, damn, I just said that I consider using that schema to validate the sitemap harmful.

Example, try

<generate uri="..."/>

where the uri attribute is not allowed in generate (it should be 'src'). The treeprocessor totally ignores this and sends the empty string to the parser, resulting in the error

System ID not found!

Sitemap validation has stopped us from fixing the error reporting for mistakes like this one.
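
To make the point concrete, here is a rough sketch of the kind of check that would turn the case above into a readable message. It is not the treeprocessor's actual code: the class AttributeCheck, the method check, and the allowed attribute set passed to it are all made up for illustration, and a plain Map stands in for whatever configuration object the treeprocessor really uses.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of Cocoon. */
final class AttributeCheck {

    private AttributeCheck() {
    }

    /**
     * Returns null when the attributes are acceptable. Otherwise returns a
     * message that names the element and the attribute that caused trouble.
     */
    static String check(String element, Map<String, String> attributes,
                        Set<String> allowed, String required) {
        // Report unknown attributes instead of silently ignoring them.
        for (String name : attributes.keySet()) {
            if (!allowed.contains(name)) {
                return "<" + element + ">: attribute '" + name
                        + "' is not allowed here; allowed attributes are "
                        + new TreeSet<>(allowed);
            }
        }
        // Report a missing required attribute before an empty string
        // reaches the parser.
        String value = attributes.get(required);
        if (value == null || value.isEmpty()) {
            return "<" + element + ">: required attribute '" + required
                    + "' is missing or empty";
        }
        return null;
    }
}
```

With the attributes of the generate element above (only uri) and an assumed allowed set of src and type, check returns a message that names 'uri' and lists the allowed attributes, which is at least something a user can act on.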


I don't parse this: in what way does the sitemap validation relieve somebody of the task of properly handling exceptions on the code level?

The level of error-checking in the treeprocessor isn't really that pretty, and you know why? Because validation removed most of the mistakes that *us* developers make... but when users don't validate, they get *weird* error messages that don't give them *any* clue whatsoever on how to fix the problem.


My reasoning is that if we didn't have validation, we would see the same mistakes the users see and fix the treeprocessor instead of patching the validation phase more and more.


