Hi Andreas,

Andreas L Delmelle wrote:
> On Dec 18, 2007, at 13:17, Vincent Hennebert wrote:
> 
> <snip />
>>>> IIUC, this check isn’t necessary? If test children weren’t allowed this
>>>> would have been caught by the validateChildNode method?
>>>
>>> No, because validateChildNode() is not called for text-nodes.
>>
>> Hmmm, this is unfortunate IMO. That makes validation unreliable, and
>> that forces to implement hacks like in Wrapper.
> 
> Unfortunate, maybe, but I'm not sure why implementing an override for a
> method that contains behavior that is specific to instances of the
> Wrapper class should be considered a hack...?

Because Wrapper is no longer agnostic, and contains an if statement that 
could be avoided. Whether text nodes are acceptable or not should be 
determined in the same way as for validateChildNode, by deferring to the 
parent node.
If validateChildNode fails, then no addChildNode method is called; that 
should be the same for addCharacters IMO (call it addTextNode).

> FTM I'm trying to fit it in in the current design as cleanly as
> possible.

Valid concern, sorry if I appeared rude. Still, it’s a hack, so it 
should either be removed, or temporarily left as is (even if it’s meant 
to stay in the code for years ;-) ) but properly documented as such. 
A TODO comment, with basically the few lines you wrote to answer my 
question, would do the job (and then I wouldn’t have asked the question 
:-P).
I’m certainly not against introducing hacks in the code, I’ve done that 
myself in a few places, but they should clearly appear as such IMO, so 
that newcomers aren’t lost or misled.


> It's good to see that the question is raised, instead of my
> changes being blindly accepted.

Well after all it’s an opportunity to explore the topic in more extent 
:-)


> Then again, as I indicated earlier, something seems fishy/awkward about
> hardcoding the validation rules in the FOs, since this leaves too little
> room for adding extensions.

That can be discussed independently in my opinion, so I won’t enter the 
details too much. Basically validateChildNode should only deal with 
elements from the XSL-FO namespace. The validation of foreign element 
belongs to another process that will indeed have to be discussed 
somehow, somewhere, at some moment ;-)


> IMO, we need some way of registering valid child nodes at runtime. The
> default rules in the Recommendation can be fixed at compile time, but
> there is also still room for improvement.
> 
> Note, for instance, that there's a significant number of FOs that allow
> no children, and this leads to a lot of code-duplication, as they all
> individually implement the method like:
> 
> protected void validateChildNode(...) throws ValidationException {
>   invalidChildError(..., "XSL Content Model: empty");
> }

Actually this is more something like “invalidChildError(loc, nsURI, 
localName);”, which is different (!), and which I’m happy with, see 
below.


> I was thinking in the direction of using a dummy marker interface, so
> the duplicate overrides can all be removed, and consolidated in the
> superclass.
> 
> For example:
> 
> public interface EmptyFObj { //empty body, only used as a marker
> interface, like Cloneable or Serializable }
> ...
> 
> public class ExternalGraphic extends AbstractGraphics implements
> EmptyFObj { ... }
> ...
> 
> in FObj.validateChildNode():
> 
> if (this instanceof EmptyFObj) {
>   invalidChildError(..., "XSL Content Model: empty");
> }

While this is a very sensible solution, I’m a bit uncomfortable with it. 
This would put validation at two different levels: one in FObj, which 
will test for every EmptyFObj; and one in each remaining non EmptyFObj 
class, further down in the hierarchy. Moreover an if test would now be 
performed for each FObj instance. That kills a bit the object-oriented 
design IMO. It seems more consistent to me to rely on method 
re-definition, after all this is the very purpose of object inheritance.

Actually the current situation looks fine to me, since the methods do 
nothing but deferring to a common invalidChildError method (at least all 
the methods I’ve checked). So what can be factorised has already been 
done.

The ideal solution is probably to use object composition, with 
a delegate object dedicated to validation. FObjs would be passed the 
appropriate object at their creation. We would have 
a NoChildAllowedValidator, a MixedContentValidator, etc. Then only one 
NoChildAllowedValidator instance would be needed for every FObj with 
empty content. And for those FObjs which have a particular content 
model, we could use an anonymous class etc... A bit overkill maybe ;-), 
that’s why I’m happy with the current situation as long as nothing more 
is done than calling a common factorised method.


>>> One could argue that the default characters() event should also do
>>> some
>>> form of validation, but that could turn out to be very tricky. The SAX
>>> parser is obliged to report *all* characters (including possible
>>> ignorable whitespace). We would already need to perform
>>> space-normalization here to make sure there really are stray characters
>>> in places where they do not belong...
>>
>> Well it’s not the same kind of normalization as required by the
>> Recommendation. For objects accepting mixed content we would accept any
>> kind of text node, and the proper normalization would indeed be done
>> later.
>> For the other objects we just need to know if the text node contains XML
>> whitespace only (space, tab, line-feed, carriage-return). Something like
>> String.trim().length() == 0 should be enough for the validation. WDYT?
> 
> Hmm... but in some cases, the white-space could be significant
> (depending on the FO properties
> white-space-treatment/white-space-collapse/linefeed-treatment of the
> containing block, if any).
> 
> Take:
> 
> <fo:block white-space-treatment="preserve">
> **<fo:block-container>
> ****<fo:wrapper>
> ******<fo:block>
> 
> The *'s are preserved space-characters. The first two are allowed, the
> other six are, very strictly speaking an error (albeit, an easily
> recoverable one).

(And what about the 4 in the middle?)
I don’t agree with your interpretation. I’d say:
- fo:block accepts PCDATA, so the first 2 stars will be recorded and 
  subject to further whitespace refinement (and appear to be retained).
- fo:block-container doesn’t accept PCDATA, so the 4 stars in the middle 
  will be silently ignored as XML whitespace characters resulting from 
  some pretty-printing process.
- the last 6 stars will be silently ignored as well, due to the content 
  model of the wrapper’s parent; had this parent been an fo:block, they 
  would have been recorded as text and be subject to further whitespace 
  refinement (and retained).

WDYT?
Vincent


-- 
Vincent Hennebert                            Anyware Technologies
http://people.apache.org/~vhennebert         http://www.anyware-tech.com
Apache FOP Committer                         FOP Development/Consulting

Reply via email to