Ken Gengler wrote:
> True, although the <![CDATA[ sequence is only really likely to be seen
> in cases where you're sending an XML document within the XML-RPC
> document. I'm more comfortable putting a restriction on such embedded
> documents than I am on the XML-RPC document. The reason is that I'm
> building an XML DOM explicitly and, thus, I'd have to add a CDATA node
> explicitly to have one in the output. Thus, I can explicitly avoid it
> - and I can easily comply with the restriction. I don't know if that
> would be universally acceptable.

Yes, this depends only on the string you'd like to put into CDATA. If
it's very unlikely that "]]>" will appear inside the string, it is
probably OK to pass it through as is. (I would get problems, as this
sequence was likely to appear within some of my data.) But the CDATA is
of course an optimization against entity parsing and replacing.

> Allowing them only in string tags seem reasonable. Again, I'm using
> them to avoid the encoding hit over large strings. ints or booleans
> should never be more than a few characters.

There were always some exceptions for <string>s in the XML-RPC spec, so
I would say it does no harm to add a sensible one this time. (Speaking
of exceptions, the "<value>textdata...</value>" stupidity was already
removed from that RFC draft.)

> > But the <emptytag/> problem makes me really worry, because it was a
> > really huge slowdown for token parsers to scan for #<emptytag\s*/>#
> > then. Isn't it possible with newer toolkits to instruct them to write
> > the tags in canonicalized form in the first place?
>
> Well, I know that JDOM can be told to do so. And, if other parsers do
> as well, I think this is an acceptable restriction. It just worried
> me when I heard the statement "simple search replacing" as a
> requirement for the server. On any string of decent size, it can become
> a performance problem for a high volume service.

The no-short-empty-tags rule would also apply only to <string>s, because
it cannot happen to <int>, <boolean> or <dateTime...>, which all have a
strong content requirement. So the "simple search replacing" would
really be stupid, even if an XML parser didn't support outputting the
long version of empty tags (which I think is unlikely). But if an XML
lib didn't support it, then a special-case test for an empty string,
writing "<string></string>" into the output stream on behalf of the XML
library, would be acceptable.

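The two string rules discussed above could be sketched roughly like
this in Python. This is only an illustration, not code from either
implementation in the thread: the function names cdata_wrap,
string_value and parse_back are made up for the example. It assumes the
writer emits <string> elements by hand, splits any "]]>" across two
adjacent CDATA sections so the section cannot end early, and
special-cases the empty string so that the short <string/> form is never
written.

```python
import xml.etree.ElementTree as ET


def cdata_wrap(text: str) -> str:
    """Wrap text in CDATA; any "]]>" in the data is split over two sections."""
    # "]]" stays at the end of one section, ">" starts the next one,
    # so the terminator never appears inside a single section.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def string_value(text: str) -> str:
    """Serialize an XML-RPC <string> without ever using the <string/> form."""
    if text == "":
        # Special case written on behalf of the XML library.
        return "<string></string>"
    return "<string>" + cdata_wrap(text) + "</string>"


def parse_back(xml: str) -> str:
    """Read the element back with a real XML parser (round-trip check)."""
    return ET.fromstring(xml).text or ""


if __name__ == "__main__":
    sample = "x < y && a]]>b"
    encoded = string_value(sample)
    print(encoded)
    print(parse_back(encoded) == sample)
```

No entity replacement is needed for "<" or "&" inside the CDATA
sections, which is the whole point of the optimization; the only scan
left is the one for "]]>".
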
> I'm left wondering if there couldn't be some sort of "acceptable
> formats" negotiation in the XML-RPC request. If the client stated that
> it couldn't handle empty tags or CDATA's, etc., the server could avoid
> using them. That way, the server could conserve resources for the
> majority of the clients that work with it. Perhaps this "acceptable
> formats" information could simply be an attribute to the methodCall
> tag?

This is an interesting idea, and I have already thought about it too.
But then we would likely end up with two different XML-RPC versions.
While I personally believe it is an easy task to make really stupid
token parsers work with XML-RPC by adding some restrictions, I'm not
saying this should be done to the disadvantage of XML parser based
solutions. (Not at all!) And I really hope to allow token parsers with
just a few restrictions on the format (nobody ever used XML comments and
namespaces, and tag attributes were never part of the XML-RPC spec).

Adding attribs to <methodCall> looks like a bit of overkill; what has
now come to my mind is abusing the XML processing instruction:

    <?xml version="1.0" encoding="UTF-8"?>

Here it would be possible to add attribs like 'canonicalized="yes"' or
'emptytags="long"' to impose some format restrictions, or even to tell
that the party on the other end is not a full XML parser ('parser="no"').
Actually the processing tag applies to the current document, so this
would be a little bit of an abuse, but as long as nobody tells the
[EMAIL PROTECTED] ;)

Sadly, the w3c hasn't made any recommendations on Simplified XML (and I
don't like to get flamed on the XML mailing list for asking there ;)
But I guess putting some notes into the <?xml...?> tag would be
acceptable - I'll do some research on this.

mario
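
For illustration, the token-parser side of the <?xml...?> idea above
might look like the following Python sketch. The function name
declaration_hints is invented for the example, and the hint names
(emptytags, parser) are just the ones floated in this mail, not anything
standardized. Note that the XML 1.0 grammar only lists version, encoding
and standalone in the declaration, so a conforming parser may well
reject the extra names; the sketch only shows how a simple regex-based
reader could pick them up.

```python
import re

# The declaration must be the very first thing in the document.
_DECL = re.compile(r'<\?xml\s+(.*?)\s*\?>', re.S)
# name="value" or name='value' pairs inside the declaration.
_ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')


def declaration_hints(document: str) -> dict:
    """Return the name/value pairs found in the leading <?xml ...?> tag."""
    decl = _DECL.match(document)
    if decl is None:
        return {}
    hints = {}
    for attr in _ATTR.finditer(decl.group(1)):
        name, double_quoted, single_quoted = attr.groups()
        hints[name] = double_quoted if double_quoted is not None else single_quoted
    return hints


if __name__ == "__main__":
    request = ('<?xml version="1.0" encoding="UTF-8" emptytags="long" '
               'parser="no"?><methodCall></methodCall>')
    hints = declaration_hints(request)
    # A server could then decide to avoid <string/> and CDATA for this client.
    print(hints.get("emptytags"), hints.get("parser"))
```
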