Re: application/rpc+xml, and another RFCication attempt

Mario Salzer Wed, 21 Jan 2004 13:16:33 -0800

Ken Gengler wrote:
> True, although the <![CDATA[ sequence is only really likely to be seen
> in cases where you're sending an XML document within the XML-RPC
> document. I'm more comfortable putting a restriction on such embedded
> documents than I am on the XML-RPC document. The reason is that I'm
> building an XML DOM explicitly and, thus, I'd have to add a CDATA node
> explicitly to have one in the output. Thus, I can explicitly avoid it
> - and I can easily comply with the restriction. I don't know if that
> would be universally acceptable.


Yes, this depends only on the string you'd like to put into CDATA. If
it's very likely that "]]>" won't appear inside the string, it is
probably OK to pass it through as is. (I would run into problems, as
this sequence was likely to appear within some of my data.)

But CDATA is of course an optimization to avoid entity parsing and
replacement.
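
For strings that may contain the "]]>" terminator, one common workaround
is to split the data across two CDATA sections. A minimal sketch in
Python; the helper names to_cdata and round_trip are made up for
illustration and are not part of any XML-RPC library:

```python
import xml.etree.ElementTree as ET


def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" so it cannot end the section early."""
    # "]]>" becomes "]]" + section end + new section start + ">"
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def round_trip(text: str) -> str:
    """Parse a <string> element holding the CDATA form and return its text."""
    return ET.fromstring("<string>" + to_cdata(text) + "</string>").text
```

A real XML parser joins the adjacent sections back into one string, but
a simple token parser would have to do that joining itself, so this only
helps if both ends agree on it.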

> Allowing them only in string tags seems reasonable. Again, I'm using
> them to avoid the encoding hit over large strings. ints or booleans 
> should never be more than a few characters.

There were always some exceptions for <string>s in the XML-RPC spec,
so I would say it doesn't hurt to add a sensible one this time.
(Speaking of exceptions, the "<value>textdata...</value>" stupidity
has already been removed from that RFC draft.)

> > But the <emptytag/> problem makes me really worry, because it was a
> > really huge slowdown for token parsers to scan for #<emptytag\s*/>#
> > then. Isn't it possible with newer toolkits to instruct them to write
> > the tags in canonicalized form in the first place?
>
> Well, I know that JDOM can be told to do so. And, if other parsers do
> as well, I think this is an acceptable restriction. It just worried
> me when I heard the statement "simple search replacing" as a
> requirement for the server. On any string of decent size, it can become 
> a performance problem for a high volume service.

The no-short-empty-tags rule would also apply only to <string>s, because
an empty element can't occur for <int>, <boolean> or <dateTime...>,
which all require content. So the "simple search replacing" would
really be stupid, even if an XML library didn't support outputting the
long version of empty tags (which I think is unlikely). And if an XML
lib really didn't support it, then a special-case test for an empty
string, writing "<string></string>" into the output stream on behalf
of the XML library, would be acceptable.
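
As an illustration of both options, here is a sketch using Python's
xml.etree.ElementTree, which has a short_empty_elements switch on
tostring(). The function names are hypothetical, and other toolkits
will spell the option differently:

```python
import xml.etree.ElementTree as ET


def string_value(text: str) -> str:
    """Serialize <value><string>...</string></value> with long-form empty tags."""
    value = ET.Element("value")
    ET.SubElement(value, "string").text = text
    # short_empty_elements=False makes ElementTree write <string></string>
    # instead of the short form <string />.
    return ET.tostring(value, encoding="unicode", short_empty_elements=False)


def string_value_guarded(text: str) -> str:
    """Same output without the switch: special-case the empty string."""
    if text == "":
        # Written by hand on behalf of the XML library.
        return "<value><string></string></value>"
    value = ET.Element("value")
    ET.SubElement(value, "string").text = text
    # A non-empty string never serializes to the short form.
    return ET.tostring(value, encoding="unicode")
```

Either way the sender pays one cheap test per string, and nobody has
to search and replace in the finished document.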

> I'm left wondering if there couldn't be some sort of "acceptable 
> formats" negotiation in the XML-RPC request. If the client stated that 
> it couldn't handle empty tags or CDATA's, etc., the server could avoid 
> using them. That way, the server could conserve resources for the 
> majority of the clients that work with it. Perhaps this "acceptable 
> formats" information could simply be an attribute to the methodCall 
> tag?

This is an interesting idea, and I have already thought about it too.
But then we would likely end up with two different XML-RPC versions.
While I personally believe it is easy to make really stupid token
parsers work with XML-RPC by adding some restrictions, I'm not saying
this should be done to the disadvantage of XML parser based solutions.
(Not at all!)
And I really hope to allow token parsers with just a few restrictions
on the format (nobody ever used XML comments and namespaces, and tag
attributes were never part of the XML-RPC spec).

Adding attribs to <methodCall> looks like a bit of overkill; what came
to my mind just now is abusing the XML declaration:
<?xml version="1.0" encoding="UTF-8"?>

Here it would be possible to add attribs like 'canonicalized="yes"'
or 'emptytags="long"' to impose some format restrictions, or even to
tell that the party on the other end is only a limited XML parser
('parser="no"'). Actually the declaration applies to the current
document, so this is a bit of an abuse, but as long as nobody tells
the [EMAIL PROTECTED] ;)
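
A rough sketch of how a token parser might read such hints, in Python.
The function name declaration_hints is made up, and the extra
attributes are only the proposal above, not part of any spec. Note
that XML 1.0 allows only version, encoding and standalone in the
declaration, so a conforming XML parser is likely to reject anything
else there:

```python
import re

# Matches the declaration at the very start of the document.
DECL = re.compile(r"^<\?xml\s+(.*?)\?>", re.S)
# Matches name="value" or name='value' pairs inside it.
ATTR = re.compile(r"""([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def declaration_hints(document: str) -> dict:
    """Return all name/value pairs found in the leading <?xml ...?> line."""
    match = DECL.match(document)
    if match is None:
        return {}
    # Exactly one of the two quote groups is filled for each pair.
    return {name: dq or sq for name, dq, sq in ATTR.findall(match.group(1))}
```

A server could then look up hints.get("emptytags") == "long" before
choosing how to serialize its response.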

Sadly, the W3C hasn't made any recommendations on Simplified XML (and
I don't want to get flamed on the XML mailing list for asking there ;)
But I guess putting some notes into the <?xml...?> tag would be
acceptable - I'll do some research on this.

mario
