 in such an approach and I am sure that not many XML parsers will like
 CDATA blocks of several megabytes.

_All_ XML parsers cope with CDATA blocks of several megabytes.
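
For what it's worth, here is a quick way to check that claim against one parser. It is only a sketch: it assumes Python's standard-library ElementTree (backed by expat), and the `<blob>` element name and the 5 MiB size are made up for illustration. It says nothing about other parsers.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: 5 MiB of filler text standing in for embedded data.
payload = "x" * (5 * 1024 * 1024)

# Wrap the payload in a single CDATA section inside a made-up <blob> element.
doc = "<blob><![CDATA[" + payload + "]]></blob>"

# Parse the whole document in memory and read the text back.
root = ET.fromstring(doc)
print(len(root.text) == len(payload))
```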

But the fact is that you're going to end up having to Base64-encode all the image data, which inflates the physical file size by roughly a third (every 3 bytes of input become 4 characters of output). And if you don't do that (i.e. attempt to leave in raw binary data), then you are violating the spirit of XML's design goals.
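
To put a number on the Base64 overhead, here is a rough sketch in Python. The `<image>` element, its `encoding` attribute, and the helper names `embed_image` / `extract_image` are all invented for this example, and random bytes stand in for a real image.

```python
import base64
import os
import xml.etree.ElementTree as ET


def embed_image(raw: bytes) -> str:
    """Return an XML string holding the bytes as Base64 text (hypothetical layout)."""
    elem = ET.Element("image", encoding="base64")
    elem.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def extract_image(xml_text: str) -> bytes:
    """Recover the original bytes from the XML produced by embed_image."""
    return base64.b64decode(ET.fromstring(xml_text).text)


# Random bytes standing in for a 3 MiB image file.
raw = os.urandom(3 * 1024 * 1024)
doc = embed_image(raw)

# Base64 emits 4 characters per 3 input bytes, so the ratio is about 1.33.
encoded_len = len(base64.b64encode(raw))
print(encoded_len / len(raw))

# The round trip should give back exactly the same bytes.
print(extract_image(doc) == raw)
```

The size penalty is fixed by the encoding itself, so compressing the finished XML file is about the only way to claw some of it back.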

