On Thu, 2003-08-14 at 22:58, Nathan Carl Summers wrote:
> > I haven't heard a single good argument for it except that it can do
> > most of the things that the XML/archive approach can do.
> s/most/all, and many other good things besides.
> > There was however nothing mentioned that it can do better. Or did I miss
> > something?
> XML is a text markup language. If the designers thought of using it for
> raster graphics, it was an afterthought at best. XML is simply the wrong
> tool for the job. The XML/archive idea is the software equivalent of
> making a motorcycle by strapping a go-cart engine to the back of a
> bicycle. It will work, of course, but it's an inelegant hack that will
> never be as nice as something designed for the job.
I think it is an elegant solution to the problem of designing a file
format w/o knowing beforehand what will have to go into it. I don't
think that binary chunks are feasible for a format that will have to be
extended a lot while it is already in use. None of the file formats
mentioned provide this extensibility and I think it is essential here.
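To make the idea concrete, here is a minimal sketch of such a file: an ordinary ZIP archive whose manifest.xml describes the structure and points at the binary members. The manifest schema and file names here are invented for illustration, not a proposal; the point is that new elements and attributes can be added later and old readers simply ignore what they don't know.

```python
import io
import zipfile

# Hypothetical manifest schema -- invented for this example.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<image width="64" height="64">
  <layer name="background" src="data/layer0.raw"
         width="64" height="64" mode="normal"/>
</image>
"""

# The container is a plain ZIP archive: one XML manifest,
# plus one member per chunk of binary data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("manifest.xml", MANIFEST)
    zf.writestr("data/layer0.raw", bytes(64 * 64 * 4))  # raw RGBA pixels
```

A reader written before, say, layer groups were added would still parse this manifest fine and skip the elements it does not recognize.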
> But to answer your question:
> 1. Putting metadata right next to the data it describes is a Good Thing.
> The XML "solution" arbitrarily separates human readable data from binary
> data. No one has yet considered what is to be done about non-human
> readable metadata, but I imagine it will be crammed into the archive file
> some way, or Base64ed or whatever. Either way is total lossage.
How is metadata in the archive total lossage? If the metadata is binary
it should of course be treated just like image data.
> 2. Imagine a very large image with a sizeable amount of metadata. If this
> seems unlikely, imagine you have some useful information stored in
> parasites. The user in our example only needs to manipulate a handful of
> layers. A good way of handling this case is to not load everything into
> memory. Say that it just parses out the layer list at the start, and then
> once a layer is selected and the metadata is requested, it is read in.
> With the XML proposal, the parser would have to parse through every byte
> until it gets to the part it is interested in, which is inefficient.
The XML parser would only have to read in the image structure which
tells it where to locate the actual data in the archive, nothing else.
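The lazy-loading scheme described above can be sketched as follows, again assuming the hypothetical manifest layout from earlier: only the small manifest is parsed up front, and an individual layer's bytes are read from the archive on demand.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def make_example() -> bytes:
    """Build a small in-memory example archive (invented schema)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("manifest.xml",
                    '<image><layer name="bg" src="data/l0.raw"/>'
                    '<layer name="fg" src="data/l1.raw"/></image>')
        zf.writestr("data/l0.raw", b"\x00" * 16)
        zf.writestr("data/l1.raw", b"\xff" * 16)
    return buf.getvalue()

def layer_names(archive: bytes) -> list:
    # Only the manifest is parsed; no pixel data is touched.
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        root = ET.fromstring(zf.read("manifest.xml"))
    return [layer.get("name") for layer in root.iter("layer")]

def load_layer(archive: bytes, name: str) -> bytes:
    # Read just the one member the manifest points at.
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        root = ET.fromstring(zf.read("manifest.xml"))
        for layer in root.iter("layer"):
            if layer.get("name") == name:
                return zf.read(layer.get("src"))
    raise KeyError(name)

data = make_example()
print(layer_names(data))           # ['bg', 'fg']
print(load_layer(data, "fg")[:4])  # b'\xff\xff\xff\xff'
```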
> 4. Implementing a reader for the XML/archive combo is unnecessarily
> complex. It involves writing a parser for the semantics and structure of
> XML, a parser for the semantics and structure of the archive format, and a
> parser for the semantics and structure of the combination. It is true
> that libraries might be found that are suitable for some of the work, but
> developers of small apps will shun the extra bloat, and such libraries
> might involve licensing fun.
We already depend on an XML parser right now. I don't see any
problem here. I do know, however, that the code that reads formats like
TIFF or PNG is ugly and almost unreadable, while SAX-based XML parsers
tend to be darn simple.
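To back up that claim, here is a complete SAX-based reader for the layer list of the hypothetical manifest used above: one handler class, one callback. This is a sketch, not a proposed implementation.

```python
import xml.sax

class LayerListHandler(xml.sax.ContentHandler):
    """Collect the name attribute of every <layer> element."""
    def __init__(self):
        super().__init__()
        self.layers = []

    def startElement(self, name, attrs):
        if name == "layer":
            self.layers.append(attrs.get("name"))

def list_layers(manifest: str) -> list:
    handler = LayerListHandler()
    xml.sax.parseString(manifest.encode("utf-8"), handler)
    return handler.layers

print(list_layers('<image><layer name="bg"/><layer name="fg"/></image>'))
# ['bg', 'fg']
```

Because SAX is event-driven, the handler never builds a full document tree, so even a huge manifest costs little memory.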
> The semantics and structure of the
> combination is not a trivial aspect -- with a corrupt or buggy file, the
> XML may not reflect the contents of the archive. With an integrated
> approach, this is not a concern.
I don't see how an integrated approach handles this problem any better:
a corrupt or buggy file can misrepresent its contents in any format.
> 5. Either the individual layers will be stored as valid files in some
> format, or they will be stored as raw data. If they are stored as true
> files, they will be needlessly redundant and we will be limited to
> whatever limitations the data format we choose uses. If we just store raw
> data in the archive, then it's obvious that this is just a kludge around
> the crappiness of binary data in XML.
I don't understand you. If you think that raw data is a good idea, we
can have raw data in the XML archive. Allowing a set of existing
file formats to be embedded, however, makes the definition of our format
a lot simpler and lets us reuse various established compression
techniques.
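As a sketch of what embedding an established format per layer could look like: each layer is stored as a complete PNG member in the archive, so a reader reuses an existing PNG decoder and gets PNG's compression for free. The manifest schema and file names are again invented for illustration; the tiny encoder below emits a minimal but valid PNG from raw RGBA pixels.

```python
import io
import struct
import zipfile
import zlib

def tiny_png(width: int, height: int, rgba: bytes) -> bytes:
    """Encode raw RGBA pixels as a minimal valid PNG."""
    def chunk(tag: bytes, payload: bytes) -> bytes:
        # Length, tag, payload, CRC over tag+payload -- per the PNG spec.
        return (struct.pack(">I", len(payload)) + tag + payload
                + struct.pack(">I", zlib.crc32(tag + payload)))
    # One filter byte (0 = None) in front of each scanline.
    raw = b"".join(b"\x00" + rgba[y * width * 4:(y + 1) * width * 4]
                   for y in range(height))
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw))
            + chunk(b"IEND", b""))

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("manifest.xml",
                '<image><layer name="bg" src="layers/bg.png"/></image>')
    zf.writestr("layers/bg.png", tiny_png(2, 2, b"\x80" * 16))
```

Swapping the per-layer format later (say, to something with better alpha handling) would then only change what the manifest's src attribute points at, not the container itself.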
Gimp-developer mailing list