The idea of XML diffing for applications like OpenOffice worries me a bit: such diffing/merging algorithms can either be generic -- they operate on any two or three XML trees -- or application-specific, tuned for a particular schema.
If generic, we'd expect them to work pretty well for very minor differences (e.g., someone rewords a couple of paragraphs) but fail gracelessly for larger changes. Naively done, a merge could produce output which is not valid to the application: it might fail to conform to the schema. Even if that bug is avoided, it might produce far coarser conflict reports than one would like (e.g., "Sure, there are only 20 changed words in these five pages but all the application can say is 'these 5 pages conflict'.")

If application-specific, and we'd like all our personal productivity apps to have these "teamware" capabilities, then conjuring up format-specific diff/merge tools is yet another thing (like component architectures, configuration subsystem hooks, etc.) which every application is supposed to have. What tends to happen with such architectures is that many applications either don't have those parts at all or have them in a poor form. That state of affairs is *slowly* corrected over time in response to perceived demand, but only at the cost of considerable bloat in the overall stack. You get enough such "just one more thing" features that well-behaved applications are supposed to have and pretty soon you have an intractable system.

I think it's saner to realize that merging is inevitably going to screw up syntax from time to time. A lesson from merging of software source texts is that if users can handle hacking the source directly, the syntax screw-ups aren't a serious problem -- perhaps even a virtue, since users can see the result and infer what's going on.

I don't think we can reasonably expect average users to start learning XML, the schemas their applications demand, and the semantic constraints on those trees. XML won't do as the form in which to present data file source to users, although we do want something of comparable programmer-simplicity and expressive power.
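To make the coarse-conflict worry concrete, here is a rough sketch of a generic three-way merge that works at the granularity of the root element's direct children. It is not any real tool's algorithm; the names `naive_merge` and `text_of` and the sample documents are made up for illustration, and it assumes edits happen in place (no inserted or deleted elements).

```python
import xml.etree.ElementTree as ET

def text_of(elem):
    """Serialize an element so two subtrees can be compared for equality."""
    return ET.tostring(elem, encoding="unicode")

def naive_merge(base_xml, ours_xml, theirs_xml):
    """Hypothetical three-way merge over the root's direct children.

    Returns (merged_children, conflicts). A child changed on only one side
    is taken from that side; a child changed differently on both sides is
    reported as a conflict for the whole subtree.
    """
    base = list(ET.fromstring(base_xml))
    ours = list(ET.fromstring(ours_xml))
    theirs = list(ET.fromstring(theirs_xml))
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("sketch only handles in-place edits")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        b_s, o_s, t_s = text_of(b), text_of(o), text_of(t)
        if o_s == t_s:        # both sides agree (or neither changed)
            merged.append(o_s)
        elif o_s == b_s:      # only "theirs" changed
            merged.append(t_s)
        elif t_s == b_s:      # only "ours" changed
            merged.append(o_s)
        else:                 # both changed: the whole child conflicts
            merged.append(b_s)
            conflicts.append(i)
    return merged, conflicts

BASE = ("<doc><p>The quick brown fox jumps over the lazy dog.</p>"
        "<p>Second paragraph.</p></doc>")
OURS = ("<doc><p>The fast brown fox jumps over the lazy dog.</p>"
        "<p>Second paragraph.</p></doc>")
THEIRS = ("<doc><p>The quick brown fox jumps over the sleepy dog.</p>"
          "<p>Second paragraph, reworded.</p></doc>")

merged, conflicts = naive_merge(BASE, OURS, THEIRS)
print("conflicting children:", conflicts)
```

Each side changed one word in the first paragraph, and the two words do not overlap, yet all this merge can say is that the whole paragraph conflicts. Scale the unit up from a paragraph to a section and you get the "these 5 pages conflict" report.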
On the other hand, we know from history that we *can* expect average users to learn "technical typing" systems -- like the classic mark-up languages of yore. And we know from recent history that, with a bit of care, this kind of syntax can be clean enough that source texts are legible (even attractive) and easy to learn through simple intuitive leaps.

There's a psychological shift that I don't think is too scary for users and that might even be a bit demystifying and helpful to them: that your nice WYSIWYG tools are just fancy interfaces to a far more banal and not terribly scary mark-up-style plain-text data file. (Certainly there are editors that are already close to this -- some for TeX, for example, although TeX is not exactly a friendly or attractive markup language.)

If we accept that users can make that shift, and that the mark-up source is more accessible, easier to type, and more fault tolerant than typical uses of XML, then every application has a generic fallback to use when syntactically invalid input is encountered: just throw the source up at the user and let her puzzle it out.

-t

_______________________________________________
Gnu-arch-users mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnu-arch-users

GNU arch home page:
http://savannah.gnu.org/projects/gnu-arch/
