[Gnu-arch-users] merging OpenOffice formats and the like

Thomas Lord Tue, 11 Oct 2005 08:51:59 -0700

The idea of XML diffing for applications like OpenOffice
worries me a bit:

Such diffing/merging algorithms can either be generic -- 
they operate on any two or three XML trees -- or application
specific, tuned for a specific schema.


If generic, we'd expect them to work pretty well for very 
minor differences (e.g., someone rewords a couple of paragraphs)
but fail gracelessly for larger changes.  Naively done, 
a merge could produce output which is not valid to the application:
it might fail to conform to the schema or, even if that bug is
avoided, might produce far coarser conflict reports than one
would like (e.g., "Sure, there are only 20 changed words in these
five pages but all the application can say is `these 5 pages
conflict'.")
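To make the schema point concrete, here is a minimal sketch in Python. It is not any real merge tool: `naive_merge3`, `title_count`, and the rule "a document has at most one <title>" are all assumptions invented for illustration, and the merge only handles texts with the same number of lines. Two people each add a title on a different blank line; the textual merge is perfectly clean, and the result is still not what the application would accept.

```python
import xml.etree.ElementTree as ET


def naive_merge3(base, ours, theirs):
    """Line-by-line three-way merge.

    Assumes all three inputs are lists with the same number of lines;
    a real merge tool aligns lines first, this sketch does not.
    """
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:          # both sides agree (changed or not)
            merged.append(o)
        elif o == b:        # only "theirs" changed this line
            merged.append(t)
        elif t == b:        # only "ours" changed this line
            merged.append(o)
        else:               # both changed it differently
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged


def title_count(lines):
    """Count <title> children of the root element."""
    root = ET.fromstring("\n".join(lines))
    return len(root.findall("title"))


BASE = ["<doc>", "", "", "<p>text</p>", "</doc>"]
OURS = ["<doc>", "<title>Draft</title>", "", "<p>text</p>", "</doc>"]
THEIRS = ["<doc>", "", "<title>Report</title>", "<p>text</p>", "</doc>"]

# No textual conflict is reported, yet the merged tree has two titles,
# which breaks the (hypothetical) "at most one <title>" schema rule.
merged = naive_merge3(BASE, OURS, THEIRS)
print(title_count(merged))  # 2
```

The output is well-formed XML, so a generic tool has no reason to complain; only something that knows the schema would notice.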

If application-specific, and we'd like all our personal productivity
apps to have these "teamware" capabilities, then conjuring up
format-specific diff/merge tools is yet another thing (like 
component architectures, configuration subsystem hooks, etc.)
which every application is supposed to have.   What tends to 
happen with such architectures is that many applications either
don't have those parts at all or have them in a poor form.  That
state of affairs is *slowly* corrected over time in response to
perceived demand but then only at the cost of considerable bloat
in the overall stack.   You get enough such "just one more thing"
features that well-behaved applications are supposed to have and
pretty soon you have an intractable system.

I think it's saner to realize that merging is inevitably going
to screw up syntax from time to time.   A lesson from merging
of software source texts is that if users can handle hacking
the source directly, the syntax screw-ups aren't a serious
problem -- perhaps even a virtue, since users can see the result
and infer what's going on.

I don't think we can reasonably expect average users to start
learning XML, the schemas their applications demand, and the
semantic constraints on those trees.  XML won't do as the form
in which to present data file source to users, although we do
want something of comparable programmer-simplicity and expressive
power.

On the other hand, we know from history that we *can* expect
average users to learn "technical typing" systems -- like 
the classic mark-up languages of yore.  And we know from recent
history that with a bit of care, this kind of syntax can be 
clean enough that source texts are legible (even attractive)
and easy to learn through simple intuitive leaps.

There's a psychological shift that I don't think is too scary
for users and that might even be a bit demystifying and helpful
to them:  that your nice WYSIWYG tools are just fancy interfaces
to a far more banal and not terribly scary mark-up style plain-text
data file.  (Certainly there are editors that are already close
to this -- some for TeX, for example, although TeX is not a
particularly friendly or attractive markup language.)

If we accept that users can make that shift, and that the mark-up
source is more accessible, easier to type, and more fault tolerant
than typical uses of XML -- then every application has a generic
fallback to use when syntactically invalid input is encountered:
just throw the source up at the user and let her puzzle it out.
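The fallback described above can be sketched roughly as follows, in Python. Everything here is hypothetical: the toy markup (one paragraph per line, `*...*` for emphasis), the `MarkupError` class, and the `parse` and `open_document` functions are assumptions made up for illustration, not any existing application's format or API. The only point is the shape of the control flow: try to parse, and if that fails, hand the raw text and the error location back so the user can fix it by hand.

```python
class MarkupError(Exception):
    """Raised by the toy parser; remembers the offending line number."""

    def __init__(self, line_no, reason):
        super().__init__(f"line {line_no}: {reason}")
        self.line_no = line_no


# Leftovers a text-level merge typically writes into a conflicted file.
CONFLICT_MARKERS = ("<<<<<<<", "=======", ">>>>>>>")


def parse(source):
    """Toy parser: each non-empty line is one paragraph."""
    paragraphs = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if line.startswith(CONFLICT_MARKERS):
            raise MarkupError(line_no, "unresolved merge conflict")
        if line.count("*") % 2:
            raise MarkupError(line_no, "unbalanced emphasis marker")
        if line.strip():
            paragraphs.append(line.strip())
    return paragraphs


def open_document(source):
    """Return (mode, payload, error).

    mode is "wysiwyg" with parsed paragraphs when the source is valid,
    otherwise "source" with the untouched text and the error to show.
    """
    try:
        return ("wysiwyg", parse(source), None)
    except MarkupError as err:
        return ("source", source, err)


good = "Hello *world*\n\nSecond paragraph\n"
bad = "Intro\n<<<<<<< mine\nred\n=======\nblue\n>>>>>>> theirs\n"

print(open_document(good)[0])  # wysiwyg
mode, text, error = open_document(bad)
print(mode, error)             # source line 2: unresolved merge conflict
```

Because the fallback path needs nothing but a text editor and a line number, it costs each application very little, which is the argument for a legible plain-text source format in the first place.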

-t





_______________________________________________
Gnu-arch-users mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnu-arch-users

GNU arch home page:
http://savannah.gnu.org/projects/gnu-arch/
