Hi Alex,
I've just been catching up on my mailing-list reading and noticed you've
raised this issue here and also in opendocument-users and xsl-list.
I'll answer here with a DocBook-specific response, although the same
arguments apply to other documentation formats. Perhaps this
discussion belongs in docbook-apps and should be continued there?
We (DeltaXML) currently provide a general-purpose compare for well-formed
XML and a DocBook-specific compare product. We are also working
on 3-way, or more generally n-way, 'Merge' products that will have more
relevance when used with revision control systems. However, Thomas
has already given you some good advice: an existing VCS, perhaps
with additional normalization steps, may meet your requirements.
You asked whether people use software revision control for documents. The
answer is certainly yes. As a company, we do this; we version our
product documentation with our source code. Until around 2 years ago we
used Subversion (svn) and it's now mainly Mercurial (hg); git would have
been an appropriate alternative. Other approaches are to use a content
management system (CMS) or a filesystem with WebDAV/DeltaV support. I
was recently at the DITA Europe 2012 conference (yes, I know, but I
suspect the answer is equally applicable to DocBook) and tried to assess
the use of software version control. When talking to people I often
asked "do you use a CMS or a software version control system?" and the
result was around 50/50. I suspect that software/IT companies have the
expertise readily available and, like us, find it easier to adapt to
using software version control.
Your follow-on question/email asked about line-based vs xml-aware
comparison:
On 08/12/2012 16:25, Alex S wrote:
> Thank you for your time & responses. I remember reading somewhere that
> a pure text/linear comparison based tool/system may not be ideal to
> compare & merge XML tree structure based documents.
When an XML-aware algorithm is used as part of the merge/update process,
better results are possible. For example, consider a user on one branch
using an editing or authoring tool that rearranges attributes, for
example reordering them or re-indenting them over multiple lines. When
the branches are merged, a line-based algorithm is likely to report a
"false conflict", whereas an XML-aware algorithm shouldn't identify a
change at all.
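As a rough illustration of the attribute case (a sketch only, not how our
products work), the two serializations below differ as text but compare
equal once canonicalized. It assumes Python 3.8 or later for
xml.etree.ElementTree.canonicalize; the para element and its attribute
values are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same element. In the second, the attributes are
# reordered and split over two lines, as an authoring tool might do on save.
MINE = '<para role="intro" revision="2">In this example...</para>'
THEIRS = '<para revision="2"\n      role="intro">In this example...</para>'


def same_xml(a: str, b: str) -> bool:
    """Compare two XML strings after C14N 2.0 canonicalization."""
    return canonicalize(a) == canonicalize(b)


# A text comparison sees a difference; the canonical forms are identical.
line_based_equal = MINE == THEIRS
xml_aware_equal = same_xml(MINE, THEIRS)
```

Running a canonicalization step like this before commit is one form of the
"additional normalization" mentioned earlier: it removes this class of
false conflict even when the merge itself stays line-based.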
Taking this a step further, you can do more if the tools/algorithms
understand the grammar or XML format being processed. Here's a DocBook
5 example: in the ancestor revision there is a section with a title and
an itemized list:
<sect1>
<title>Merge example</title>
<para>In this example...</para>...
In one branch a user adds an indexterm; in another branch a revhistory
is added.
The ancestor revision used in 3-way diff or merge algorithms allows them
to work out that different sets of lines have been inserted at the same
point (relative to the ancestor). Because the inserted lines are not
identical, the result is a conflict. This is the conflict from Mercurial:
<sect1>
<title>Merge example</title>
<<<<<<< mine
<indexterm><primary>Revision Control</primary></indexterm>
=======
<revhistory>
<revision>
<date>2012-12-12</date><revdescription><simpara>Testing
hg</simpara></revdescription>
</revision>
</revhistory>
>>>>>>> theirs
<para>In this example...</para>
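The line-based reasoning behind that conflict can be sketched in a few
lines of Python. This is only an illustration using difflib, not
Mercurial's actual merge code, and the revhistory lines are abbreviated;
the function names are made up for the example.

```python
from difflib import SequenceMatcher

ANCESTOR = [
    "<sect1>",
    "<title>Merge example</title>",
    "<para>In this example...</para>",
]
# Each branch inserts different lines between the title and the para.
MINE = ANCESTOR[:2] + [
    "<indexterm><primary>Revision Control</primary></indexterm>",
] + ANCESTOR[2:]
THEIRS = ANCESTOR[:2] + [
    "<revhistory>",
    "<revision>",
    "<date>2012-12-12</date>",
    "</revision>",
    "</revhistory>",
] + ANCESTOR[2:]


def insertions(ancestor, branch):
    """Map each ancestor line index to the lines a branch inserted before it."""
    matcher = SequenceMatcher(None, ancestor, branch, autojunk=False)
    return {
        i1: branch[j1:j2]
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag == "insert"
    }


def insert_conflicts(ancestor, mine, theirs):
    """Return ancestor positions where both branches inserted different lines."""
    a = insertions(ancestor, mine)
    b = insertions(ancestor, theirs)
    return sorted(pos for pos in a.keys() & b.keys() if a[pos] != b[pos])


conflicts = insert_conflicts(ANCESTOR, MINE, THEIRS)
```

Here conflicts holds the single ancestor position (before the para line)
where both branches inserted text, which is all a line-based tool knows
about the change.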
However, the DocBook 5 grammar allows "one or more of"
revhistory/indexterm at this point, so you could argue that this isn't
really a conflict. Conversely, there are places in DocBook where you have
a choice of elements without a one-or-more (+) or zero-or-more (*)
repetition qualifier, and adding both of the choices from different
branches is definitely a conflict irrespective of how they are
represented as lines. We propose that in an XML 'grammar aware'
system, conflict can and should be related to the grammar rules.
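To make the idea concrete, here is a deliberately tiny sketch of a
grammar-related conflict test. The REPEATABLE table is a hand-written
assumption for illustration, not the DocBook 5 schema, "exclusive_choice"
is a hypothetical parent with a single-choice slot, and a real
implementation would consult the actual RELAX NG grammar.

```python
# Hypothetical, heavily simplified content models: for each parent element,
# the set of child elements that may repeat at the insertion point.
# These entries are assumptions for illustration only.
REPEATABLE = {
    "sect1": {"indexterm", "revhistory", "para"},
    "exclusive_choice": set(),  # a slot that accepts exactly one alternative
}


def is_grammar_conflict(parent, mine_added, theirs_added):
    """Return True if keeping both insertions would break the toy grammar.

    Unknown parents are treated conservatively as conflicts.
    """
    allowed = REPEATABLE.get(parent, set())
    return not (mine_added in allowed and theirs_added in allowed)


# Both additions can coexist under sect1, so no conflict is reported.
sect1_conflict = is_grammar_conflict("sect1", "indexterm", "revhistory")
# In a single-choice slot, two different additions do conflict.
choice_conflict = is_grammar_conflict("exclusive_choice", "a", "b")
```

The point of the sketch is only that the decision depends on the content
model of the parent, not on whether the inserted lines happen to touch.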
We are addressing the software version control use-case with these
enhanced types of conflict detection in our upcoming 'merge' products.
Integrations with hg, git and svn (probably in that order) are
planned. One of the problems we found in the past was that software
version control usually distinguishes only between binary and text
files. svn allowed you to plug in alternative merge or diff tools, but
only for all types of text file at once. We are planning to take another
look at the interfaces to see if there are any ways in which we can plug
in our algorithms only for specific types of file.
Thanks,
Nigel
--
Nigel Whitaker, Software Architect, DeltaXML Ltd. "Experts in information
change"
[email protected] http://www.deltaxml.com +44 1684 869035
Registered in England: 02528681 Reg. Office: Monsell House, WR8 0QN, UK