Consequences of Working in Office Documents Here

Dennis E. Hamilton Tue, 21 Jun 2011 19:20:44 -0700

BACK STORY

On a different list, not just here on ooo-dev, there has been some surprise to 
see us putting binaries (ODF documents) into some SVN locations used by the 
PPMC.


My impression is that the experienced hands here in ASF are expecting to see 
DIFFs in commit messages on SVN, but binaries don't get DIFFed since the result 
is usually unintelligible and almost always uninteresting.  For some, it is 
news that ODF packages are Zip archives, not plain XML files.

Someone suggested that one could unpack the Zip of these documents and then do 
diffs of the respective XML parts and that could serve as a DIFF on what the 
changes are.  They also noticed they'd never seen that done.
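
For concreteness, here is a minimal Python sketch of what that suggestion 
amounts to: open both packages as Zip archives and produce a line diff of each 
XML part.  The function name diff_odf_xml and the trick of breaking lines after 
every ">" are assumptions made for illustration, not an existing tool.

```python
import difflib
import io
import zipfile


def diff_odf_xml(old_bytes: bytes, new_bytes: bytes) -> dict:
    """Return {part name: unified diff lines} for XML parts that differ.

    Hypothetical helper: it treats each ODF package as a plain Zip archive
    and ignores non-XML parts (images and other binary blobs).
    """
    diffs = {}
    with zipfile.ZipFile(io.BytesIO(old_bytes)) as old_zip, \
            zipfile.ZipFile(io.BytesIO(new_bytes)) as new_zip:
        old_names = {n for n in old_zip.namelist() if n.endswith(".xml")}
        new_names = {n for n in new_zip.namelist() if n.endswith(".xml")}
        for name in sorted(old_names | new_names):
            old_text = old_zip.read(name).decode("utf-8") if name in old_names else ""
            new_text = new_zip.read(name).decode("utf-8") if name in new_names else ""
            # The XML is often written as one very long line, so break after
            # each tag to give the line-based diff something to work with.
            old_lines = old_text.replace(">", ">\n").splitlines()
            new_lines = new_text.replace(">", ">\n").splitlines()
            delta = list(difflib.unified_diff(
                old_lines, new_lines,
                fromfile="old/" + name, tofile="new/" + name, lineterm=""))
            if delta:
                diffs[name] = delta
    return diffs
```

The output is a diff of markup, which is exactly the level the rest of this 
note argues is the wrong one for seeing what an author changed.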

THE INSIGHT

On seeing that suggestion (clearly the kind of thing developers think of, it 
being what we do), it struck me that we have a "geeks are from Mars, users are 
from Venus" situation here.

I think the clash of expectations has to do with the differences in tools that 
are applicable at the level we work at, and how we see what it is we are at 
work on.

We need to understand that we really have different experience sets, and they 
all are important in the context of the OpenOffice.org project.

A GEEKY LOOK

Here is a geeky explanation of why it does no good to figure out a better way 
to show DIFFs of the XML inside an ODF package if you want to know what an 
author (contributor or committer) changed.  (You might want that as a forensics 
tool, but not for knowing what someone changed in the course of their work on a 
document.)

My (updated) explanation:

The problem is that diff-ing the XML is not what's wanted.  That's like 
decompiling two programs and posting a diff of the assembly language.  (There 
are also binary blobs -- I said blogs by mistake in another post -- in the 
Zipped ODF package.)

The level of abstraction that one cares about for accounting for changes in a 
document in one of these formats is at the presentation or print-preview level. 
There are document compare utilities that provide such functions.  It's like 
the comparison you get between two wiki pages.  Anywhere I've looked, that is 
shown as a comparison of the resulting presentation, not of the WikiText.  (I 
know that on Apache we have a production process where we use SVN as a 
publishing location and see diffs of Markdown, a kind of plaintext markup.  I 
know that fits beautifully into the source-code revision developer toolcraft 
model, but you wouldn't want to know about changes in an ODF document that way, 
BECAUSE IT IS NOT WHAT IS AUTHORED.)
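
As a rough illustration of moving up one level, the following Python sketch 
compares only the paragraph and heading text found in content.xml.  It is a 
crude stand-in for a real document compare utility, not a presentation-level 
comparison: the names paragraphs and text_diff are hypothetical, and the code 
assumes the text lives in text:p and text:h elements and that paragraphs are 
not nested inside one another.

```python
import difflib
import io
import xml.etree.ElementTree as ET
import zipfile

# ODF "text" namespace; paragraphs are text:p and headings are text:h.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = {"{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS}


def paragraphs(odf_bytes: bytes) -> list:
    """Return the plain text of each paragraph and heading, in document order."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() flattens spans and other inline markup into plain text.
    return ["".join(e.itertext()) for e in root.iter() if e.tag in PARAGRAPH_TAGS]


def text_diff(old_bytes: bytes, new_bytes: bytes) -> list:
    """Return only the removed ('- ') and added ('+ ') paragraphs."""
    delta = difflib.ndiff(paragraphs(old_bytes), paragraphs(new_bytes))
    return [line for line in delta if line[:2] in ("- ", "+ ")]
```

A change that only rearranges inline markup produces an empty result here, 
while it would show up as noise in a diff of the XML.  Layout, styles, images 
and tables are ignored entirely, which is why this is no substitute for a 
comparison at the print-preview level.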

There are also change-tracking (historically called red-lining in my 
experience) provisions in the ODF Format, and the software products handle them 
with varying degrees of reliability.  This is like showing a kind of merge with 
the removed text and the inserted text all shown in the document and 
distinguished by highlighting and strikethroughs of various forms.  A reviewer 
can accept a change, reject a change, make more changes, etc.
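
To show where that information sits in the package, here is a small Python 
sketch that counts the tracked changes recorded in content.xml.  It assumes the 
changes are stored as text:changed-region elements whose child (text:insertion, 
text:deletion or text:format-change) names the kind of change; the function 
name tracked_change_counts is hypothetical.

```python
import collections
import io
import xml.etree.ElementTree as ET
import zipfile

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
TEXT_PREFIX = "{%s}" % TEXT_NS


def tracked_change_counts(odf_bytes: bytes) -> dict:
    """Count tracked changes by kind, e.g. {'insertion': 2, 'deletion': 1}."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    counts = collections.Counter()
    for region in root.iter(TEXT_PREFIX + "changed-region"):
        for child in region:
            # Keep only children in the text namespace and drop the prefix.
            if child.tag.startswith(TEXT_PREFIX):
                counts[child.tag[len(TEXT_PREFIX):]] += 1
    return dict(counts)
```

This only reports that changes were recorded.  Showing them to a reviewer, and 
accepting or rejecting them, is still the job of the office application.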

So there are (at least) two different levels of envisioning, of toolcraft and 
of work practices among us.  At one level, there is the world of SVN, compiler 
and build processes, and source code in simply-formatted text.  For ODF (and 
OOXML and more of these), the XML in the Zip is object code, not the source 
code.  The source code counterpart is at quite another level.

Worlds are colliding here on Apache OpenOffice.org.  It is going to be very 
interesting to see what we learn from each other and how we manage to function 
in some kind of shared culture within the Apache Way.

Some of us navigate both levels with some fluency.  That is not the case for 
most of us and, I am learning, not natural for me either: OpenOffice is not my 
tool of choice apart from using it as an ODF forensic tool, and my development 
toolcraft is not SVN, LAMP, etc.

It is very important to grasp this, because if we don't recognize it, the 
authors of documentation and people working at the user-issues level are going 
to be left with no way to fit in and not much that feels like it is appropriate 
for their specialized activities.

 - Dennis
