"Philip Oakley" <philipoak...@iee.org> writes:

> From: "Dale R. Worley" <wor...@alum.mit.edu>
>> From the original poster's point of view:  Yes, you can use Git to
>> store various versions of MS Word documents, but you probably don't
>> get much benefit from doing so, since Git can't see into the
>> different versions of documents to see how they differ; to Git
>> they're just blobs.  OTOH, it may be that "collections of blobs" is
>> all that you need the storage system to provide.
>>
>> Konstantin Khomoutov <flatw...@users.sourceforge.net> writes:
>>> "Steve (Gadget) Barnes" <gadgetst...@hotmail.com> wrote:
>>>> At the risk of getting flamed for mentioning a different dVCS, the
>>>> Mercurial (hg) project has a very sneaky extension called zipdoc
>>>> that stores the contents of zip files (docx files are actually zips
>>>> containing XML), along with the fact that they belong to a specific
>>>> .docx (or whatever) file.  On committing such a file it is actually
>>>> unzipped and the constituents are either stored or, for an update,
>>>> diffed; on a pull they are pulled as constituent parts and then
>>>> zipped to reconstitute the original file.
>>>>
>>>> You could either consider using Mercurial or trying to find or
>>>> develop a similar extension.
>>>
>>> I wonder what this actually buys: you'll end up with a bunch of XML
>>> files (and picture files, if any, and the Manifest file, and so on),
>>> and the problem is that the XML file representing "the content" is
>>> about as readable as the original .docx.  As they say, "XML combines
>>> the efficiency of text files with the readability of binary files"
>>> [1].  I mean, diffing machine-produced XML files, where a tiny
>>> logical change in a document could result in hefty parts of that XML
>>> swath being rewritten, is just marginally better than the original
>>> problem.
>>
>> The question is this:  If you make a small change to the document
>> (as a human sees it), does this cause a small change to the XML files
>> within the Zip?  If the answer is Yes, then many revisions of a
>> document can be stored densely in a repository.  And it might be
>> possible to merge small differences in documents using a standard
>> merging approach.
>>
>> But the only way to know would be to talk to someone who has
>> considerable experience with this.
>>
> While not having personal experience, I've seen a number of reports
> that the 'expanded XML' approach to "docx" style documents (including
> LibreOffice I understand), which are zips of XMLs, often fails because
> the main package presumes that the internal XML files are in a
> particular order. Once the zip has been expanded, that order of file
> components is lost, so when the VCS repackages the zip, the components
> are not in the right order, and the main program can't read it
> properly.
>
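On the ordering point: a minimal sketch, assuming the only thing that gets lost is the order of the members inside the zip (I have not verified that this is what actually trips up the office programs). The function names `unpack` and `repack` and the idea of keeping the recorded order alongside the extracted files are my own, not something zipdoc or Git provides. The order is recorded while unpacking and reused when rebuilding the archive.

```python
import os
import zipfile


def unpack(archive, dest):
    """Extract all members and return the file names in archive order."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        # Skip pure directory entries; keep file members in original order.
        return [n for n in zf.namelist() if not n.endswith("/")]


def repack(src, archive, order):
    """Rebuild the archive, writing members in the recorded order."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in order:
            # ODF packages expect 'mimetype' to be stored uncompressed.
            if name == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            zf.write(os.path.join(src, name), arcname=name,
                     compress_type=method)
```

The rebuilt file will not be byte-identical to the original (timestamps and compression details differ), but the members come back in the same sequence.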
> The key to all this (doing version differencing) is to locate a method
> [program] which can be fed the old and new versions, and have the diff
> presented to you in a meaningful fashion. Often 'Word' style documents
> don't have a good way that is both meaningful and compact at the same
> time (a human factors problem, not a coding problem ;-) ).
>
> If the OP's originating program has a 'compare documents' mode then a
> small bit of coding should allow Git to feed the old version and new
> version to it, as long as it has an external API (rather than it all
> being via GUI/menu selection).

I might be completely off track here (and probably dreaming), but
can't you define diff tools depending on file type? You could then use
MS Word to compare the old and the new version. I seem to remember
that I set that up some years ago.
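For what it's worth, Git does let you attach a diff driver to a file pattern through .gitattributes, and a driver can name a "textconv" program that turns a file into text before diffing. This is a different route from launching Word's own compare (an external GUI tool would go through `git difftool` instead, and I am not showing how to invoke Word). A minimal sketch, assuming a hypothetical helper script called docx2txt.py and a driver name "docx", both my own choices: it pulls the paragraph text out of word/document.xml so that `git diff` shows readable lines. It only reads the main document part and ignores headers, footnotes and formatting.

```python
#!/usr/bin/env python3
"""docx2txt.py (hypothetical name): print the plain text of a .docx file.

Assumed wiring, one possible setup:
    .gitattributes:   *.docx diff=docx
    git config diff.docx.textconv "python3 /path/to/docx2txt.py"
"""
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return one line of text per paragraph in the main document part."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # Concatenate the text runs (w:t elements) inside this paragraph.
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Git calls the textconv program with the file path as its only argument.
    print(docx_to_text(sys.argv[1]))
```

With that in place, `git diff` and `git log -p` on a .docx would show paragraph-level text changes, while the file itself is still stored as a binary blob.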

Rainer


>
> --
>
> Philip 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      rai...@krugs.de

Skype:      RMkrug

PGP: 0x0F52F982
