I think that the important part here is that others can review the
work being done. When that work is locked inside binary formats,
that review becomes *very* difficult.

Sure, some artifacts in the repository *need* to be binary. Nobody
will dispute that.

But when the primary work of this PMC can be done in a reviewable
format, then it helps all of us to make that happen.
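
For what it's worth, the unzip-and-diff idea mentioned in the quoted
thread below can be sketched in a few lines. This is only an illustration,
not something anyone in this thread has actually run: the function names
(pretty_part, diff_part) are made up for the example, and it assumes the
part you care about is content.xml. It re-indents the XML first, because
each part is normally written as one long line and a line-based diff
needs something to work with.

```python
import difflib
import zipfile
from xml.dom import minidom


def pretty_part(package, part="content.xml"):
    """Return one XML part of an ODF package as a list of indented lines.

    `package` may be a file path or a file-like object holding the zip.
    """
    with zipfile.ZipFile(package) as zf:
        raw = zf.read(part)
    # Re-indent the XML so that each element lands on its own line.
    pretty = minidom.parseString(raw).toprettyxml(indent="  ")
    # Drop whitespace-only lines that pretty-printing can leave behind.
    return [line for line in pretty.splitlines() if line.strip()]


def diff_part(old_package, new_package, part="content.xml"):
    """Unified diff of one XML part between two ODF packages."""
    return list(difflib.unified_diff(
        pretty_part(old_package, part),
        pretty_part(new_package, part),
        fromfile="old/" + part,
        tofile="new/" + part,
        lineterm="",
    ))
```

Calling diff_part() on two revisions of a document (the file names are
whatever you have locally) returns the diff lines, ready to print. As
Dennis points out below, a diff like this shows what changed in the XML,
which is not necessarily the same thing as what the author meant to
change.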

Cheers,
-g

On Wed, Jun 22, 2011 at 01:29, Dave Fisher <[email protected]> wrote:
> On Jun 21, 2011, at 8:58 PM, Daniel Shahaf wrote:
>
>> Dennis E. Hamilton wrote on Tue, Jun 21, 2011 at 19:20:13 -0700:
>>> BACK STORY
>>>
>>> On a different list, not just here on ooo-dev, there has been some
>>> surprise to see us putting binaries (ODF documents) into some SVN
>>> locations used by the PPMC.
>>>
>>> My impression is that the experienced hands here in ASF are expecting
>>> to see DIFFs in commit messages on SVN, but binaries don't get DIFFed
>>> since it is usually unintelligible and almost always uninteresting.
>>> For some, it is new news that ODF packages are not XML files.
>>>
>>> Someone suggested that one could unpack the Zip of these documents and
>>> then do diffs of the respective XML parts and that could serve as
>>> a DIFF on what the changes are.  They also noticed they'd never seen
>>> that done.
>>>
>>> THE INSIGHT
>>>
>>> On seeing that suggestion (clearly the kind of thing developers
>>> think of, it being what we do), it struck me that we have a "geeks are
>>> from Mars, users are from Venus" situation here.
>>>
>>> I think the clash of expectations has to do with the differences in
>>> tools that are applicable at the level we work at, and how we see what
>>> it is we are at work on.
>>>
>>> We need to understand that we really have different experience sets,
>>> and they all are important in the context of the OpenOffice.org
>>> project.
>>>
>>> A GEEKY LOOK
>>>
>>> Here is a geeky explanation of why it does no good to figure out
>>> a better way to show DIFFs of the XML inside an ODF package if you
>>> want to know what an author contributor/committer changed.  (You might
>>> want that as a forensics tool, but not for knowing what someone
>>> changed in the course of their work on a document.)
>>>
>>> My (updated) explanation:
>>>
>>
>> Long email.  In the end, the expectation is for commit mails to contain
>> reviewable diffs; I don't think you've addressed how that might be done.
>
> As far as I know binary files are acceptable elsewhere in SVN.
>
>>
>> (as opposed to how it shouldn't be done)
>
> Generally ODF files will be documentation and testcases, and generally
> consistent, like PNGs, JPEGs, etc. No one complains about PDFs or any of the
> MS Office formats in SVN. We haven't seemed to care about that in the Apache
> POI project; I can't answer for PDFBox.
>
> I unzipped an ODF zip, and each part is a huge set of verbose XML on two
> lines: header and data. For example, content.xml:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <office:document-content 
> xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" 
> xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" 
> xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" 
> xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" 
> xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0" 
> xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" 
> xmlns:xlink="http://www.w3.org/1999/xlink" 
> xmlns:dc="http://purl.org/dc/elements/1.1/" 
> xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" 
> xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" 
> xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0" 
> xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" 
> xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0" 
> xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0" xmlns: ....
>
> Diff won't work easily. Maybe SVN needs to provide "zip" storage and then
> "xml" diff within. Could the Subversion project whip that out now? We'll wait
> until they do before we proceed. I'm being sarcastic here. But if it is
> available now, that would be pretty cool.
>
> The real issue is that a binary document was used to update a table where 
> everyone made changes. Changes that were important to those viewing the 
> commit messages. I know we all love office documents around here, but ...
>
> Maybe we should be exchanging that particular file as a CSV.
>
> (BTW - I notice that Calc's save options don't include XLSX, etc.)
>
> Best Regards,
> Dave
>
>
