Hi Sönke,

> On Mar 15, 2019, at 1:02 AM, Sönke Liebau <soenke.lie...@opencore.com.INVALID> wrote:
>
>> If a work is directly editable, it is not binary. So I'd argue (with
>> anyone who has a different opinion ;-) that files with .ppt, .odt, and
>> other file types are source files because there are editors for them.
>
> Happy to argue :)
> While you are right in principle, have you ever tried to diff a pptx file
> with git? At some point during the initial discussions for this project,
> someone stated something along the lines of "while pptx is xml under the
> hood this is so complex and un-diffable that they can, in essence, be
> treated as binary files" - I just sort of took that as gospel when writing
> this.

So now we should probably separate policy and logistics.

Apache policy is "no binaries in releases" which is commonly understood to mean that releases contain the sources to build the release, and not the result of compiling. In this sense, contributed pptx is not binary because it is not the result of processing anything. So there is no issue with having releases of these files if we choose to release them.

But I agree with you about logistics. A contributed pptx will be processed into a canonical form and updates should then be done on the canonical form.

> Again, I am absolutely not fundamentally disagreeing with you, I agree that
> material in pptx can be useful, should be part of the indexed content,
> properly tagged and all that. But, I do think that the overall target
> should be to convert everything that is donated into our canonical (and
> still to be defined) format eventually and updates done directly in there.
> During that conversion, I think there may be stumbling stones if updates to
> pptx content come in.

I was hoping that the conversion from pptx to canonical form is mostly done automatically. I have no experience here, just hope. But I have no hope that we can diff pptx files meaningfully.

Craig

>>> That is all manageable I'm sure, but we should put some thought into
>>> it up front I think.
>>>
>>> Best regards,
>>> Sönke

Craig L Russell
c...@apache.org