[
https://issues.apache.org/jira/browse/COR-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281900#comment-14281900
]
David Fisher commented on COR-18:
---------------------------------
We should check whether Apache Commons or HTTPD has an Apache-licensed
zip library.
> Replacing MiniZip
> -----------------
>
> Key: COR-18
> URL: https://issues.apache.org/jira/browse/COR-18
> Project: Corinthia
> Issue Type: Bug
> Components: DocFormats - platform
> Environment: source
> Reporter: jan iversen
> Assignee: jan iversen
> Priority: Blocker
> Fix For: 0.5
>
>
> MiniZip is a bit thin and, because of some changes needed, it might be better
> to replace it in the DocFormats/3rdparty/external/ folder, as @peterkelly
> observes at #26 (comment)
> Easy Steps
> For now, it might be desirable to simply replace the current code with
> MiniZip 1.1 from http://www.winimage.com/zLibDll/minizip.html
> Since it is a simple dependency, this should work fine so long as there are
> no breaking API changes in between 1.0h and 1.1.
> Eventually?
> It would be good to have something behind a stable API that permits random
> access for reading file streams as Peter suggests. Ideally, that API would be
> aligned around the Document Container File (DCF) profile of the official
> PKWare specification that is used commonly among ePub, ODF, and the Open
> Packaging Conventions (OPC) used in OOXML and elsewhere. I don't know what
> the latest status of that profile is at ISO/IEC JTC1 SC34, but it will become
> a common international specification for this specialized usage of Zip as a
> compound document-format container file.
> There are other places to look for ideas and possible sources of reusable
> code and API considerations, including in Apache OpenOffice, the Apache ODF
> Toolkit (using Java), and the Microsoft open-sourcing of its OOXML-access
> layer (in .NET I think). And the Microsoft platform has some native support
> that it might be useful to be able to rely on in Windows-targeted builds.
> There is also a CodePlex LibOPC project that is C code under a BSD-form
> license at https://libopc.codeplex.com/ One interesting feature of LibOPC
> that may interest Apache OpenOffice folk (i.e., @janiversen) is a Python
> script for generating Visual Studio projects that can be used for
> manipulating and building on Windows.
> One caveat. For ingesting Zip-based document files, there needs to be a fair
> amount of code to ensure resiliency and defense against DOS-ing of
> applications with malformed document files. That may have to be grown, with
> attention to the code footprint on limited-capacity devices (where presumably
> some of the heavy-lifting is off-loaded to the cloud). An interesting
> feature of the OPC specification is that it is also designed to support
> remoting of the document streams in a way where there is no requirement that
> a Zip file be transferred to the client. That may be a long way off,
> but it is useful to think about having an API that would allow for that
> underneath.
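
As a rough illustration of the resiliency checks described in the caveat above, here is a minimal sketch. It is written in Python for brevity rather than in the project's own language. The function name check_package and the three limits are hypothetical and chosen only for illustration; only the standard zipfile module is used.

```python
import io
import zipfile

# Hypothetical limits, picked only for this illustration.
MAX_ENTRIES = 1000
MAX_TOTAL_UNCOMPRESSED = 100 * 1024 * 1024  # 100 MiB
MAX_RATIO = 100  # declared uncompressed size / compressed size


def check_package(data: bytes) -> list:
    """Return a list of problems found in the archive's directory listing."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        return [f"not a valid zip: {exc}"]

    problems = []
    with archive:
        infos = archive.infolist()
        if len(infos) > MAX_ENTRIES:
            problems.append("too many entries")
        total = 0
        for info in infos:
            total += info.file_size
            # Reject absolute paths and parent-directory references.
            parts = info.filename.split("/")
            if info.filename.startswith("/") or ".." in parts:
                problems.append(f"suspicious path: {info.filename}")
            # Flag entries whose declared size dwarfs their stored size.
            if info.compress_size > 0:
                if info.file_size / info.compress_size > MAX_RATIO:
                    problems.append(
                        f"suspicious compression ratio: {info.filename}"
                    )
        if total > MAX_TOTAL_UNCOMPRESSED:
            problems.append("declared uncompressed size too large")
    return problems
```

These checks only inspect the sizes the archive declares about itself, and a hostile file can declare false sizes. A real reader would presumably also need to cap the number of bytes actually produced while inflating each stream.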
> Lest we forget?
> Although this is all .NET-fu, this project may be a useful source of ideas:
> https://github.com/OfficeDev/Open-Xml-Sdk
> (Some of its system-level dependencies may have native Windows counterparts
> as well.) It might also be worth mining for ideas higher up in the API
> modeling.
> ---
> I didn't think to mention POI and whatever they use as a model close to the
> Zip packages.
> I didn't realize until looking at the proposal to become an Apache incubator
> project that the sources for minizip and tidy-html5 are not pristine. It
> would be good to reconstruct the modification process and leave more
> footprints if the changes are not in the repository here. (Actually, it would
> be good to reconstruct the modification anyhow, but diffs from git would be
> helpful.)
> I'm thinking that there is no hurry to replace these in early stages. If a
> better API is desired, the first step of getting that in place would be to
> build a shim that goes from that API to anything at hand at first, such as
> minizip or some other library, and worry about fit and performance later.
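
The shim idea in the paragraph above could look roughly like the following sketch. It is in Python for brevity, not the project's own language. PackageReader, ZipfileReader, DictReader and mimetype_of are all hypothetical names invented for this illustration, and Python's zipfile module stands in for "anything at hand".

```python
import io
import zipfile
from typing import Protocol


class PackageReader(Protocol):
    """Hypothetical stable API: random access to the named parts of a package."""

    def part_names(self) -> list: ...

    def read_part(self, name: str) -> bytes: ...


class ZipfileReader:
    """Shim over the library at hand (here, the standard zipfile module)."""

    def __init__(self, data: bytes) -> None:
        self._archive = zipfile.ZipFile(io.BytesIO(data))

    def part_names(self) -> list:
        return self._archive.namelist()

    def read_part(self, name: str) -> bytes:
        return self._archive.read(name)


class DictReader:
    """Second backend with no zip file at all, e.g. parts fetched remotely."""

    def __init__(self, parts: dict) -> None:
        self._parts = dict(parts)

    def part_names(self) -> list:
        return list(self._parts)

    def read_part(self, name: str) -> bytes:
        return self._parts[name]


def mimetype_of(reader: PackageReader) -> str:
    """Caller code depends only on the API, not on the library behind it."""
    return reader.read_part("mimetype").decode("ascii")
```

Because callers such as mimetype_of see only the two-method API, the backing library can be swapped later, and fit and performance can be tuned without touching them.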
> jan:
> POI is in Java, so they have other packages available.
> I am currently working on expanding the platform part to also include zip and
> html, so that we can change the libraries at a later stage. I think your idea
> of using libOPC is valid and interesting...you, Peter and Svante know better
> if it fits the project.