On Tuesday, 8 September 2020 11:05:45 CEST Jos van den Oever wrote:
> On Tuesday, 8 September 2020 10:59:37 CEST Christian Grün wrote:
> > > Oh, a shame that the cross-implementation module is not maintained.
> > 
> > The Archive Module was supposed to become the new EXPath standard.
> > Unfortunately, different versions of that module were specified one
> > after another such that the spec that’s currently publicly available
> > doesn’t reflect our implementation anymore [1].
> > 
> > I didn’t know that the ZIP Module is still maintained in other
> > implementations of XQuery. Is it still popular e.g. in eXist-db?
> 
> I've used it in production to create government epub files (law bundles).
> 
> > > The archive module also compresses the 'mimetype' file with this code:
> > When calling archive:update, you can supply more properties with an
> > archive:entry element:
> > 
> > <archive:entry last-modified='2011-11-11T11:11:11'
> >                compression-level='8'
> >                encoding='US-ASCII'>hello.txt</archive:entry>
> 
> I assumed that files that are not mentioned in the archive:update call or
> zip:update-entries call would not be touched.
> 
> I'll see if this way works.

Calling with compression-level="0" still compresses the file. And because
archive:update is called, the entire zip has to be rewritten, taking care that
'mimetype' stays the first entry, even though the archive spec says "The
relative order of all the existing and replaced entries within the archive is
preserved." This example demonstrates that compression-level="0" does not do
what the API promises:

```xquery
let $file := "test.ods"
let $archive := file:read-binary($file)
let $mimetype := archive:extract-text($archive, "mimetype")
let $content_xml := fn:parse-xml(archive:extract-text($archive, "content.xml"))
let $content_xml := local:change($content_xml, local:add_number_value_type#1)
let $entries := (
   <archive:entry compression-level='0'>{"mimetype"}</archive:entry>,
   <archive:entry>{"content.xml"}</archive:entry>
)
let $contents := ($mimetype, fn:serialize($content_xml))
let $updated := archive:update($archive, $entries, $contents)
return file:write-binary($file, $updated) 
```
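
One way to check the result from within BaseX, rather than with unzip -vl, is
to list the entries of the rewritten archive. This is only a minimal sketch: it
assumes that archive:entries returns the descriptors in archive order and that
they carry the size and compressed-size attributes described on the BaseX
Archive Module page. The <entry> wrapper element is just an illustrative name
for the output.

```xquery
(: Sketch: list position, name and sizes of every entry in the archive. :)
let $archive := file:read-binary("test.ods")
for $entry at $pos in archive:entries($archive)
return
  <entry pos="{ $pos }"
         name="{ string($entry) }"
         size="{ $entry/@size }"
         compressed-size="{ $entry/@compressed-size }"/>
```

For a usable ODF or epub file the entry at position 1 should be 'mimetype',
with size equal to compressed-size. If the two differ, the entry was most
likely not stored uncompressed.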

On the archive spec: the example in '3.1 Creating a simple EPUB document' is 
not valid XQuery and does not match the description of the function.

Best regards,
Jos


> > [1] http://expath.org/spec/archive/20130930
> > 
> > > let $file := "test.ods"
> > > let $archive := file:read-binary($file)
> > > let $content := parse-xml(archive:extract-text($archive, "content.xml"))
> > > let $content := local:change($content, local:add_number_value_type#1)
> > > let $updated := archive:update($archive, "content.xml", $content)
> > > return file:write-binary($file, $updated)
> > > 
> > > Cheers,
> > > Jos
> > > 
> > > > Hope this helps,
> > > > Christian
> > > > 
> > > > [1] https://docs.basex.org/wiki/Archive_Module
> > > > 
> > > > On Tue, Sep 8, 2020 at 9:29 AM Jos van den Oever <j...@vandenoever.info> wrote:
> > > > > Hello all,
> > > > > 
> > > > > > As you might know, epub files and ODF files are zip files with specific
> > > > > > contents. BaseX supports the expath zip module and could in theory be used
> > > > > > for creating these files if it were not for a missing simple feature.
> > > > > 
> > > > > > There is one rule for epub and ODF files that cannot be followed by BaseX
> > > > > > at the moment: the first file in the zip container should be named
> > > > > > 'mimetype' and is a plain text file that contains the mimetype string.
> > > > > > This is meant to allow applications to read the mimetype at a fixed
> > > > > > offset in the file and without doing decompression.
> > > > > 
> > > > > In unzip -vl it looks like this:
> > > > > >  Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
> > > > > > --------  ------  ------- ---- ---------- ----- --------  ----
> > > > > >       20  Stored       20   0% 10-14-2018 05:57 2cab616f  mimetype
> > > > > 
> > > > > Here is an XQuery to create a file with just that entry:
> > > > > 
> > > > > > ```xquery
> > > > > > declare namespace zip = "http://expath.org/ns/zip";
> > > > > > 
> > > > > > let $zip :=
> > > > > > <zip:file href="new.epub">
> > > > > >   <zip:entry name="mimetype" compressed="no" method="text">
> > > > > >     {"application/epub+zip"}
> > > > > >   </zip:entry>
> > > > > > </zip:file>
> > > > > > return zip:zip-file($zip)
> > > > > > ```
> > > > > 
> > > > > > BaseX does not support the 'compressed' option. Without that option the
> > > > > > file 'mimetype' is stored in compressed form and cannot be used by
> > > > > > applications to quickly determine the mimetype of the file.
> > > > > 
> > > > > > Modifying the xml in an existing epub or ODF with zip:update-entries is
> > > > > > also not possible because the mimetype file is still compressed.
> > > > > 
> > > > > > An additional issue: when reading a zip file, the entries in <zip:file>
> > > > > > are not in the same order as they are in the zip file. So when modifying
> > > > > > an existing file, the mimetype entry has to be moved to the front of the
> > > > > > list explicitly.
> > > > > 
> > > > > > In short: to make BaseX support the creation of epub and ODF files it
> > > > > > should:
> > > > > >  - support the 'compressed' attribute
> > > > > >  - retain the order of files in the zip file in the <zip:file> element.
> > > > > 
> > > > > Best regards,
> > > > > Jos
