Hi (cc to the list),

The filenames in your archive may be CP437-encoded, while many newer archives use Unicode. Unfortunately, the standard JDK ZIP library that we use is not smart enough to detect different filename encodings. We could add an $options argument to all Archive Functions that open and create/update archives [1]. For the moment, you will need to…

• create archives with Unicode filenames (e.g., with "zip --unicode"),
• avoid characters in the 80-FF range from CP437 [2], or
• invoke an external unzipper, e.g. via proc:system.

Sorry for that,
Christian

[1] https://github.com/BaseXdb/basex/issues/2344
[2] https://en.wikipedia.org/wiki/Code_page_437

On Thu, Nov 14, 2024 at 12:59 PM Grythe, Thomas Berge <tho...@innlandetfylke.no> wrote:

> Hi!
>
> I ran the code you sent me and found the cause of the error. The
> 'malformed input' message appears because a file name inside the zip
> file contains the letter 'å' (see the attached image).
>
> When I changed the title "2 klage på vedtak" to "2 klage paa vedtak", the
> program worked.
>
> Is there an easy way to fix this so that the code can handle special
> characters like "æ", "ø" and "å"?
>
> Best regards, Thomas.
> ------------------------------
> *From:* Christian Grün <christian.gr...@gmail.com>
> *Sent:* Wednesday, 13 November 2024 16:11
> *To:* Grythe, Thomas Berge <tho...@innlandetfylke.no>
> *Subject:* Re: [basex-talk] Potential bug in archive:extract-to
>
> This email was sent by a person outside the organization. Do not click
> links or open attachments until you are sure who the sender is and that
> the content is safe.
>
> Hi Thomas,
>
> maybe we can first try to simplify the script. Could you check what the
> following code does?
>
> let $inputpath := "E:\Transfer\vaaler_websak – Kopi\"
> for $name in //record//entry/text()
> where contains($name, '.bin')
> let $input := $inputpath || $name
> let $target := $inputpath || substring-before($name, '.')
> return archive:extract-to($target, $input)
>
> If it fails as well, could you possibly send the problematic archive file
> to me (it needn’t be shared over the list)?
>
> Thanks,
> Christian
>
>
> On Wed, Nov 13, 2024 at 3:58 PM Grythe, Thomas Berge <tho...@innlandetfylke.no> wrote:
>
> Hi!
>
> Thank you for your reply.
> Attached is the code of my program, along with an image of the files
> extracted from a zip file. If I run the program with this zip file, I get
> the error message 'malformed input'.
>
> Can the file names cause this problem?
>
> Best regards, Thomas.
> ------------------------------
> *From:* Christian Grün <christian.gr...@gmail.com>
> *Sent:* Wednesday, 13 November 2024 14:32
> *To:* Grythe, Thomas Berge <tho...@innlandetfylke.no>
> *Cc:* basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
> *Subject:* Re: [basex-talk] Potential bug in archive:extract-to
>
> Hi Thomas,
>
> As Martin indicated, it would be interesting to know what the xquery:eval
> function call does. Could you possibly provide us with a little
> self-contained example?
>
> With BaseX 11 or later, you can simply do:
>
> archive:extract-to('/path/to/target', '/path/to/archive')
>
> Best,
> Christian
>
>
> On Tue, Nov 12, 2024 at 10:28 AM Grythe, Thomas Berge <tho...@innlandetfylke.no> wrote:
>
> Hi!
>
> I am an electronic archivist and I have recently tried to use BaseX to
> unzip files in an archive. The two main lines I have used are:
>
> let $archive := file:read-binary(xquery:eval($filepath-corrected))
> return archive:extract-to(xquery:eval($dir_corrected), $archive)
>
> The variables $dir_corrected and $archive are defined earlier in the
> code. But I get the error message 'malformed input off : 10, length : 1',
> indicating that there is an issue with the input data being processed.
>
> Do you know what can cause this problem? And do you know of a possible
> workaround?
>
> Best regards,
>
> *Thomas Berge Grythe*
> Adviser
> Innlandet Fylkesarkiv/IKA Opplandene
>
> Phone: 48 99 47 85
> Email: tho...@innlandetfylke.no
>
> *Innlandet fylkeskommune*
> Phone: 62 00 08 80
> www.innlandetfylke.no
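
A note on the encoding issue discussed in this thread: the JDK ZIP classes accept an explicit charset, which they use to decode entry names that do not carry the UTF-8 flag. The following is only a minimal sketch of that mechanism and not how BaseX handles archives. The class and method names (Cp437ZipNames, listNames, writeSample) are hypothetical, and it assumes that the archive names really are CP437-encoded and that the running JDK provides the "IBM437" charset.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of BaseX.
final class Cp437ZipNames {
    // Assumption: legacy entry names are CP437 (called "IBM437" in the JDK).
    static final Charset CP437 = Charset.forName("IBM437");

    // Lists entry names; names without the UTF-8 flag are decoded with `charset`.
    static List<String> listNames(Path zip, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile(), charset)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // Demo helper: writes a one-entry archive whose name is encoded with `charset`.
    static void writeSample(Path zip, String entryName, Charset charset) throws IOException {
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out, charset)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write("demo".getBytes(charset));
            zos.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempFile("cp437-demo", ".zip");
        try {
            // "\u00e5" is the letter å from the file name mentioned in the thread.
            writeSample(zip, "2 klage p\u00e5 vedtak.txt", CP437);
            System.out.println(listNames(zip, CP437));
        } finally {
            Files.deleteIfExists(zip);
        }
    }
}
```

Opening the same file without a charset makes the JDK decode the names as UTF-8, which is where a 'malformed input' error can come from. The exact exception type differs between JDK versions.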