Not entirely sure that is what James is looking for…I THINK he’s more interested in reading than creating.
Commons Compress has some example code at https://commons.apache.org/proper/commons-compress/examples.html:

===
InputStream fin = Files.newInputStream(Paths.get("some-file"));
BufferedInputStream in = new BufferedInputStream(fin);
OutputStream out = Files.newOutputStream(Paths.get("archive.tar"));
Deflate64CompressorInputStream defIn = new Deflate64CompressorInputStream(in);
final byte[] buffer = new byte[buffersize];
int n = 0;
// Copy the decompressed bytes to the output until end of stream
while (-1 != (n = defIn.read(buffer))) {
    out.write(buffer, 0, n);
}
out.close();
defIn.close();
===

BOB

From: MG <mg...@arscreat.com>
Sent: Saturday, February 17, 2024 10:37 AM
To: users@groovy.apache.org; Bob Brown <b...@transentia.com.au>
Subject: Re: Cannot process zip file with Groovy

I agree, would also recommend using Apache libs, we use e.g. the ZIP classes that come with the ant lib in the Groovy distribution (org.apache.tools.zip.*):

Here is a quickly sanitized version of our code (disclaimer: Not compiled/tested; Zip64Mode.Always is important if you expect larger files):

import org.apache.tools.zip.*

InputStream zipInputStream(String compressedFilename) {
    final zipFile = new ZipFile(new File(compressedFilename))
    final entries = zipFile.entries
    // nextElement() throws on an empty Enumeration rather than returning null, so check first
    if (!entries.hasMoreElements()) {
        throw new Exception("${compressedFilename} has no entries")
    }
    final zipEntry = (ZipEntry) entries.nextElement()
    // Stream over the uncompressed content of the first entry
    final zis = zipFile.getInputStream(zipEntry)
    return zis
}

OutputStream zipOutputStream(String filename, String compressedFileExtension = "zip") {
    final fos = new FileOutputStream(filename + '.' + compressedFileExtension)
    final zos = new ZipOutputStream(fos)
    zos.useZip64 = Zip64Mode.Always // To avoid org.apache.tools.zip.Zip64RequiredException: ... exceeds the limit of 4GByte.
    final zipFileName = org.apache.commons.io.FilenameUtils.getName(filename)
    final zipEntry = new ZipEntry(zipFileName)
    zos.putNextEntry(zipEntry)
    return zos
}

Cheers,
mg

On 17/02/2024 00:52, Bob Brown wrote:

MY first thought was “are you SURE it is a kosher Zip file?”

Sometimes one gets ‘odd’ gzip files masquerading as plain zip files.
Also, apparently “java.util.Zip does not support DEFLATE64 compression method.”: https://www.ibm.com/support/pages/zip-file-fails-route-invalid-compression-method-error

IF this is the case, you may need to use: https://commons.apache.org/proper/commons-compress/zip.html
(maybe worth looking at the “Known Interoperability Problems” section of the above doc)

May be helpful: https://stackoverflow.com/a/76321625

HTH

BOB

From: James McMahon <jsmcmah...@gmail.com>
Sent: Saturday, February 17, 2024 4:20 AM
To: users@groovy.apache.org
Subject: Re: Cannot process zip file with Groovy

Hello Paul, and thanks again for taking a moment to look at this. I tried as you suggested:

- - - - - - - - - -
import java.util.zip.ZipInputStream

def ff = session.get()
if (!ff) return

try {
    ff = session.write(ff, { inputStream, outputStream ->
        def zipInputStream = new ZipInputStream(inputStream)
        def entry = zipInputStream.getNextEntry()
        while (entry != null) {
            entry = zipInputStream.getNextEntry()
        }
        outputStream = inputStream
    } as StreamCallback)

    session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
    log.error('Error occurred processing FlowFile', e)
    session.transfer(ff, REL_FAILURE)
}
- - - - - - - - - -

Once again it threw this error and failed:

ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred processing FlowFile: org.apache.nifi.processor.exception.ProcessException: IOException thrown from ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]: java.util.zip.ZipException: invalid compression method
- Caused by: java.util.zip.ZipException: invalid compression method

It bears repeating: I am able to list and unzip the file at the linux command line, but cannot get it to work from the script. What is interesting (and a little frustrating) is that the NiFi UnpackContent will successfully unzip the zip file.
However, the reason I am trying to do it in Groovy is that UnpackContent exposes the file metadata for each file in a tar archive - lastModifiedDate, for example - but it does not do so for files extracted from zips. And I need that metadata. So here I be.

Can I explicitly set my (de)compression in the Groovy script? Where would I do that, and what values does one typically encounter for zip compression?

Jim

On Thu, Feb 15, 2024 at 9:26 PM Paul King <pa...@asert.com.au> wrote:

What you are doing to read the zip looks okay.

Just a guess, but it could be that because you haven't written to the output stream, it is essentially a corrupt data stream as far as NiFi processing is concerned. What happens if you set "outputStream = inputStream" as the last line of your callback?

Paul.

On Fri, Feb 16, 2024 at 8:48 AM James McMahon <jsmcmah...@gmail.com> wrote:
>
> I am struggling to build a Groovy script I can run from a NiFi ExecuteScript
> processor to extract from a zip file and stream to a tar archive.
>
> I tried to tackle it all at once and made little progress.
> I am now just trying to read the zip file, and am getting this error:
>
> ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred
> processing FlowFile: org.apache.nifi.processor.exception.ProcessException:
> IOException thrown from
> ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
> java.util.zip.ZipException: invalid compression method
> - Caused by: java.util.zip.ZipException: invalid compression method
>
>
> This is my simplified code:
>
>
> import java.util.zip.ZipInputStream
>
> def ff = session.get()
> if (!ff) return
>
> try {
>     ff = session.write(ff, { inputStream, outputStream ->
>         def zipInputStream = new ZipInputStream(inputStream)
>         def entry = zipInputStream.getNextEntry()
>         while (entry != null) {
>             entry = zipInputStream.getNextEntry()
>         }
>     } as StreamCallback)
>
>     session.transfer(ff, REL_SUCCESS)
> } catch (Exception e) {
>     log.error('Error occurred processing FlowFile', e)
>     session.transfer(ff, REL_FAILURE)
> }
>
>
> I am able to list and unzip the file at the linux command line, but cannot
> get it to work from the script.
>
>
> Has anyone had success doing this? Can anyone help me get past this error?
>
>
> Thanks in advance.
>
> Jim
>
>
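
On the question of which compression values turn up in zip files: one way to narrow down an "invalid compression method" error is to look at the method code stored in the first local file header of the archive. What follows is only a minimal sketch in plain Java, not anything posted in this thread. The class name ZipMethodProbe and its two helpers are hypothetical names made up for illustration. It relies on the ZIP layout, where the header starts with the signature "PK\3\4" and the two-byte little-endian method code sits at byte offset 8 (0 = stored, 8 = deflate, 9 = Deflate64).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of NiFi, Groovy, or Commons Compress.
final class ZipMethodProbe {

    /**
     * Reads the first 10 bytes of the stream and returns the compression-method
     * code of the first local file header, or -1 if the stream does not start
     * with the ZIP local file header signature "PK\3\4".
     */
    static int firstEntryMethod(InputStream in) throws IOException {
        byte[] header = new byte[10];
        // readFully throws EOFException if fewer than 10 bytes are available
        new DataInputStream(in).readFully(header);
        boolean isLocalHeader = header[0] == 'P' && header[1] == 'K'
                && header[2] == 3 && header[3] == 4;
        if (!isLocalHeader) {
            return -1; // e.g. a gzip stream, which starts with 0x1f 0x8b
        }
        // Method code: 2 bytes, little-endian, at offset 8
        return (header[8] & 0xFF) | ((header[9] & 0xFF) << 8);
    }

    /** Maps a few well-known method codes to readable names. */
    static String describe(int method) {
        switch (method) {
            case -1: return "not a zip local file header";
            case 0:  return "stored";
            case 8:  return "deflate";
            case 9:  return "deflate64";
            default: return "other (" + method + ")";
        }
    }
}
```

If such a probe reports 9 for the failing file, that would be consistent with the Deflate64 explanation given earlier in the thread, and the Commons Compress route would be the one to try. A result of -1 would instead suggest the file is not a plain zip at all. The probe only inspects the first entry, so an archive that mixes methods would need each entry checked.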