I agree, and would also recommend using the Apache libs. We use, for
example, the ZIP classes that come with the Ant lib in the Groovy
distribution (org.apache.tools.zip.*).
Here is a quickly sanitized version of our code (disclaimer: not
compiled/tested; Zip64Mode.Always is important if you expect larger files):
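    import org.apache.tools.zip.Zip64Mode
    import org.apache.tools.zip.ZipEntry
    import org.apache.tools.zip.ZipFile
    import org.apache.tools.zip.ZipOutputStream

    // Opens the first entry of the given zip file for reading.
    InputStream zipInputStream(String compressedFilename) {
        final zipFile = new ZipFile(new File(compressedFilename))
        final entries = zipFile.entries
        // nextElement() throws on an empty enumeration, so check first
        if (!entries.hasMoreElements()) {
            throw new Exception("${compressedFilename} has no entries")
        }
        final zipEntry = (ZipEntry) entries.nextElement()
        return zipFile.getInputStream(zipEntry)
    }

    // Creates <filename>.<extension> with a single entry and returns
    // the stream used to write that entry's content.
    OutputStream zipOutputStream(String filename, String compressedFileExtension = "zip") {
        final fos = new FileOutputStream(filename + '.' + compressedFileExtension)
        final zos = new ZipOutputStream(fos)
        // To avoid org.apache.tools.zip.Zip64RequiredException:
        // ... exceeds the limit of 4GByte.
        zos.useZip64 = Zip64Mode.Always
        final zipFileName = org.apache.commons.io.FilenameUtils.getName(filename)
        final zipEntry = new ZipEntry(zipFileName)
        zos.putNextEntry(zipEntry)
        return zos
    }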
Cheers,
mg
On 17/02/2024 00:52, Bob Brown wrote:
My first thought was “are you SURE it is a kosher Zip file?”
Sometimes one gets ‘odd’ gzip files masquerading as plain zip files.
Also, apparently “java.util.Zip does not support DEFLATE64 compression
method.” :
https://www.ibm.com/support/pages/zip-file-fails-route-invalid-compression-method-error
If this is the case, you may need to use:
https://commons.apache.org/proper/commons-compress/zip.html
(maybe worth looking at the “Known Interoperability Problems” section
of the above doc)
May be helpful: https://stackoverflow.com/a/76321625
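One way to check whether DEFLATE64 really is the culprit is to read the compression method recorded in the archive itself. The following is only a rough Java sketch (it can be called from Groovy as-is), assuming the file begins with a local file header, where the method is stored as a 2-byte little-endian value at offset 8. The class and method names are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ZipMethodProbe {

    /**
     * Returns the compression method ID of the first local file header,
     * or -1 if the stream does not start with the "PK\3\4" signature.
     */
    static int firstEntryMethod(InputStream in) throws IOException {
        byte[] header = new byte[10];
        // readFully throws EOFException if fewer than 10 bytes are available
        new DataInputStream(in).readFully(header);
        if (header[0] != 'P' || header[1] != 'K' || header[2] != 3 || header[3] != 4) {
            return -1;
        }
        // Bytes 8 and 9 hold the compression method, little-endian
        return (header[8] & 0xFF) | ((header[9] & 0xFF) << 8);
    }

    /** Maps a few common method IDs from the ZIP format to readable names. */
    static String describe(int method) {
        switch (method) {
            case 0:  return "STORED";
            case 8:  return "DEFLATED";
            case 9:  return "DEFLATE64";
            case 12: return "BZIP2";
            case 14: return "LZMA";
            default: return "other (" + method + ")";
        }
    }
}
```

If this reports anything other than STORED or DEFLATED for the problem file, java.util.zip will not be able to read that entry, and a library such as Commons Compress is probably the way to go.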
HTH
BOB
From: James McMahon <jsmcmah...@gmail.com>
Sent: Saturday, February 17, 2024 4:20 AM
To: users@groovy.apache.org
Subject: Re: Cannot process zip file with Groovy
Hello Paul, and thanks again for taking a moment to look at this. I
tried as you suggested:
- - - - - - - - - -
import java.util.zip.ZipInputStream

def ff = session.get()
if (!ff) return
try {
    ff = session.write(ff, { inputStream, outputStream ->
        def zipInputStream = new ZipInputStream(inputStream)
        def entry = zipInputStream.getNextEntry()
        while (entry != null) {
            entry = zipInputStream.getNextEntry()
        }
        outputStream = inputStream  // the line added as suggested
    } as StreamCallback)
    session.transfer(ff, REL_SUCCESS)
} catch (Exception e) {
    log.error('Error occurred processing FlowFile', e)
    session.transfer(ff, REL_FAILURE)
}
- - - - - - - - - -
Once again it threw this error and failed:
ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error occurred
processing FlowFile:
org.apache.nifi.processor.exception.ProcessException: IOException
thrown from ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
java.util.zip.ZipException: invalid compression method
- Caused by: java.util.zip.ZipException: invalid compression method
It bears repeating: I am able to list and unzip the file at the linux
command line, but cannot get it to work from the script.
What is interesting (and a little frustrating) is that the NiFi
UnpackContent /will/ successfully unzip the zip file. However, the
reason I am trying to do it in Groovy is that UnpackContent exposes
the file metadata for each file in a tar archive - lastModifiedDate,
for example - but it does /not/ do so for files extracted from zips.
And I need that metadata. So here I be.
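For reference, the kind of per-entry metadata meant here is available from java.util.zip.ZipEntry, for example via getTime(). The following is only a rough Java sketch (usable from Groovy), under the assumption that every entry uses a method java.util.zip supports (STORED or DEFLATED); on an archive like the one above it would presumably still fail with the same ZipException. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class ZipEntryTimes {

    /**
     * Walks the archive and returns entry name -> last-modified time
     * (milliseconds since the epoch, or -1 if not recorded), in archive order.
     * The caller keeps ownership of the underlying stream; it is not closed here.
     */
    static Map<String, Long> lastModifiedByName(InputStream in) throws IOException {
        Map<String, Long> result = new LinkedHashMap<>();
        ZipInputStream zis = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            result.put(entry.getName(), entry.getTime());
        }
        return result;
    }
}
```

In a NiFi script the resulting values could then be copied into FlowFile attributes, but that part is left out here.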
Can I explicitly set my (de)compression in the Groovy script? Where
would I do that, and what values does one typically encounter for zip
compression?
Jim
On Thu, Feb 15, 2024 at 9:26 PM Paul King <pa...@asert.com.au> wrote:
What you are doing to read the zip looks okay.
Just a guess, but it could be that because you haven't written to the
output stream, it is essentially a corrupt data stream as far as NiFi
processing is concerned. What happens if you set "outputStream =
inputStream" as the last line of your callback?
Paul.
On Fri, Feb 16, 2024 at 8:48 AM James McMahon
<jsmcmah...@gmail.com> wrote:
>
> I am struggling to build a Groovy script I can run from a NiFi
> ExecuteScript processor to extract from a zip file and stream to a
> tar archive.
>
> I tried to tackle it all at once and made little progress.
> I am now just trying to read the zip file, and am getting this
> error:
>
> ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086] Error
> occurred processing FlowFile:
> org.apache.nifi.processor.exception.ProcessException: IOException
> thrown from
> ExecuteScript[id=ae3e5de5-018d-1000-ff81-b0c807b75086]:
> java.util.zip.ZipException: invalid compression method
> - Caused by: java.util.zip.ZipException: invalid compression method
>
>
> This is my simplified code:
>
>
> import java.util.zip.ZipInputStream
>
> def ff = session.get()
> if (!ff) return
>
> try {
>     ff = session.write(ff, { inputStream, outputStream ->
>         def zipInputStream = new ZipInputStream(inputStream)
>         def entry = zipInputStream.getNextEntry()
>         while (entry != null) {
>             entry = zipInputStream.getNextEntry()
>         }
>     } as StreamCallback)
>
>     session.transfer(ff, REL_SUCCESS)
> } catch (Exception e) {
>     log.error('Error occurred processing FlowFile', e)
>     session.transfer(ff, REL_FAILURE)
> }
>
>
> I am able to list and unzip the file at the linux command line,
> but cannot get it to work from the script.
>
>
> Has anyone had success doing this? Can anyone help me get past
> this error?
>
>
> Thanks in advance.
>
> Jim
>
>