Hi Jim,

ExecuteStreamCommand only outputs one flowfile per incoming flowfile, so using
it to unzip in this fashion won't yield the results you need.

Instead, you might try a workaround: use ExecuteStreamCommand to unzip your
file and then repackage it with tar.  UnpackContent should then be able to
read the file metadata from the tar.  I have used ExecuteStreamCommand to
execute bash scripts.  An example is shown below, which you can modify for
your needs.  Setting the ExecuteStreamCommand properties
"Command Path=/bin/bash" and "Command Arguments=/path/to/script.sh" is all
you need for this script to work.

#!/bin/bash
tmpzipfile=$(mktemp)
tmptarfile=$(mktemp)
# remove the tmptarfile file; we just need a temporary filename, and will
# recreate it below
rm -f "$tmptarfile"
# create a directory to unzip files to
tmpdir=$(mktemp -d)

# save the incoming flowfile content (stdin) to the temporary zip file
cat /dev/stdin > "$tmpzipfile"
# here is your unzip command to unzip $tmpzipfile to $tmpdir, preserving
# file metadata
# here is your tar command to tar $tmpdir to $tmptarfile
# send the tar file back to NiFi as the outgoing flowfile content (stdout)
cat "$tmptarfile"

# cleanup
rm -f "$tmpzipfile"
rm -f "$tmptarfile"
rm -rf "$tmpdir"
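
For reference, here is one way the two placeholder steps might be filled in.
This is only a sketch, assuming Info-ZIP unzip and GNU tar are installed on
the NiFi host; the function name zip_to_tar is made up for illustration, and
the tar is streamed directly to stdout instead of going through a second
temporary file.

```shell
#!/bin/bash
# Sketch: read a zip archive on stdin, write an equivalent tar archive to stdout.
# zip_to_tar is a hypothetical name, not something NiFi provides.
zip_to_tar() {
    local tmpzip tmpdir status
    tmpzip=$(mktemp) || return 1
    tmpdir=$(mktemp -d) || { rm -f "$tmpzip"; return 1; }

    # unzip cannot read an archive from a pipe, so spool stdin to a file first.
    cat > "$tmpzip"

    # unzip restores each entry's modification time by default; -q keeps it
    # quiet, and any remaining output is sent to stderr so that only the tar
    # stream reaches stdout.
    unzip -q "$tmpzip" -d "$tmpdir" >&2 && tar -cf - -C "$tmpdir" .
    status=$?

    # cleanup
    rm -f "$tmpzip"
    rm -rf "$tmpdir"
    return "$status"
}
```

To use this as /path/to/script.sh, add a final line that calls zip_to_tar.
Because the tar is created from inside the temporary directory, entry names
are relative (./name), and filenames with spaces are carried through as-is.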



On Wed, Jan 31, 2024 at 12:55 PM James McMahon <[email protected]> wrote:

> If anyone can show me how to get my ExecuteStreamCommand configured
> properly as a workaround, I am still interested in that.
> Jim
>
> On Wed, Jan 31, 2024 at 12:39 PM James McMahon <[email protected]>
> wrote:
>
>> I tried to find a Create option for tickets here,
>> https://issues.apache.org/jira/projects/NIFI/issues/NIFI-11859?filter=allopenissues
>> .
>> I did not find one, and suspect maybe I have no such privilege perhaps?
>> In any case, thank you for creating that.
>> Jim
>>
>> On Wed, Jan 31, 2024 at 12:37 PM Joe Witt <[email protected]> wrote:
>>
>>> I went ahead and wrote it up here
>>> https://issues.apache.org/jira/browse/NIFI-12709
>>>
>>> Thanks
>>>
>>> On Wed, Jan 31, 2024 at 10:30 AM James McMahon <[email protected]>
>>> wrote:
>>>
>>>> Happy to do that Joe. How do I create and submit a JIRA for
>>>> consideration? I have not done one - at least, not for years.
>>>> If you get me started, I will do a concise and thorough description in
>>>> the ticket.
>>>> Sincerely,
>>>> Jim
>>>>
>>>> On Wed, Jan 31, 2024 at 12:12 PM Joe Witt <[email protected]> wrote:
>>>>
>>>>> James,
>>>>>
>>>>> Makes sense to create a JIRA to improve UnpackContent to extract these
>>>>> attributes in the event of a zip file that happens to present them.  The
>>>>> concept of lastModifiedDate does appear easily accessed if available in the
>>>>> metadata.  Owner/Creator/Creation information looks less standard in the
>>>>> case of a Zip but perhaps still capturable as extra fields.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Wed, Jan 31, 2024 at 10:01 AM James McMahon <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I tried to use UnpackContent to extract the files within a zip file
>>>>>> named ABC DEF (1).zip. (the filename has spaces in its name).
>>>>>>
>>>>>> UnpackContent seemed to work, but it did not preserve file attributes
>>>>>> from the files in the zip. For example, the lastModifiedTime is not
>>>>>> available so downstream I am unable to do this:
>>>>>> ${file.lastModifiedTime:toDate("yyyy-MM-dd'T'HH:mm:ssZ"):format("yyyyMMddHHmmss")}
>>>>>>
>>>>>> I did some digging and found that on the UnpackContent page, it says:
>>>>>> file.lastModifiedTime  "The date and time that the unpacked file was
>>>>>> last modified (*tar only*)."
>>>>>>
>>>>>> I need these file attributes for those files I extract from the zip.
>>>>>> So as an alternative I tried configuring an ExecuteStreamCommand
>>>>>> processor like this:
>>>>>> Command Arguments  -c;"unzip -p -q < -"
>>>>>> Command Path  /bin/bash
>>>>>> Argument Delimiter   ;
>>>>>>
>>>>>> It throws these errors:
>>>>>>
>>>>>> 16:41:30 UTC ERROR 13023d28-6154-17fd-b4e8-7a30b35980ca
>>>>>> ExecuteStreamCommand[id=13023d28-6154-17fd-b4e8-7a30b35980ca] Failed to
>>>>>> write flow file to stdin due to Broken pipe: java.io.IOException: Broken
>>>>>> pipe
>>>>>>
>>>>>> 16:41:30 UTC ERROR 13023d28-6154-17fd-b4e8-7a30b35980ca
>>>>>> ExecuteStreamCommand[id=13023d28-6154-17fd-b4e8-7a30b35980ca] Transferring
>>>>>> flow file FlowFile[filename=ABC DEF (1).zip] to nonzero status. Executable
>>>>>> command /bin/bash ended in an error: /bin/bash: -: No such file or
>>>>>> directory
>>>>>>
>>>>>> It does not seem to be applying the unzip to the stdin of the ESC
>>>>>> processor. None of the files in the zip archive are output from ESC.
>>>>>>
>>>>>> What needs to be changed in my ESC configuration?
>>>>>>
>>>>>> Thank you in advance for any help.
>>>>>>
>>>>>>
