James,

E_NOTIMPL means that the feature is not implemented. I can see there is a 
discussion about this on SourceForge, but the details are blocked by my 
employer's firewall.

p7zip / Discussion / Help: E_NOTIMPL for stdin / stdout pipe
https://sourceforge.net/p/p7zip/discussion/383044/thread/8066736d/
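
A likely explanation, as far as I understand the format: a 7z archive keeps its 
main header at the end of the file, so the extractor needs a seekable file and 
cannot work from a pipe. One workaround is to spool the piped bytes to a 
temporary file and point 7za at that. The Python below is only a minimal sketch 
of that idea, not a tested NiFi recipe; the function name extract_via_tempfile 
and the overridable command prefix are assumptions made for illustration.

```python
import os
import subprocess
import tempfile


def extract_via_tempfile(archive_bytes, extractor=("7za", "x", "-so")):
    """Spool archive bytes to a seekable temp file, then extract to STDOUT.

    `extractor` is an assumed command prefix; the temp file path is appended
    as the last argument. Returns whatever the command wrote to STDOUT.
    """
    fd, path = tempfile.mkstemp(suffix=".7z")
    try:
        # Write the piped content to disk so the extractor can seek in it.
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(archive_bytes)
        result = subprocess.run(
            [*extractor, path], stdout=subprocess.PIPE, check=True
        )
        return result.stdout
    finally:
        # Always remove the spool file, even if extraction failed.
        os.remove(path)
```

Wrapped in a small script that reads STDIN and writes the result to STDOUT, 
something like this could stand in for 7za as the Command Path. The output 
would still be one concatenated stream rather than twelve separate files.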

Steve Hindmarch

From: James McMahon <[email protected]>
Sent: 29 September 2022 12:12
To: Hindmarch,SJ,Stephen,VIR R <[email protected]>
Cc: [email protected]
Subject: Re: Can ExecuteStreamCommand do this?

I ran with these Command Arguments in the ExecuteStreamCommand configuration:
x;-si;-so;-spf;-aou
I removed ${filename}; -si tells 7za to read from STDIN and -so to write to 
STDOUT.

The same error is thrown by 7z through ExecuteStreamCommand: Executable command 
/bin/7za ended in an error: ERROR: Can not open the file as an archive  
E_NOTIMPL

I tried this at the command line, getting the same failure:
cat testArchive.7z | 7za x -si -so | dd of=stooges.txt


On Thu, Sep 29, 2022 at 6:44 AM James McMahon <[email protected]> wrote:
Good morning, Steve. Indeed, that second paragraph is exactly how I did get 
this to work. I unpack to disk and then read in the twelve results using a 
GetFile. So far it is working well. It just feels a little wrong to me to do 
this, as I have introduced an extra write to and read from disk, which is going 
to be slower than doing it all in memory within the JVM. While that may not 
seem like anything significant for a single 7z file, as we work across 
thousands and thousands it can be significant.

I am about to try what you suggested above: dropping the ${filename} entirely 
from the STDIN / STDOUT configuration. I realize it is not likely going to give 
me the twelve output flowfiles I'm seeking in the "output stream" path from 
ExecuteStreamCommand. I just want to see if it works without throwing that 
error.

Welcome any other thoughts or comments you may have. Thanks again for your 
comments so far.

Jim

On Thu, Sep 29, 2022 at 5:23 AM <[email protected]> wrote:
James,

I have been thinking more about your problem and this may be the wrong 
approach. If you successfully unpack your files into the flow file content, you 
will still have one output flow file containing the unpacked contents of all of 
your files. If you need 12 separate files in their own flowfiles then you will 
need to find some way of splitting them up. Is there a byte sequence you can 
use in a SplitContent process, or a specific file length you can use in 
SplitText?
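
To make the splitting idea concrete, here is a rough Python sketch of what a 
split on a byte sequence does to one concatenated stream. It is only an 
illustration of the principle, not SplitContent's actual implementation. The 
marker is hypothetical: as far as I know, 7za -so writes the extracted files 
back to back with no separator, so a marker would have to occur naturally in 
the data.

```python
from typing import List


def split_on_marker(content: bytes, marker: bytes) -> List[bytes]:
    """Split concatenated content on a byte marker, dropping empty pieces."""
    if not marker:
        raise ValueError("marker must not be empty")
    return [piece for piece in content.split(marker) if piece]
```

If the unpacked files share no reliable byte sequence, this approach has 
nothing to split on, which is the main risk with it.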

Otherwise you may be better off using ExecuteStreamCommand to unpack the files 
on disk. Run it verbosely and use the output of that step to create a list of 
the locations where your recently unpacked files are. Or create a temporary 
directory to unpack in and fetch all the files in there, cleaning up 
afterwards. Then you can load the files with FetchFile. FetchFile can be 
instructed to delete the file it has just read, so it can also clean up after 
itself.
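
As a rough Python sketch of the temporary-directory variant: unpack into a 
fresh directory, then list what landed there so each path can be fetched and 
deleted. The function name unpack_to_temp_dir and the overridable command 
prefix are assumptions for illustration; -o and -aou are the same 7za switches 
used elsewhere in this thread.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def unpack_to_temp_dir(archive_path, extractor=("7za", "x", "-aou")):
    """Unpack into a fresh temp directory; return (directory, sorted files).

    `extractor` is an assumed command prefix. The output switch -o<dir> and
    the archive path are appended to it.
    """
    dest = Path(tempfile.mkdtemp(prefix="unpacked-"))
    try:
        subprocess.run(
            [*extractor, f"-o{dest}", str(archive_path)],
            stdout=subprocess.DEVNULL,
            check=True,
        )
    except Exception:
        # Do not leave a half-filled directory behind on failure.
        shutil.rmtree(dest, ignore_errors=True)
        raise
    files = sorted(p for p in dest.rglob("*") if p.is_file())
    return dest, files
```

The caller is responsible for removing the directory once every file has been 
picked up, for example with shutil.rmtree(dest).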

Steve Hindmarch

From: stephen.hindmarch.bt.com via users <[email protected]>
Sent: 29 September 2022 09:19
To: [email protected]; [email protected]
Subject: RE: Can ExecuteStreamCommand do this?

James,

Using ${filename} and -si together seems wrong to me. What happens when you try 
that on the command line?

Steve Hindmarch

From: James McMahon <[email protected]>
Sent: 28 September 2022 13:49
To: [email protected]; Hindmarch,SJ,Stephen,VIR R <[email protected]>
Subject: Re: Can ExecuteStreamCommand do this?

Thank you, Steve. I've employed ListFile/FetchFile to load the 7z files into 
the flow. When I have my ESC configured as follows, my unpacked files land in 
the #{unpacked.destination} directory on disk:
Command Arguments            x;${filename};-spf;-o#{unpacked.destination};-aou
Command Path                    /bin/7za
Ignore STDIN                       true
Working Directory                #{unpacked.destination}
Argument Delimiter               ;
Output Destination Attribute  No value set
I get twelve files in my output destination folder.
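
For readers less familiar with the Argument Delimiter setting, the Python 
sketch below shows how I understand the configuration to turn into a process 
invocation: the Command Arguments string is split on the delimiter and each 
piece becomes one argument, with no shell involved. This is an assumption about 
the processor's behaviour rather than its actual code, and build_argv is a 
hypothetical helper name.

```python
def build_argv(command_path, command_arguments, delimiter=";"):
    """Split the arguments string on the delimiter and prepend the command."""
    return [command_path, *command_arguments.split(delimiter)]
```

One consequence, if this model is right, is that shell features such as pipes 
or redirection written into Command Arguments would reach 7za as literal text.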

When I try this one, I get an error and no output:
Command Arguments            x;${filename};-si;-so;-spf;-aou
Command Path                    /bin/7za
Ignore STDIN                       false
Working Directory                #{unpacked.destination}
Argument Delimiter               ;
Output Destination Attribute  No value set

This yields this error...
Executable command /bin/7za ended in an error: ERROR: Can not open the file as 
archive
E_NOTIMPL
...and it yields only one flowfile result in Output Stream, and that is a brief 
text/plain report of the results of the 7za extraction like this:

This indicates it did indeed find my 7z file and it did indeed identify the 12 
files in it, yet still I get no output to my outgoing flow path:
Extracting archive: /parent/subparent/testArchive.7z
- -
Path = /parentdir/subdir/testArchive.7z
Type = 7z
Physical Size = 7204
Headers Size = 298
Method = LZMA2:96k
Solid = +
Blocks = 1

Everything is Ok

Folders: 1
Files: 12
Size: 90238
Compressed: 7204

${filename} in both cases is a fully qualified name to the file, like this: 
/dir/subdir/myTestFile.7z.

I can't seem to get the ESC output stream to be the extracted files. Anything 
jump out at you?

On Wed, Sep 28, 2022 at 8:06 AM stephen.hindmarch.bt.com via users 
<[email protected]> wrote:
Hi James,

I am not in a position to test this right now, but you have to think of the 
flowfile content as STDIN and STDOUT. So with 7zip you need to use the "-si" 
and "-so" flags to ensure there are no files involved. Then if you can load the 
content of a file into a flowfile, eg with GetFile, then you should be able to 
unpack it with ExecuteStreamCommand. Set "Ignore STDIN" = "false".

I have written up my own use case on github. This involves having a Redis 
script as the input, and results of the script as the output.

my-nifi-cluster/experiment-redis_direct.md at main · hindmasj/my-nifi-cluster · 
GitHub
https://github.com/hindmasj/my-nifi-cluster/blob/main/docs/experiment-redis_direct.md

The first part of the post shows how to do it with the input commands on the 
command line, so a bit like you running "7za x ${filename} -so". The second 
part has the script inside the flowfile, where it is treated as STDIN, a bit 
like you doing "7za x -si -so".

See if that helps. Fundamentally, if you do "7za x -si -so < myfile.7z" on the 
command line and see the output on the console, ExecuteStreamCommand will 
behave the same.
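
The STDIN/STDOUT model can be sketched in a few lines of Python: the incoming 
flowfile content is fed to the command's STDIN, and whatever the command writes 
to STDOUT becomes the outgoing content. This is a simplified illustration under 
that assumption, not NiFi's implementation, and run_stream_command is a 
hypothetical name.

```python
import subprocess


def run_stream_command(argv, flowfile_content):
    """Feed bytes to the command's STDIN; return (exit code, STDOUT bytes)."""
    completed = subprocess.run(
        argv, input=flowfile_content, stdout=subprocess.PIPE
    )
    return completed.returncode, completed.stdout
```

Any command that fails when fed from a pipe on the command line should fail in 
the same way here, which is why the console test is a useful first check.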

Steve Hindmarch

From: James McMahon <[email protected]>
Sent: 28 September 2022 12:02
To: [email protected]
Subject: Can ExecuteStreamCommand do this?

I continue to struggle with ExecuteStreamCommand, and am hoping one of you from 
our user community can help me with the following:
1. Can ExecuteStreamCommand be used as I am trying to use it?
2. Can you direct me to an example where ExecuteStreamCommand is configured to 
do something similar to my use case?

My use case:
The incoming flowfiles in my flow path are 7z archives. Based on what I've 
researched so far, NiFi's native processors don't handle unpacking of 7z files.

I want to read the 7z files as STDIN to ExecuteStreamCommand.
I'd like the processor to call out to a 7za app, which will unpack the 7z.
One incoming flowfile will yield multiple output files. Let's say twelve in 
this case.
My goal is to output those twelve as new flowfiles out of ExecuteStreamCommand, 
to its output stream path.

I can't yet get this to work. Best I've been able to do is configure 
ExecuteStreamCommand to unpack ${filename} to a temporary output directory on 
disk. Then I have another path in my flow polling that directory every few 
minutes looking for new data. Am hoping to eliminate that intermediate 
write/read to/from disk by keeping this all within the flow and JVM memory.

Thanks very much in advance for any assistance.
