Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread Guillermo Muñoz Salgado
Thanks Mark and David.

I will watch the Jira issue for updates.

Thanks again!

Regards.


Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread David Handermann
Guille,

Thanks for confirming the same behavior on ListSFTP, and thanks Mark for
reproducing the issue.

Apparently the problem is in the NiFi SFTPTransfer class, which is not
distinguishing between a symbolic link for a file and for a directory.  In
this case, SFTPTransfer is attempting to process the symlinked file as a
directory, causing the error.  I have assigned NIFI-6699 to myself and plan to
submit a pull request to resolve the problem soon.
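
As a rough local illustration of the distinction described above, here is a
minimal sketch. It uses java.nio on a local filesystem rather than the SFTP
client library, and the class, enum, and method names are hypothetical, not
taken from SFTPTransfer. The idea is that a listing routine should resolve
what a link points at before deciding whether to descend into it:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of NiFi.
public class SymlinkEntryKind {

    public enum Kind { FILE, DIRECTORY, SYMLINK_TO_FILE, SYMLINK_TO_DIRECTORY, BROKEN_SYMLINK, OTHER }

    // Classify an entry by looking at both the entry itself and its target.
    public static Kind classify(Path entry) {
        boolean isLink = Files.isSymbolicLink(entry);
        // isDirectory/isRegularFile follow links by default, so they describe the target.
        if (Files.isDirectory(entry)) {
            return isLink ? Kind.SYMLINK_TO_DIRECTORY : Kind.DIRECTORY;
        }
        if (Files.isRegularFile(entry)) {
            return isLink ? Kind.SYMLINK_TO_FILE : Kind.FILE;
        }
        // A link whose target does not exist.
        if (isLink && !Files.exists(entry)) {
            return Kind.BROKEN_SYMLINK;
        }
        return Kind.OTHER;
    }

    // Only directories (or symlinked directories, when following links) are listed recursively.
    public static boolean shouldDescend(Path entry, boolean followSymlinks) {
        Kind kind = classify(entry);
        return kind == Kind.DIRECTORY || (followSymlinks && kind == Kind.SYMLINK_TO_DIRECTORY);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("symlink-demo");
        Path file = Files.createFile(root.resolve("data.gz"));
        Path dir = Files.createDirectory(root.resolve("subdir"));
        Path fileLink = Files.createSymbolicLink(root.resolve("data-link.gz"), file);
        Path dirLink = Files.createSymbolicLink(root.resolve("subdir-link"), dir);

        System.out.println(fileLink.getFileName() + " -> " + classify(fileLink)
                + ", descend=" + shouldDescend(fileLink, true));
        System.out.println(dirLink.getFileName() + " -> " + classify(dirLink)
                + ", descend=" + shouldDescend(dirLink, true));
    }
}
```

With a check along these lines, a symlink to a file would be emitted as a file,
and only directories or symlinked directories would be listed recursively.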

Thanks again for reporting this issue and helping track down the problem!

Regards,
David Handermann


Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread Mark Payne
Guille,

Thanks for the extra details.

I just tried again. In my case, all worked as expected when I had a symlink to 
a directory. But when I had a symlink to a file, I got the same error and stack 
trace as you. So it looks like we handle the case properly for symlinked
directories but not for symlinked files.
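
For anyone who wants a local reference for the expected behavior, the sketch
below builds a small directory tree containing a symlink to a file and a
symlink to a directory, then lists regular files with and without following
links. It is only an analogue on the local filesystem using java.nio; the
class name, method name, and file names are made up for the example and it
does not exercise the NiFi SFTP code path:

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
public class SymlinkListingDemo {

    // Returns the relative paths of all regular files under root.
    // When followSymlinks is false, symlinks are neither descended into nor reported.
    public static List<String> listFiles(Path root, boolean followSymlinks) throws IOException {
        FileVisitOption[] walkOptions = followSymlinks
                ? new FileVisitOption[] {FileVisitOption.FOLLOW_LINKS}
                : new FileVisitOption[0];
        LinkOption[] typeOptions = followSymlinks
                ? new LinkOption[0]
                : new LinkOption[] {LinkOption.NOFOLLOW_LINKS};
        try (Stream<Path> entries = Files.walk(root, walkOptions)) {
            return entries
                    .filter(p -> Files.isRegularFile(p, typeOptions))
                    .map(p -> root.relativize(p).toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("listing-demo");
        Path target = Files.createFile(root.resolve("target.gz"));
        Path realDir = Files.createDirectory(root.resolve("real"));
        Files.createFile(realDir.resolve("inner.gz"));
        Files.createSymbolicLink(root.resolve("link-to-file.gz"), target);
        Files.createSymbolicLink(root.resolve("link-to-dir"), realDir);

        System.out.println("follow=false: " + listFiles(root, false));
        System.out.println("follow=true:  " + listFiles(root, true));
    }
}
```

When links are followed, both the symlinked file and the files inside the
symlinked directory should appear in the listing, which is the behavior one
would expect from a listing processor with symlink following enabled.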

Thanks
-Mark




Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread Guillermo Muñoz
Hi, David.

Sorry for the misunderstanding; that was my fault. We first tried ListSFTP
and FetchSFTP, and when that didn't work we tried another option (GetSFTP),
so I pasted the wrong stack trace. I have now run the following tests:

   - ListSFTP + FetchSFTP: error in ListSFTP [1]
   - GetSFTP: error [2]
   - GenerateFlowFile + FetchSFTP with the name of the symlink in the
   *Remote File* property: OK, the file is downloaded.

So it seems the issue is in ListSFTP and GetSFTP, while FetchSFTP works
fine.

Thanks. Regards

--
Guille

[1]
2022-02-03 17:36:45,466 ERROR [Timer-Driven Process Thread-8] o.a.nifi.processors.standard.ListSFTP ListSFTP[id=64443154-ac76-1736-9e49-f2ca388dfbdf] Unable to get listing from  *.gz; skipping: java.io.FileNotFoundException: Could not perform listing on  *.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on  *.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.ListFileTransfer.performListing(ListFileTransfer.java:120)
    at org.apache.nifi.processors.standard.ListSFTP.performListing(ListSFTP.java:150)
    at org.apache.nifi.processors.standard.ListFileTransfer.performListing(ListFileTransfer.java:112)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.listByTrackingTimestamps(AbstractListProcessor.java:750)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.onTrigger(AbstractListProcessor.java:525)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)


[2]
2022-02-03 17:39:14,714 ERROR [Timer-Driven Process Thread-27] o.a.nifi.processors.standard.GetSFTP GetSFTP[id=c0300c77-017e-1000--fa9c1f31] Unable to get listing from *.gz; skipping: java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.GetFileTransfer.fetchListing(GetFileTransfer.java:299)
    at org.apache.nifi.processors.standard.GetFileTransfer.onTrigger(GetFileTransfer.java:126)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at

Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread Mark Payne
Guille,

I did a quick test on my MacBook and things worked as expected following 
symlinks. If I set the property to “false” it didn’t get the files. If I set it 
to “true” it did retrieve the files. Either way it didn’t error, though - just 
didn’t follow the symlink.

Of course, that’s not to say that there’s not some issue, just that it’s not 
obviously always broken :)

One thing that I notice in the log message there, though:

"Could not perform listing on   testfile.gz…"

There are a couple of spaces there - it’s not looking for “testfile.gz” but
rather “  testfile.gz”. Are there actually spaces in the filename?
Or some other character there that is perhaps not being properly
escaped?
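
One way to check for stray characters like that is to print the name with
everything outside printable ASCII made explicit. The helper below is a
minimal sketch for that purpose; the class and method names are hypothetical
and it is not part of NiFi:

```java
// Hypothetical helper for illustration only; not part of NiFi.
public class FilenameInspector {

    // Keeps printable, non-space ASCII as-is and renders every other
    // code point (spaces, tabs, control or non-ASCII characters) as <U+XXXX>.
    public static String escapeInvisible(String name) {
        StringBuilder sb = new StringBuilder();
        name.codePoints().forEach(cp -> {
            if (cp > 0x20 && cp < 0x7F) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(String.format("<U+%04X>", cp));
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A name with two leading spaces, similar to the one in the log message.
        System.out.println(escapeInvisible("  testfile.gz"));
    }
}
```

Running the name from the remote listing through something like this would
show whether the leading characters are plain spaces or something else.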

Thanks
-Mark





Re: ListSFTP doesn't follow symlinks

2022-02-03 Thread David Handermann
Hi Guille,

Thanks for raising this issue and providing a stack trace.  You mentioned
using ListSFTP and FetchSFTP, but the stack trace references GetFileTransfer,
which corresponds to GetSFTP.

Can you confirm the same error using FetchSFTP?  If you can confirm the
same issue with FetchSFTP, it would be very helpful to add those details on
the newer Jira issue NIFI-6699.

NiFi SFTP processors switched to a different SSH library after the
resolution of NIFI-5560, so it is possible that some changes may be
necessary.  However, it would be helpful to confirm whether this is an
issue with FetchSFTP, GetSFTP, or both processors.

Regards,
David Handermann



ListSFTP doesn't follow symlinks

2022-02-03 Thread Guillermo Muñoz Salgado
Hi all,

We are developing a use case in which we have to get some files from a
server. We have implemented it with ListSFTP + FetchSFTP on a 3-node
cluster running NiFi 1.15.3, but we are having issues when the entries we
want to fetch are symlinks instead of regular files. We have set the
*"Follow symlink"* property to true, but we get the same results. Are we
doing something wrong? Or is it a bug or a known issue? We found this
issue [1], but it is old and resolved, and this other one [2], which is
also old but still unresolved. We're not sure whether they are related to
this behaviour.

Here is our error log:

2022-02-03 16:27:41,002 ERROR [Timer-Driven Process Thread-18] o.a.nifi.processors.standard.GetSFTP GetSFTP[id=c0300c77-017e-1000--fff-ffa9c1f31] Unable to get listing from testfile.gz; skipping: java.io.FileNotFoundException: Could not perform listing on testfile.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on   testfile.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.GetFileTransfer.fetchListing(GetFileTransfer.java:299)
    at org.apache.nifi.processors.standard.GetFileTransfer.onTrigger(GetFileTransfer.java:126)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Thanks in advance
--
Guille

[1] https://issues.apache.org/jira/browse/NIFI-5560
[2] https://issues.apache.org/jira/browse/NIFI-6699


Re: Hashicorp vault transit engine for sensitive properties of processors

2022-02-03 Thread Joe Gresock
Cannon,

There is also an ongoing effort to add a new framework-level component
called a Parameter Provider [1], which will allow parameter values to be
fetched into a NiFi Parameter Context from an external source.  A HashiCorp
Vault Key/Value Parameter Provider [2] has been proposed and is being
worked on.  In this feature, sensitive parameter values are still encrypted as
usual in the actual flow.xml.gz once fetched from the external source, so
it doesn't exactly match your original question, but it seems to be one
step closer.  We hope to include this in a release sometime soon.

[1] https://issues.apache.org/jira/browse/NIFI-8998
[2] https://issues.apache.org/jira/browse/NIFI-9401

Joe



Re: Hashicorp vault transit engine for sensitive properties of processors

2022-02-03 Thread David Handermann
Hi Cannon,

The following NiFi Jira issue outlines a previous attempt to integrate flow
property storage and retrieval using HashiCorp Vault:

https://issues.apache.org/jira/browse/NIFI-6825

I updated and closed the issue since the description itself outlines an
approach that no longer fits well in light of recent improvements to the
framework.

Feel free to create a new issue and we can use that going forward.

Regards,
David Handermann

On Thu, Feb 3, 2022 at 12:23 AM Cannon Palms wrote:

> Thanks Joe! Do you know if there is an existing JIRA issue to track such a
> feature proposal (vault integration for flow secrets)? I’d be happy to
> create one if not.
>
> Cannon
>
> On Tue, Feb 1, 2022 at 3:46 PM Joe Gresock wrote:
>
>> Hi Cannon,
>>
>> Both the HashiCorp Vault Transit and Key/Value Sensitive Property
>> Providers are able to protect NiFi's configuration files (e.g.,
>> nifi.properties, login-identity-providers.xml, and authorizers.xml).  In
>> the case of the Transit implementation, you would use the encrypt-config.sh
>> tool from the NiFi Toolkit to encrypt properties in these files using the
>> Vault Transit Engine, and these will be decrypted using Vault as NiFi
>> starts up.  The process is similar for the Key/Value implementation, but
>> the values of the properties are stored inside the Vault server instead of
>> being encrypted at rest in the configuration files.
>>
>> The properties in the flow.xml.gz file (e.g., your ConsumeMQTT processor
>> password) are protected by a different mechanism, and there is not
>> currently a Vault implementation that protects these.
>>
>> Hope this helps,
>> Joe
>>
>> On Tue, Feb 1, 2022 at 11:05 AM Cannon Palms wrote:
>>
>>> Hello,
>>>
>>> From what I understand from the documentation, the transit engine of
>>> HashiCorp Vault is definitely supported for system properties. It is also
>>> clear that the standard key/value engine of HashiCorp Vault is supported
>>> for sensitive processor properties (e.g. the password used to connect to an
>>> MQTT broker in a ConsumeMQTT processor).
>>>
>>> What I cannot tell is if NiFi supports using the transit engine for
>>> these sensitive properties of processors.
>>>
>>> I'd like to ensure that these properties are encrypted at rest inside of
>>> the registry, but decrypted using the transit engine and a provided vault
>>> encryption key at runtime.
>>>
>>> Is this currently supported? Or is only the standard key/value
>>> engine supported for such properties?
>>>
>>> Thanks,
>>> Cannon
>>>
>>