Re: ListSFTP doesn't follow symlinks
Thanks Mark and David. I will watch the Jira issue for updates.

Thanks again!

Regards.
Re: ListSFTP doesn't follow symlinks
Guille,

Thanks for confirming the same behavior on ListSFTP, and thanks Mark for reproducing the issue.

Apparently the problem is in the NiFi SFTPTransfer class, which is not distinguishing between a symbolic link for a file and for a directory. In this case, SFTPTransfer is attempting to process the symlinked file as a directory, causing the error. I have assigned NIFI-6699 and will plan on submitting a pull request to resolve the problem soon.

Thanks again for reporting this issue and helping track down the problem!

Regards,
David Handermann
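The distinction David describes can be illustrated on a local filesystem. The sketch below is not NiFi's SFTPTransfer code and does not use SFTP at all; it is a minimal Python analogy, assuming a platform where `os.symlink` is permitted. The function names `classify` and `demo` and the file names are hypothetical. It shows that a lister has to look at the link's target before deciding whether to descend into an entry or to emit it as a file.

```python
import os
import stat
import tempfile


def classify(path: str) -> str:
    """Classify a path, resolving a symlink to the type of its target."""
    link_info = os.lstat(path)  # lstat does NOT follow symlinks
    if not stat.S_ISLNK(link_info.st_mode):
        # Anything that is not a directory is treated as a file in this sketch.
        return "directory" if stat.S_ISDIR(link_info.st_mode) else "file"
    try:
        target_info = os.stat(path)  # stat DOES follow symlinks
    except FileNotFoundError:
        return "broken symlink"
    if stat.S_ISDIR(target_info.st_mode):
        return "symlink to directory"  # safe to list its children
    return "symlink to file"  # must be emitted as a file, not listed


def demo() -> dict:
    """Build a throwaway tree with both kinds of symlink and classify each entry."""
    with tempfile.TemporaryDirectory() as root:
        real_dir = os.path.join(root, "data")
        real_file = os.path.join(root, "testfile.gz")
        os.mkdir(real_dir)
        with open(real_file, "wb") as handle:
            handle.write(b"")
        os.symlink(real_dir, os.path.join(root, "dir_link"))
        os.symlink(real_file, os.path.join(root, "file_link"))
        names = sorted(os.listdir(root))
        return {name: classify(os.path.join(root, name)) for name in names}


if __name__ == "__main__":
    for name, kind in demo().items():
        print(f"{name}: {kind}")
```

A lister that treated every symlink as a directory would try to list `file_link` and fail, which is consistent with the behavior reported in this thread for symlinked files.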
Re: ListSFTP doesn't follow symlinks
Guille,

Thanks for the extra details.

I just tried again. In my case, all worked as expected when I had a symlink to a directory. But when I had a symlink to a file, I got the same error and stack trace as you. So it looks like we are handling the case properly for symlinked directories but not symlinked files.

Thanks
-Mark
Re: ListSFTP doesn't follow symlinks
Hi, David.

Sorry for the misunderstanding, my fault. Firstly, we tried using ListSFTP and FetchSFTP, and when it didn't work, we tried another option (GetSFTP), and I pasted the wrong stack trace. So, I've done the following tests:

- ListSFTP + FetchSFTP: Error in ListSFTP [1]
- GetSFTP: Error [2]
- Generate flowfile + FetchSFTP with the name of the symlink in the *Remote File* property: OK, the file is downloaded.

So, it seems the issue is in ListSFTP and GetSFTP, but FetchSFTP works fine.

Thanks.
Regards

--
Guille

[1]
2022-02-03 17:36:45,466 ERROR [Timer-Driven Process Thread-8] o.a.nifi.processors.standard.ListSFTP ListSFTP[id=64443154-ac76-1736-9e49-f2ca388dfbdf] Unable to get listing from *.gz; skipping: java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.ListFileTransfer.performListing(ListFileTransfer.java:120)
    at org.apache.nifi.processors.standard.ListSFTP.performListing(ListSFTP.java:150)
    at org.apache.nifi.processors.standard.ListFileTransfer.performListing(ListFileTransfer.java:112)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.listByTrackingTimestamps(AbstractListProcessor.java:750)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.onTrigger(AbstractListProcessor.java:525)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

[2]
2022-02-03 17:39:14,714 ERROR [Timer-Driven Process Thread-27] o.a.nifi.processors.standard.GetSFTP GetSFTP[id=c0300c77-017e-1000--fa9c1f31] Unable to get listing from *.gz; skipping: java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on *.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.GetFileTransfer.fetchListing(GetFileTransfer.java:299)
    at org.apache.nifi.processors.standard.GetFileTransfer.onTrigger(GetFileTransfer.java:126)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at
Re: ListSFTP doesn't follow symlinks
Guille,

I did a quick test on my MacBook and things worked as expected following symlinks. If I set the property to “false” it didn’t get the files. If I set it to “true” it did retrieve the files. Either way it didn’t error, though - just didn’t follow the symlink.

Of course, that’s not to say that there’s not some issue, just that it’s not obviously always broken :)

One thing that I notice in the log message there, though: “Could not perform listing on testfile.gz…” There are a couple of spaces there - it’s not looking for “testfile.gz” but rather “testfile.gz” - are there actually spaces in the filename? Or any other sort of character there, that is perhaps not being properly escaped?

Thanks
-Mark
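Mark's question about stray spaces or unescaped characters in the filename can be checked by making whitespace and non-printable characters visible. The following is a small sketch for that purpose, not anything taken from NiFi; the helper name `suspicious_chars` and the sample names are hypothetical.

```python
import unicodedata


def suspicious_chars(name: str) -> list:
    """Return (index, code point, Unicode name) for whitespace or non-printable characters."""
    found = []
    for index, char in enumerate(name):
        if char.isspace() or not char.isprintable():
            code_point = f"U+{ord(char):04X}"
            # Control characters have no Unicode name, so fall back to a placeholder.
            found.append((index, code_point, unicodedata.name(char, "UNKNOWN")))
    return found


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    for candidate in ["testfile.gz", " testfile.gz", "testfile.gz\u00a0"]:
        print(repr(candidate), suspicious_chars(candidate))
```

Running this against the names returned by a plain directory listing on the server would show whether the name in the log message really contains extra characters.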
Re: ListSFTP doesn't follow symlinks
Hi Guille,

Thanks for raising this issue and providing a stack trace.

You mentioned using ListSFTP and FetchSFTP, but the stack trace references GetFileTransfer, which corresponds to GetSFTP. Can you confirm the same error using FetchSFTP? If you can confirm the same issue with FetchSFTP, it would be very helpful to add those details on the newer Jira issue NIFI-6699.

NiFi SFTP processors switched to a different SSH library after the resolution of NIFI-5560, so it is possible that some changes may be necessary. However, it would be helpful to confirm whether this is an issue with FetchSFTP, GetSFTP, or both processors.

Regards,
David Handermann
ListSFTP doesn't follow symlinks
Hi all,

We are developing a use case in which we have to get some files from a server. We have implemented it with ListSFTP + FetchSFTP in a 3-node cluster running NiFi 1.15.3. But we are having some issues when what we want to get are symlinks instead of files. We have set the *"Follow symlink"* property to true, but we get the same results. Are we doing something wrong? Or is it a bug or a known issue? We have found this issue [1], but it is old and resolved, and this other one [2], that is older and unresolved. We're not sure if they are related to this behaviour or not.

I paste our error log:

2022-02-03 16:27:41,002 ERROR [Timer-Driven Process Thread-18] o.a.nifi.processors.standard.GetSFTP GetSFTP[id=c0300c77-017e-1000--fff-ffa9c1f31] Unable to get listing from testfile.gz; skipping: java.io.FileNotFoundException: Could not perform listing on testfile.gz because could not find the file on the remote server
java.io.FileNotFoundException: Could not perform listing on testfile.gz because could not find the file on the remote server
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:350)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:365)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getListing(SFTPTransfer.java:262)
    at org.apache.nifi.processors.standard.GetFileTransfer.fetchListing(GetFileTransfer.java:299)
    at org.apache.nifi.processors.standard.GetFileTransfer.onTrigger(GetFileTransfer.java:126)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1273)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Thanks in advance
--
Guille

[1] https://issues.apache.org/jira/browse/NIFI-5560
[2] https://issues.apache.org/jira/browse/NIFI-6699
Re: Hashicorp vault transit engine for sensitive properties of processors
Cannon,

There is also an ongoing effort to add a new framework-level component called a Parameter Provider [1], which will allow parameter values to be fetched into a NiFi Parameter Context from an external source. A HashiCorp Vault Key/Value Parameter Provider [2] has been proposed and is being worked on.

In this feature, sensitive parameter values are still encrypted as usual in the actual flow.xml.gz once fetched from the external source, so it doesn't exactly match your original question, but it seems to be one step closer. We hope to include this in a release sometime soon.

[1] https://issues.apache.org/jira/browse/NIFI-8998
[2] https://issues.apache.org/jira/browse/NIFI-9401

Joe
Re: Hashicorp vault transit engine for sensitive properties of processors
Hi Cannon,

The following NiFi Jira issue outlines a previous attempt to integrate flow property storage and retrieval using HashiCorp Vault:

https://issues.apache.org/jira/browse/NIFI-6825

I updated and closed the issue since the description itself outlines an approach that no longer fits well in light of recent improvements to the framework.

Feel free to create a new issue and we can use that going forward.

Regards,
David Handermann

On Thu, Feb 3, 2022 at 12:23 AM Cannon Palms wrote:

> Thanks Joe! Do you know if there is an existing JIRA issue to track such a
> feature proposal (vault integration for flow secrets)? I’d be happy to
> create one if not.
>
> Cannon
>
> On Tue, Feb 1, 2022 at 3:46 PM Joe Gresock wrote:
>
>> Hi Cannon,
>>
>> Both the HashiCorp Vault Transit and Key/Value Sensitive Property
>> Providers are able to protect NiFi's configuration files (e.g.,
>> nifi.properties, login-identity-providers.xml, and authorizers.xml). In
>> the case of the Transit implementation, you would use the encrypt-config.sh
>> tool from the NiFi Toolkit to encrypt properties in these files using the
>> Vault Transit Engine, and these will be decrypted using Vault as NiFi
>> starts up. The process is similar for the Key/Value implementation, but
>> the values of the properties are stored inside the Vault server instead of
>> being encrypted at rest in the configuration files.
>>
>> The properties in the flow.xml.gz file (e.g., your ConsumeMQTT processor
>> password) are protected by a different mechanism, and there is not
>> currently a Vault implementation that protects these.
>>
>> Hope this helps,
>> Joe
>>
>> On Tue, Feb 1, 2022 at 11:05 AM Cannon Palms wrote:
>>
>>> Hello,
>>>
>>> From what I understand from the documentation, the transit engine of
>>> HashiCorp Vault is definitely supported for system properties. It is also
>>> clear that the standard key/value engine of HashiCorp Vault is supported
>>> for sensitive processor properties (e.g. the password used to connect to an
>>> MQTT broker in a ConsumeMQTT processor).
>>>
>>> What I cannot tell is if NiFi supports using the transit engine for
>>> these sensitive properties of processors.
>>>
>>> I'd like to ensure that these properties are encrypted at rest inside of
>>> the registry, but decrypted using the transit engine and a provided vault
>>> encryption key at runtime.
>>>
>>> Is this currently supported? Or is only the standard key/value
>>> engine supported for such properties?
>>>
>>> Thanks,
>>> Cannon
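Joe's description of the two Sensitive Property Provider styles comes down to where the secret lives. The toy model below may help make that difference concrete. It is only a sketch under loud assumptions: `ToyVault` is a hypothetical in-memory stand-in, not the Vault API or NiFi code, its "encryption" is a throwaway XOR keystream that is NOT real cryptography, and the property name and lookup path are invented for illustration. In particular, what a real configuration file holds in the Key/Value case is an assumption here.

```python
import base64
import hashlib


class ToyVault:
    """In-memory stand-in for a Vault server (hypothetical, NOT real cryptography)."""

    def __init__(self, transit_key: bytes):
        self._transit_key = transit_key  # the key never leaves the "server"
        self._kv = {}                    # Key/Value secrets are stored here

    def _keystream(self, length: int) -> bytes:
        # Toy keystream derived from the key; for illustration only.
        stream = b""
        counter = 0
        while len(stream) < length:
            stream += hashlib.sha256(self._transit_key + counter.to_bytes(4, "big")).digest()
            counter += 1
        return stream[:length]

    def transit_encrypt(self, plaintext: str) -> str:
        data = plaintext.encode("utf-8")
        mixed = bytes(a ^ b for a, b in zip(data, self._keystream(len(data))))
        return base64.b64encode(mixed).decode("ascii")

    def transit_decrypt(self, ciphertext: str) -> str:
        mixed = base64.b64decode(ciphertext)
        data = bytes(a ^ b for a, b in zip(mixed, self._keystream(len(mixed))))
        return data.decode("utf-8")

    def kv_put(self, path: str, value: str) -> None:
        self._kv[path] = value

    def kv_get(self, path: str) -> str:
        return self._kv[path]


vault = ToyVault(transit_key=b"toy-key")
password = "example-password"

# Transit style: the configuration file keeps the ciphertext; the vault keeps only the key.
transit_file = {"example.sensitive.property": vault.transit_encrypt(password)}

# Key/Value style: the vault keeps the value; the file keeps only a lookup path (assumed).
vault.kv_put("example/sensitive/property", password)
kv_file = {"example.sensitive.property": "example/sensitive/property"}

# At startup, both styles need the vault to recover the plaintext.
transit_value = vault.transit_decrypt(transit_file["example.sensitive.property"])
kv_value = vault.kv_get(kv_file["example.sensitive.property"])
print(transit_value == password, kv_value == password)
```

In both cases the plaintext is absent from the file on disk; the difference is whether the file carries an encrypted copy or nothing but a pointer. Per the thread, neither mechanism currently applies to the sensitive processor properties stored in flow.xml.gz.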