Hi Nilesh,

Based on that graph, I think each node in your cluster only has 173GB of content storage, so it makes sense that you're having trouble transferring files greater than 100GB. Depending on which node the FlowFile lands on and what else is going on in the cluster, NiFi may not be able to evict enough other content claims to make room for a 100GB+ file. To do this transfer through NiFi you must increase the content repository on each of your NiFi nodes so that it is larger than 400GB plus a wide safety margin. You should also put the FetchS3 -> putSFTP part of your flow inside a process group and set that group's FlowFile Concurrency to "Single FlowFile Per Node" so only one FlowFile at a time is allowed into the group on each node. I don't know of any other processors or flow settings that would let you stream the transfer in smaller chunks, so I'm afraid there is no flow-only solution to this problem.
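As a rough sketch of the knobs involved (the property names below are the standard nifi.properties content-repository settings; the paths and sizes are only illustrative assumptions, not recommendations for your cluster):

    # nifi.properties, on every node -- illustrative values only
    # Point the content repository at a volume sized well above your largest
    # file, e.g. a 500GB+ volume per pod if you need to move a single 400GB file.
    nifi.content.repository.directory.default=/opt/nifi/content_repository

    # Keep archived claims from holding on to space needed for new content.
    nifi.content.repository.archive.enabled=true
    nifi.content.repository.archive.max.retention.period=1 hour
    nifi.content.repository.archive.max.usage.percentage=50%

Since you're on Kubernetes, the important part is resizing the PersistentVolumeClaim (or other volume) backing the content repository on each replica; raising pod CPU/memory limits alone won't help here.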
Here's some helpful additional reading:
https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418

-Eric

On Tue., May 9, 2023, 9:07 a.m. Kumar, Nilesh, <[email protected]> wrote:

> Hi Eric,
>
> I see the following for my content repository. Can you please help me with how to tweak it further? I have deployed NiFi on K8s as a 3-replica pod cluster with no resource limits, but I guess the pod CPU/memory will be throttled by node capacity itself. I noticed that when I have one single 400GB file, all the load goes to whichever node picks up the transfer. I wanted to know if there is any other way of configuring the flow. If not, please tell me which NiFi settings to tweak.
>
> *From:* Eric Secules <[email protected]>
> *Sent:* Tuesday, May 9, 2023 9:26 PM
> *To:* [email protected]; Kumar, Nilesh <[email protected]>
> *Subject:* [EXT] Re: Need Help in migrating Giant CSV from S3 to SFTP
>
> Hi Nilesh,
>
> Check the size of your content repository. If you want to transfer a 400GB file through NiFi, your content repository must be larger than 400GB; someone else might have a better idea of how much bigger you need, but generally it depends on how many of these big files you want to transfer at the same time. You can check the content repository metrics in the Node Status view, reached from the hamburger menu in the top right corner of the canvas.
>
> -Eric
>
> On Tue., May 9, 2023, 8:42 a.m. Kumar, Nilesh via users, <[email protected]> wrote:
>
> Hi Team,
>
> I want to move a very large file, around 400GB, from S3 to SFTP. I have used listS3 -> FetchS3 -> putSFTP. This works for smaller files up to 30GB but fails for larger (100GB) files. Is there any way to configure this flow so that it handles a very large single file? If any template exists, please share.
>
> My configuration is all standard processor configuration.
>
> Thanks,
>
> Nilesh
