Mike, I'm using snappy compression before loading files into S3.
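
For reference, the compress-then-upload step can be as simple as the
following (a minimal sketch outside NiFi, assuming python-snappy and boto3;
the paths, bucket, and key are placeholders):

    import boto3
    import snappy  # python-snappy

    def compress_and_upload(local_path, bucket, key):
        """Snappy-compress a file in streaming fashion, then upload it to S3."""
        compressed_path = local_path + ".snappy"
        # stream_compress reads and writes in blocks, so even a very large
        # file is never held in memory all at once.
        with open(local_path, "rb") as src, open(compressed_path, "wb") as dst:
            snappy.stream_compress(src, dst)
        boto3.client("s3").upload_file(compressed_path, bucket, key)

    compress_and_upload("/data/part-0001.csv", "my-bucket",
                        "landing/part-0001.csv.snappy")

(Inside the flow itself, the equivalent should be a CompressContent
processor set to snappy framed, placed ahead of PutS3Object.)
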
On Wed, 16 Dec 2020, 8:28 pm Mike Thomsen, <[email protected]> wrote:

> To add to that, you should compress the content before loading it into
> S3, or you will be paying a lot more than you have to.
>
> On Wed, Dec 16, 2020 at 6:49 AM Pierre Villard
> <[email protected]> wrote:
> >
> > Yes, it should work just fine. The relationship backpressure settings
> > are just soft limits: if backpressure is not engaged, the upstream
> > processor can be triggered even if it generates a huge flow file that
> > would cause backpressure to engage. The backpressure check happens only
> > at trigger time.
> >
> > Regarding memory, the record processors process data in a streaming
> > fashion; the data is never fully loaded into memory.
> >
> > Generally speaking, NiFi is agnostic to data size and can deal with
> > large and small files alike.
> >
> > Hope this helps,
> > Pierre
> >
> > On Wed, Dec 16, 2020 at 06:39, naga satish <[email protected]>
> > wrote:
> >>
> >> My team designed a NiFi flow to handle CSV files of around 15 GB, but
> >> we later realised that the files can be up to 500 GB. I set the queue
> >> size limit to 25 GB. This is a one-time data load to S3. I'm converting
> >> each CSV file to Parquet in NiFi using a ConvertRecord processor. What
> >> happens in these situations? Can NiFi handle this kind of scenario?
> >>
> >> FYI, my NiFi has 40 GB of memory and 2 TB of storage.
> >>
> >> Regards,
> >> Satish
