To add to that, you should compress the content before loading into S3
or you will be paying a lot more than you have to.
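As an illustration, here is a minimal sketch (using Python's standard `gzip` and `shutil` modules; the file paths are hypothetical) of compressing a file in a streaming fashion before an S3 upload, so even very large files never need to fit in memory:

```python
import gzip
import shutil

def gzip_file(src_path: str, dst_path: str) -> None:
    """Compress src_path with gzip into dst_path, streaming in chunks
    so the whole file is never loaded into memory."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # copies in fixed-size chunks

# Example (paths are illustrative) -- compress before the upload step:
# gzip_file("export.csv", "export.csv.gz")
```

In NiFi itself the equivalent would be running the flow files through a compression step before the S3 put; for text-heavy CSV the size reduction, and therefore the S3 cost reduction, is usually substantial.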

On Wed, Dec 16, 2020 at 6:49 AM Pierre Villard
<[email protected]> wrote:
>
> Yes, it should work just fine. The relationship backpressure settings are just 
> soft limits: if backpressure is not currently engaged, the upstream processor 
> can still be triggered even if it generates a huge flow file that then causes 
> backpressure to kick in. Backpressure is only checked at trigger time.
>
> Regarding memory, the record processors process data in a streaming 
> fashion; the data is never fully loaded into memory.
>
> Generally speaking, NiFi is agnostic of the data size and can deal with any 
> kind of large/small files.
>
> Hope this helps,
> Pierre
>
>
> Le mer. 16 déc. 2020 à 06:39, naga satish <[email protected]> a écrit :
>>
>> My team designed a NiFi flow to handle CSV files of around 15 GB. But 
>> we later realised that files can be up to 500 GB. I set the queue size limit 
>> to 25 GB. This is a one-time data load to S3. I'm converting each CSV file to 
>> Parquet in NiFi using a ConvertRecord processor. What happens in these 
>> situations? Will NiFi be able to handle this kind of scenario?
>>
>> FYI, my NiFi node has 40 GB of memory and 2 TB of storage.
>>
>> Regards
>> Satish
