We have been using NiFi extensively on AWS for the last 9 months, processing relatively high volumes of data. We have two primary use cases for NiFi: ingesting the data and processing the data. We do that on separate instances with Kafka in the middle.
For just consuming the data we use at least an m4.xlarge because of its “high” network performance. For processing the data, it depends on how many processors we are running and how CPU-intensive they are. We have several custom processors. We take advantage of the “concurrent tasks” option quite a bit, so we try to scale accordingly, going all the way up to the m4.10xlarge at times.

Memory has rarely been an issue.

We add a lot of storage to support the queues, which has been a life saver!

We have run into issues with provenance not keeping up. By using provisioned IOPS for the storage type, we can bump up the IOPS accordingly without increasing storage.

As James mentioned, start small and increase as needed.

Ralph

From: James Wing <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, June 30, 2016 at 8:38 AM
To: "[email protected]" <[email protected]>
Subject: Re: Best EC2 instance type for NiFi

Stephane,

I think too much will depend on the nature of your data and the flow gauntlet you run it through. Out of the box, NiFi can run on a t2.micro, although a modest flow will quickly exceed that. A flow doing a high volume of regular expressions in parallel might benefit from a compute-optimized instance. Some flows with simple processing of many large objects will be bound more by IO than CPU. And the performance of the systems NiFi connects with is likely to be a big factor.

Learning which of these problems you will have requires developing and running the flow for a while. I recommend a general-purpose instance until you scale up enough to know which, if any, specialized instance optimized for compute, memory, or IO would help. You might also consider the disk configurations and provisioned IOPS options there. The great thing about EC2 is that you can start small and trade up to a bigger instance when you know more.
Thanks,

James

On Wed, Jun 29, 2016 at 8:51 PM, Stéphane Maarek <[email protected]> wrote:

Hi,

I'm wondering which instance on AWS EC2 is best suited for NiFi (let's say for a standalone). I'm wondering if it's a compute instance (c4), or something else? and why?

Thanks for your help!
Stephane
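
For anyone who wants the "start on general-purpose, measure, then specialize" advice from this thread in a more mechanical form, here is a rough Python sketch. It is only an illustration: the function name, the three input metrics, and the threshold values are assumptions made up for this example, not anything taken from NiFi or AWS, and real sizing should come from watching your own flow for a while.

```python
def suggest_instance_class(cpu_util, mem_util, io_wait,
                           cpu_busy=0.80, mem_busy=0.80, io_busy=0.30):
    """Suggest an instance class from observed utilization (all values 0.0-1.0).

    The *_busy thresholds are hypothetical defaults chosen for illustration.
    """
    # Express each resource as a fraction of its "busy" threshold,
    # so the three measurements can be compared on one scale.
    pressure = {
        "compute-optimized": cpu_util / cpu_busy,
        "memory-optimized": mem_util / mem_busy,
        "io-optimized or more provisioned IOPS": io_wait / io_busy,
    }
    name, worst = max(pressure.items(), key=lambda kv: kv[1])
    # Nothing is over its threshold yet: stay on a general-purpose instance.
    return name if worst >= 1.0 else "general-purpose"


if __name__ == "__main__":
    # A flow that is mostly waiting on disk, for example provenance falling behind.
    print(suggest_instance_class(cpu_util=0.40, mem_util=0.30, io_wait=0.50))
```

The sketch only considers the NiFi host itself. As noted earlier in the thread, the performance of the systems NiFi connects to can matter just as much, and that is not captured here.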
