I'd recommend looking at Ansible first, then Chef, for the configuration details. LDAP is pretty straightforward to configure unless you have a very nuanced hierarchy of user accounts you want to bring in (for example, one that spans multiple LDAP branches; I was never able to get that to work for a client). If this is a semi-production or full production setup, here is what you'll want to have:
- 3 NiFi nodes to start, each on a separate machine
- 3 ZooKeeper instances
- 1 Registry
- 1 database instance

We run NiFi nodes on dedicated boxes with 8 cores and 16GB of RAM, though we have one or two in our development setup that have 16 cores and 64GB. We use a cheap RDS instance running Postgres to power the database for the Registry. I think that's 1 or 2 cores with about 2-4GB of RAM.

On Mon, Dec 9, 2019 at 1:32 PM Márcio Sugar <[email protected]> wrote:

> Hi,
>
> I'm looking for ways to automate the deployment of a NiFi 1.10 cluster to
> an on-prem virtualized production environment which I have no sudo access
> to.
>
> Our System Engineers prefer to deploy RPM packages on our behalf, and
> that, I believe, would be the easiest way for them. I know NiFi's build has
> an rpm profile, but I'm not sure how to automate the cluster and secure
> configuration. (Also, I'd prefer to avoid having to build NiFi myself.)
>
> The alternatives I can see (some of them could be used together, I guess):
>
> - Post-installation Bash scripts
> - Ansible
> - Terraform
> - Kubernetes (?)
>
> Does anybody in the group have any recommendations or, even better,
> concrete examples you could share? (Pierre Villard's NiFi with OIDC using
> Terraform on the Google Cloud Platform
> <https://pierrevillard.com/2019/08/21/nifi-with-oidc-using-terraform-on-the-google-cloud-platform/>
> is a great source, but it deploys to GCP and uses OpenID Connect (OIDC).
> Our cluster is on-prem and we use LDAP for authentication.)
>
> We are planning to set up a cluster with 3 small nodes at first to gain some
> experience about the load and required resources when running our data
> flows. We don't have many transformations; NiFi is being used just to move
> data from our local systems of record to the cloud. It's for a data warehouse
> project, and High Availability is not a major requirement at this moment.
> In that scenario, what would be better? To have 3 NiFi instances, each with
> an embedded Zookeeper? If so, is it OK to run the NiFi Registry instance on
> one of the nodes? Or should we have a separate node for an external
> (standalone?) Zookeeper and NiFi Registry instances?
>
> Thank you,
>
> Marcio
>
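
As a rough illustration of the three-node layout recommended at the top of this reply, here is a minimal Python sketch that renders the cluster-related lines of nifi.properties for each node. It is the kind of thing an Ansible template or a post-install script would produce, not a complete configuration. The hostnames are hypothetical placeholders, the ZooKeeper instances are assumed to be external and listening on the default client port 2181, and the cluster protocol port 11443 is an arbitrary choice. TLS and LDAP settings are left out entirely.

```python
# Sketch only: hostnames and the protocol port below are assumptions,
# not values taken from any real environment.

def cluster_properties(nifi_hosts, zk_hosts, node_index,
                       zk_port=2181, protocol_port=11443):
    """Return the cluster-related nifi.properties entries for one node."""
    if not 0 <= node_index < len(nifi_hosts):
        raise ValueError("node_index is out of range for nifi_hosts")
    # Every node gets the same ZooKeeper connect string: host:port pairs
    # joined by commas.
    connect_string = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return {
        "nifi.cluster.is.node": "true",
        "nifi.cluster.node.address": nifi_hosts[node_index],
        "nifi.cluster.node.protocol.port": str(protocol_port),
        "nifi.zookeeper.connect.string": connect_string,
    }


def render(props):
    """Format a dict of properties as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Hypothetical hostnames for illustration.
    nifi_hosts = ["nifi1.example.internal",
                  "nifi2.example.internal",
                  "nifi3.example.internal"]
    zk_hosts = ["zk1.example.internal",
                "zk2.example.internal",
                "zk3.example.internal"]
    for index, host in enumerate(nifi_hosts):
        print(f"# {host}")
        print(render(cluster_properties(nifi_hosts, zk_hosts, index)))
        print()
```

Only nifi.cluster.node.address differs between nodes, which is why a single template plus a per-host variable tends to be enough for this part of the setup.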
