Hi Mike,
Thank you for your recommendation and details about your cluster. They make me more confident to proceed.
Thanks,
Marcio
On Dec 17, 2019 8:41 AM, Mike Thomsen <[email protected]> wrote:
I'd recommend looking at Ansible first, then Chef, for the configuration details. LDAP is pretty straightforward to configure unless you have a very nuanced hierarchy of user accounts you want to bring in (for example, spanning multiple LDAP branches; I was never able to get that to work for a client). What you'll want to have, if this is a semi-production or full production setup, is as follows:
- 3 nodes to start, each on a separate machine
- 3 ZooKeeper
- 1 Registry
- 1 database instance
We run NiFi nodes on dedicated boxes with 8 cores and 16GB of RAM, though we have one or two running in our development setup that have 16 cores and 64GB of RAM. We use a cheap RDS instance running Postgres to power the database for the Registry. I think that's 1 or 2 cores with about 2-4GB of RAM.
On Mon, Dec 9, 2019 at 1:32 PM Márcio Sugar <[email protected]> wrote:
Hi,
I'm looking for ways to automate the deployment of a NiFi 1.10 cluster to an on-prem virtualized production environment which I have no sudo access to. Our System Engineers prefer to deploy RPM packages on our behalf, and that, I believe, would be the easiest way for them. I know NiFi's build has an rpm profile, but I'm not sure how to automate the cluster and secure configuration. (Also, I'd prefer to avoid having to build NiFi myself.)
The alternatives I can see (some of them could be used together, I guess):
- Post-installation Bash scripts
- Ansible
- Terraform
- Kubernetes (?)
Does anybody in the group have any recommendations or, even better, concrete examples you could share? (Pierre Villard's NiFi with OIDC using Terraform on the Google Cloud Platform is a great source, but it deploys to GCP and uses OpenID Connect (OIDC). Our cluster is on-prem and we use LDAP for authentication.)
We are planning to set up a cluster with 3 small nodes at first to gain some experience with the load and required resources when running our data flows. We don't have many transformations; NiFi is being used just to move data from our local systems of record to the cloud. It's for a data warehouse project, and High Availability is not a major requirement at this moment. In that scenario, what would be better? To have 3 NiFi instances, each with an embedded ZooKeeper? If so, is it OK to run the NiFi Registry instance on one of the nodes? Or should we have a separate node for external (standalone?) ZooKeeper and NiFi Registry instances?
Thank you,
Marcio
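
For anyone following the thread, here is a minimal sketch of how the per-node cluster settings for the layout Mike describes (three NiFi nodes pointing at a separate three-node ZooKeeper ensemble) could be generated before a tool such as Ansible pushes them into each node's nifi.properties. The property keys are standard NiFi ones; the hostnames, the protocol port 11443, and the helper names cluster_overrides and render are assumptions made up for illustration, not anything from Mike's setup.

```python
# Sketch only: builds the cluster-related nifi.properties overrides for each
# node. Hostnames and the protocol port are hypothetical placeholders.

def cluster_overrides(node_host, zk_hosts, zk_port=2181, protocol_port=11443):
    """Return nifi.properties overrides for one node that uses an
    external (not embedded) ZooKeeper ensemble."""
    connect = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return {
        "nifi.cluster.is.node": "true",
        "nifi.cluster.node.address": node_host,
        "nifi.cluster.node.protocol.port": str(protocol_port),
        "nifi.zookeeper.connect.string": connect,
        # External ensemble, so the embedded ZooKeeper stays off.
        "nifi.state.management.embedded.zookeeper.start": "false",
    }


def render(overrides):
    """Format the overrides as key=value lines, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(overrides.items()))


if __name__ == "__main__":
    zk = ["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"]
    for host in ["nifi1.example.internal", "nifi2.example.internal", "nifi3.example.internal"]:
        print(f"# {host}")
        print(render(cluster_overrides(host, zk)))
        print()
```

Only nifi.cluster.node.address differs between nodes here, which is why a templating tool fits well. The LDAP and TLS settings are left out on purpose, since they depend on the directory layout and certificates in use.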
