Re: Docker Image Improvements for Kubernetes

Pierre Villard Thu, 04 Jun 2020 06:47:23 -0700

Hi,

That's really good feedback, and I'm doing similar things with my own
k8s/container-based deployments. Let's see if there is low-hanging
fruit that could easily be added to NiFi. Happy to look at / review pull
requests if there are things you'd like to see upstream.


Thanks,
Pierre

On Thu, 4 Jun 2020 at 13:53, Chris Sampson
<[email protected]> wrote:

> I've been using NiFi's Docker image for a while now and thought a few notes
> from the things we've done might be useful for your work:
>
>    - Using Docker Swarm (NiFi 1.9.2)
>       - Had to add some property file updates as part of a custom
>       Dockerfile build because the start.sh didn't cover them (some of
>       these might have already been addressed):
>          - nifi.cluster.protocol.is.secure needs to be set to true for
>          secure clusters
>          - allow for multiple NODE_IDENTITY entries to be specified in
>          authorizers.xml via environment variables (e.g. NODE_IDENTITY_1,
>          NODE_IDENTITY_2, etc.) - add as "Node Identity" and "Initial
>          User Identity" elements
>          - allow configuration of ldap in authorizers.xml
>             - uncommenting sections of the file
>             - replacing element values/attributes with environment
>             variables
>             - add User Group Providers (we had a composite of LDAP and File
>             based)
>          - update nifi.properties to set `nifi.security.identity.mapping`
>          related properties for LDAP <-> PKI mappings
>          - update nifi.properties to set appropriate
>          `nifi.web.http.network.interface`/`nifi.web.https.network.interface`
>          related entries that were found to be required to enable
>          clustering, site-to-site and external connections in our Swarm
>          setup (hosted across multiple AWS EC2s with two Swarm "networks"
>          in play)
>
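
The nifi.properties updates listed above could be scripted roughly as
follows. This is only a minimal sketch: the environment variable names in
OVERRIDES are hypothetical (they are not taken from the provided start.sh),
and the file is treated as simple key=value text.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Set `key` to `value` in key=value properties text, appending it if absent."""
    pattern = re.compile(rf"^{re.escape(key)}=.*$", flags=re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(text):
        # A function replacement keeps backslashes in `value` literal.
        return pattern.sub(lambda _match: line, text)
    separator = "" if text.endswith("\n") or not text else "\n"
    return f"{text}{separator}{line}\n"


def apply_env_overrides(text: str, env: dict, mapping: dict) -> str:
    """Apply every override whose environment variable is present in `env`."""
    for env_name, prop in mapping.items():
        if env_name in env:
            text = set_property(text, prop, env[env_name])
    return text


# Hypothetical environment variable names mapped to properties named in the thread.
OVERRIDES = {
    "NIFI_CLUSTER_PROTOCOL_IS_SECURE": "nifi.cluster.protocol.is.secure",
    "NIFI_WEB_HTTPS_HOST": "nifi.web.https.host",
}

if __name__ == "__main__":
    sample = "nifi.web.https.host=\nnifi.cluster.protocol.is.secure=false\n"
    print(apply_env_overrides(sample, {"NIFI_CLUSTER_PROTOCOL_IS_SECURE": "true"}, OVERRIDES))
```

In a container entrypoint, `env` would typically be `os.environ` and the
result would be written back to conf/nifi.properties before NiFi starts.
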
> Having been through some of the pain above, we later moved to a Kubernetes
> stack and re-implemented some of our approach. One decision we made was to
> inject properties/configuration files instead of using the environment
> variable replacements via start.sh (because so many things we wanted
> weren't covered and we didn't want to continue trying to update the
> provided start.sh via sed/awk commands in our Dockerfile to add more
> commands as part of the container startup routine).
>
>    - Using Kubernetes (NiFi 1.11.4)
>       - custom Dockerfile that overrides the start.sh scripts to provide:
>          - overwrite of "static" config files injected into the k8s
>          StatefulSet (i.e. everything under conf/ that isn't generated
>          at startup)
>             - we set non-dynamic & non-secure values in these files within
>             our git repo then inject them into the pod
>          - set dynamic properties, e.g. hostnames (for
>          `nifi.web.https.host`), similar to the provided start.sh script
>          but a different set of properties, as what we need is different
>          to what it provides
>          - create nifi-toolkit properties files (e.g. setting `baseUrl` and
>          `proxiedEntity`, etc. based on hostname & env vars)
>          - set secure properties (e.g. encryption.keys) that have been
>          provided as files/env vars by k8s/STS
>          - add "Node Identity"/"Initial User Identity" entries based on the
>          k8s/STS setup (i.e. number of nodes in the cluster)
>          - set up "Initial Admin Identity" (based on env var)
>          - request node & initial admin certificates from a nifi-toolkit
>          instance (running in server mode) then configure them in
>          nifi.properties & nifi-toolkit properties
>          - create "common" keystore & truststore files in a known location
>          with a common password on each cluster node - this is required so
>          we can configure S2S reporting tasks with an SSL Controller
>          Service (that can only take a single file and password
>          combination so has to be common across all nodes)
>          - use nifi-toolkit to encrypt conf files (after they've been
>          updated)
>          - delete unwanted NARs from lib/
>          - download required extra (apache-nifi) NARs
>       - we have persisted volumes for
>          - some logs (that we don't output to STDOUT)
>          - persisted configuration, e.g. flow.xml.gz, users.xml,
>          authorizations.xml
>          - each of the repositories
>
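
Deriving the "Node Identity" entries from the size of the StatefulSet, as
described above, might look something like this. It is a sketch under
assumptions: the StatefulSet name, headless service name, namespace and the
"CN=<host>, OU=NIFI" identity format are illustrative, not taken from the
setup described. It relies only on the Kubernetes convention that
StatefulSet pods are named <statefulset>-<ordinal>, starting at 0.

```python
import xml.etree.ElementTree as ET


def node_dns_names(statefulset: str, service: str, namespace: str, replicas: int) -> list:
    """Stable DNS names of StatefulSet pods behind a headless service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.cluster.local"
        for ordinal in range(replicas)
    ]


def identity_properties(prefix: str, identities: list) -> list:
    """Build <property name="<prefix> N"> elements, numbered from 1."""
    elements = []
    for number, identity in enumerate(identities, start=1):
        element = ET.Element("property", {"name": f"{prefix} {number}"})
        element.text = identity
        elements.append(element)
    return elements


if __name__ == "__main__":
    # Assumed names; in a pod these would come from env vars set by the StatefulSet.
    hosts = node_dns_names("nifi", "nifi-headless", "default", 3)
    # Assumed DN format; it must match the subject of each node's certificate.
    identities = [f"CN={host}, OU=NIFI" for host in hosts]
    for element in identity_properties("Node Identity", identities):
        print(ET.tostring(element, encoding="unicode"))
```

The same helper can be called with "Initial User Identity" as the prefix,
and the resulting elements inserted into the relevant providers in
authorizers.xml.
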
> Retrospectively (things always look wrong when you look back, right? 😊),
> some of the stuff we've done with our custom startup scripts would have
> probably been better as init-containers (e.g. requesting certificates,
> dynamic config changes), but things that might be worth considering from a
> NiFi Docker point of view:
>
>    - cut-down image in terms of NARs with a way to inject/download extra
>    NARs as required at startup/as part of a custom build; but that said,
>    the current base is probably fine and anyone wanting to delete NARs
>    should do so with their own custom build, as we have
>    - providing a "base" set of config files but allowing for overrides
>    using files in a known directory; here I'm thinking mainly of things
>    like bootstrap.conf, where you could have a
>    conf/conf.d/01-bootstrap.conf file to provide extra JVM args, similar
>    to the Elasticsearch jvm.options.d
>    <https://www.elastic.co/guide/en/elasticsearch/reference/current/jvm-options.html>
>    setup
>    - as you already mentioned, more property/config settings via
>    environment variables
>    - ability to change logging config (again could this be done with
>    additional files in a separate directory maybe?)
>
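
To illustrate the conf.d override idea suggested above, here is a rough
sketch of how base settings and numbered fragment files could be merged.
The conf/conf.d layout is the hypothetical one from the suggestion, not an
existing NiFi feature, and the merge rule (fragments applied in file name
order, later values winning) is an assumption.

```python
from pathlib import Path


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def merge_conf_d(base_file, conf_d) -> dict:
    """Overlay *.conf fragments from `conf_d`, in name order, onto the base file."""
    merged = parse_properties(Path(base_file).read_text())
    fragment_dir = Path(conf_d)
    if fragment_dir.is_dir():
        for fragment in sorted(fragment_dir.glob("*.conf")):
            merged.update(parse_properties(fragment.read_text()))
    return merged


if __name__ == "__main__":
    # Hypothetical paths, relative to the NiFi home directory.
    for key, value in sorted(merge_conf_d("conf/bootstrap.conf", "conf/conf.d").items()):
        print(f"{key}={value}")
```

A fragment such as conf/conf.d/01-bootstrap.conf could then override an
existing java.arg.N entry or add a new one without touching the base file.
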
>
> *Chris Sampson*
> IT Consultant
> [email protected]
>
>
>
> On Wed, 3 Jun 2020 at 13:57, Shawn Weeks <[email protected]>
> wrote:
>
> > I’m working on deploying NiFi to Kubernetes and I’ve run across several
> > things that could be improved.
> >
> >
> >   1.  Currently flow.xml.gz is stored in ./conf by default, which has been
> > designated a Docker volume. In Kubernetes, volumes are not pre-populated
> > from the image, so I’m left with some init container magic to copy the
> > contents of ./conf to another volume and then back again; otherwise ./conf
> > is empty. Since we’re configuring everything via environment variables
> > anyway, setting nifi.flow.configuration.file and designating a volume just
> > for flow.xml.gz would solve that. You could even reuse your existing conf
> > volume if you haven’t changed anything.
> >   2.  Expose more variables - NIFI-6232 already exists for this but
> > hasn’t had any work.
> >   3.  Support OpenID Login Provider
> >   4.  Expose logs besides nifi-app.log
> >
> >
> >
>
