Hi Erik,

We are currently running in a similar manner. To run NiFi like this, there must be some form of data persistence. External volumes or data mounts are needed to persist the data across restarts and container failures.

What we have done is change the default directory location for each of the "repositories" (content, database, provenance, flowfile), the logs, and the flow.xml files. We pointed them all at a separate directory that can then be easily mounted on some persistent volume (in Kubernetes this would be a PVC).

As Bryan said, ZK is used to manage the cluster state (distributed application and all). If ZK is also deployed as a container or in an otherwise ephemeral environment, it should be configured in a similar way, with a persistent data mount of some kind.

Basically, any time you deploy a stateful application in a containerized/ephemeral environment, there has to be some form of data persistence, whatever that looks like for the given environment.
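To make the directory change concrete, here is a rough sketch of how the repository paths in nifi.properties could be rewritten to sit under one mount point. It is not our actual setup: the helper name `relocate_repositories`, the mount path `/opt/nifi/persistent`, and the sub-directory names are made up for illustration. The property keys are the standard NiFi 1.x ones. Logs are configured separately (outside nifi.properties) and are not covered here.

```python
from pathlib import PurePosixPath

# Standard nifi.properties keys (NiFi 1.x) mapped to sub-paths under the mount.
# The sub-path names are illustrative choices, not NiFi defaults.
REPO_PROPERTIES = {
    "nifi.flow.configuration.file": "flow/flow.xml.gz",
    "nifi.database.directory": "database_repository",
    "nifi.flowfile.repository.directory": "flowfile_repository",
    "nifi.content.repository.directory.default": "content_repository",
    "nifi.provenance.repository.directory.default": "provenance_repository",
}


def relocate_repositories(properties_text: str, mount_root: str) -> str:
    """Return nifi.properties text with repository paths moved under mount_root.

    Hypothetical helper: lines for other properties are passed through unchanged.
    """
    root = PurePosixPath(mount_root)
    remaining = dict(REPO_PROPERTIES)
    out = []
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if "=" in line and not is_comment and key in remaining:
            # Replace the value with a path under the persistent mount.
            out.append(f"{key}={root / remaining.pop(key)}")
        else:
            out.append(line)
    # Append any repository keys that were absent from the input.
    for key, sub_path in remaining.items():
        out.append(f"{key}={root / sub_path}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "nifi.web.http.port=8080\n"
        "nifi.database.directory=./database_repository\n"
        "nifi.flowfile.repository.directory=./flowfile_repository\n"
    )
    # /opt/nifi/persistent is an assumed mount point for the volume or PVC.
    print(relocate_repositories(sample, "/opt/nifi/persistent"))
```

With everything under one root, a single volume mount (or PVC) on that path covers the flow and all four repositories.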

Hope this helps,
Ryan H.

On Wed, Jan 16, 2019 at 12:37 PM Erik Anderson <[email protected]> wrote:

> Ok,
>
> I guess the better question to ask is:
>
> If NiFi is running in an ephemeral environment or mode such as Kubernetes
> and Docker with NO external volumes, how can NiFi be backed up and recovered
> in the event of a tragedy?
>
> Why not a simple S3 endpoint integration?
>
> Erik Anderson
> Bloomberg
>
> On Wed, Jan 16, 2019, at 10:25 AM, Bryan Bende wrote:
>
> Hello,
>
> ZooKeeper is not used at all in standalone mode.
>
> In a cluster it is used for leader election, as well as to store state for
> processors that maintain state, such as source processors that keep track
> of a timestamp or id.
>
> -Bryan
>
> On Wed, Jan 16, 2019 at 7:49 AM Erik Anderson <[email protected]> wrote:
>
> Our NiFi is running in a Docker container. I am using some cool ideas from
>
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh#L39
>
> Simply set environment variables, stop and restart the container, and the
> systems are completely reconfigured.
>
> At the moment, I am using an external volume for the nifi/conf directory.
> Inside this directory is so much content. Provenance, flow files, canvas,
> NiFi state?
>
> I know if I kill the container and restart it, because of the external
> Docker volume, the system comes back up in its original state. The NiFi
> runtime is ephemeral but due to the external volume, it all restores.
>
> My question is:
> If I start using ZooKeeper, where even the nifi/conf directory becomes
> ephemeral, what is ZooKeeper doing for me in both the single instance and
> clustered instances of NiFi?
>
> 1) Single Node NiFi
> - Does ZooKeeper store the provenance events?
> - Does ZooKeeper store the state?
> - Does ZooKeeper store the flows and canvas themselves?
> - Does ZooKeeper also store the nifi.properties files and everything
> under nifi/conf?
>
> 2) Cluster mode -
> https://github.com/apache/nifi/blob/master/nifi-docker/dockerhub/sh/start.sh#L34
> - What is ZooKeeper doing in cluster mode that it's not doing in the
> Single Node? Just leader election processes?
>
> Thanks,
> Erik Anderson
> Bloomberg
>
> --
> Sent from Gmail Mobile
