Hi Jayant,

Running Flink in a Docker container should not, in itself, have an impact on performance. Docker does not employ virtualization. Put simply, Docker containers are processes on the host operating system that are isolated from each other using kernel features. See [1] for a more in-depth discussion.

Whether the state of your Flink application remains consistent when containers are restarted depends on many factors, such as whether you have checkpointing and JobManager HA enabled [2][3]. The checkpoint files also need to remain available for job recovery after a container restart. If you want to use the Docker images published under https://hub.docker.com/_/flink/, you probably want to override the provided flink-conf.yaml by setting the FLINK_CONF_DIR environment variable, so that you can enable a fault-tolerant setup.

Best,
Gary

[1] https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html

On Wed, Dec 6, 2017 at 9:43 AM, Jayant Ameta <wittyam...@gmail.com> wrote:
> Hi,
> I wanted to explore docker-flink (using Ceph for state backend) before
> opting for a standalone cluster.
>
> Has there been any comparative studies on the performance of docker-flink?
> Would the states be consistent and performant if the docker containers go
> down and respawn frequently?
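
Addendum: for illustration, a custom flink-conf.yaml for the fault-tolerant setup described in the reply above might look roughly like the sketch below. It assumes Flink 1.3 configuration keys, a ZooKeeper quorum for JobManager HA, and a shared filesystem mounted at the same path in every container. The host names and the /mnt/shared paths are hypothetical placeholders, not values taken from this thread, so please check the keys against the documentation in [2] and [3] for your Flink version.

```yaml
# Sketch only: host names and paths below are hypothetical placeholders.

# Keep checkpoint data on storage that survives container restarts.
state.backend: filesystem
state.backend.fs.checkpointdir: file:///mnt/shared/flink/checkpoints

# Metadata directory for externalized checkpoints.
state.checkpoints.dir: file:///mnt/shared/flink/ext-checkpoints

# JobManager high availability via ZooKeeper.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.zookeeper.path.root: /flink
high-availability.zookeeper.storageDir: file:///mnt/shared/flink/ha
```

The directory holding this file is what FLINK_CONF_DIR would point to inside the container. In Flink 1.3 the job itself also has to enable checkpointing in its code, so the configuration alone would not be sufficient.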