Hi Jayant,

Running Flink in a Docker container should not in itself have an impact on
performance. Docker does not employ virtualization. To put it simply, Docker
containers are processes on the host operating system that are isolated from
each other using kernel features. See [1] for a more in-depth discussion.

Whether the state of your Flink application remains consistent when containers
get restarted depends on many factors, such as whether you have checkpointing
and JobManager HA enabled [2][3]. Also, the checkpoint files still need to be
available for job recovery after container restarts, so they must be stored
somewhere that outlives the container.

If you want to use the Docker images published under
https://hub.docker.com/_/flink/, you probably want to override the provided
flink-conf.yaml, by setting the FLINK_CONF_DIR environment variable, to enable
a fault-tolerant setup.
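
As a rough sketch, a flink-conf.yaml for such a setup might look like the
fragment below. The key names are the ones I would expect from the 1.3
documentation in [2] and [3] and may differ in other versions. The ZooKeeper
hosts and the storage paths are placeholders that I made up for illustration;
the paths need to point at storage that every container can reach and that
survives a container restart.

```yaml
# Sketch only: the hosts and paths below are hypothetical placeholders.

# Write checkpoints to durable storage outside the container.
state.backend: filesystem
state.backend.fs.checkpointdir: file:///mnt/shared/flink/checkpoints

# JobManager HA via ZooKeeper, so a restarted JobManager can recover jobs.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.zookeeper.storageDir: file:///mnt/shared/flink/recovery
```

Note that checkpointing itself still has to be enabled in the job; the
settings above only control where the checkpoint and recovery data end up.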

Best,
Gary

[1] https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html

On Wed, Dec 6, 2017 at 9:43 AM, Jayant Ameta <wittyam...@gmail.com> wrote:

> Hi,
> I wanted to explore docker-flink (using Ceph for state backend) before
> opting for a standalone cluster.
>
> Have there been any comparative studies on the performance of docker-flink?
> Would the states be consistent and performant if the docker containers go
> down and respawn frequently?
>
