Thank you, Gary. I know that theoretically there shouldn't be any performance issue. I was curious to know whether any other users have tried out docker-flink and whether they have faced or reported any performance hit. I want real-time processing for some of the events, and was looking for existing users' experience with docker-flink.

Jayant Ameta

On Thu, Dec 7, 2017 at 4:37 PM, Gary Yao <g...@data-artisans.com> wrote:

> Hi Jayant,
>
> Running Flink in a Docker container should not have an impact on the
> performance in itself. Docker does not employ virtualization. To put it
> simply, Docker containers are processes on the host operating system that
> are isolated against each other using kernel features. See [1] for a more
> in-depth discussion.
>
> Whether the state of your Flink Application remains consistent when
> containers get restarted depends on many factors, such as whether you have
> checkpointing and JobManager HA enabled [2][3]. Also the checkpoint files
> still need to be available for job recovery after container restarts.
>
> If you want to use the docker images published under
> https://hub.docker.com/_/flink/, you probably want to overwrite the
> provided flink-conf.yaml by setting the FLINK_CONF_DIR environment
> variable to enable a fault tolerant setup.
>
> Best,
> Gary
>
> [1] https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
> [3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html
>
> On Wed, Dec 6, 2017 at 9:43 AM, Jayant Ameta <wittyam...@gmail.com> wrote:
>
>> Hi,
>> I wanted to explore docker-flink (using Ceph for state backend). before
>> opting for a standalone cluster.
>>
>> Has there been any comparative studies on the performance of
>> docker-flink? Would the states be consistent and performant if the docker
>> containers go down and respawn frequently?