Really interesting blog post. Thanks for writing this up!

Just in case you don't know already, you might be interested in the RuncContainerRuntime as well:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/RuncContainers.html

It has some advantages over DockerLinuxContainerRuntime: you don't have to deal with the Docker daemon (a potential bottleneck and root escalation attack vector) or depend on Docker at all, images are distributed via the YARN distributed cache, and you don't have to spend time decompressing images up front, among some other smaller wins.

If you have any questions, I'd be happy to explain the work a little more.

Eric

On Thu, Jul 22, 2021 at 2:59 PM Mithun Mathew <mmat...@uber.com.invalid> wrote:

> Hi all
>
> We wanted to share our story with the community about migrating the
> majority of the Apache Hadoop production fleet at Uber to run in Docker
> containers.
>
> Here's a link to our blog post that we published today:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__eng.uber.com_hadoop-2Dcontainer-2Dblog_&d=DwIBaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=KVdP1SUmHYb-tZP8tcigmw&m=ZCzEveVQ7BrTU-z0LhhshKNnBmVBEE-D2KqytQT_sdg&s=4Hd337BvWYh94n4APYFXMRuc4xP1bUqVMhJEm9kRzF8&e=
>
> Hope our story helps the community in some way, as it has helped us in
> the past for scaling Uber's Hadoop deployment.
>
> On behalf of Uber's Hadoop team,
> *Matt*