Yeah, exactly, co-locating for us means the pods that sit on the same nodes have network access to each other. We aren't using any dynamic overlay networks.
We are excited about Envoy/Istio and its service mesh concepts, so that access between pods is uniform.

EJ

On Sun, Aug 13, 2017 at 1:02 PM <terencek...@gmail.com> wrote:
> EJ,
>
> Awesome info, and I believe I follow the breakdown. For the record, we're nowhere near the scale you're working with - machines or employees :-).
>
> Just one last clarifying question about what you mean by co-locating. I assume that means all the workloads (pods) within a BU's cluster are allowed network-level access to each other. Service access is restricted in other ways (e.g. Athenz), if necessary. Is that right?
>
> Best,
> Terence
>
> On Saturday, August 12, 2017 at 4:21:52 PM UTC-7, EJ Campbell wrote:
> > We have an internal auth mechanism that fronts our services and has also been integrated with RBAC in K8s, which protects almost all our services: https://github.com/yahoo/athenz
> >
> > Network- and host-level ACLs (automated iptables) are used to prevent one cluster from hitting another for our most sensitive systems (and for legacy workloads).
> >
> > To your point below, we did have to set up more coarse-grained ACLs (for the minority of systems that don't support Athenz) when moving to Kubernetes from static data center deployments where workloads don't move. Kubernetes labeling is used to ensure workloads only run on nodes we know have the proper network ACLs.
> >
> > Our business units are pretty broad. We have 100+ separate services from about 30 teams deployed in one K8s cluster in one data center. The rough rule has been: if we give our central ops team in the BU sudo across workloads, we are okay with those workloads co-locating.
> >
> > I do think the more fine-grained isolation the public cloud can support (especially with cluster auto-scaling) provides some advantages over the model we settled on inside our data center. For us, it is nice to really let K8s and our operators bin-pack our workloads optimally by having a larger number of pods and nodes to work with.
> >
> > EJ
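A minimal sketch of the label-based placement EJ describes above, assuming a hypothetical node label such as "network-zone=restricted-acl" (the label, namespace, and image names are illustrative, not their actual setup): nodes that carry the hardened network ACLs get labeled, and sensitive workloads pin themselves to those nodes with a nodeSelector.

# Label the nodes that sit behind the restrictive iptables rules:
#   kubectl label nodes node-17 network-zone=restricted-acl
apiVersion: v1
kind: Pod
metadata:
  name: sensitive-service
  namespace: payments                          # hypothetical namespace
spec:
  nodeSelector:
    network-zone: restricted-acl               # only schedule onto nodes known to have the ACLs
  containers:
  - name: app
    image: registry.example.com/payments/app:1.0   # placeholder image
    ports:
    - containerPort: 8080

Node affinity can express the same constraint with more flexibility, and a taint on the ACL'd nodes would additionally keep unrelated workloads off them.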
> > On Sat, Aug 12, 2017 at 12:42 PM <teren...@gmail.com> wrote:
> > > EJ,
> > >
> > > Thanks for the insight into how you're handling your clusters! That's very helpful. To your question, we are running in one of the public clouds (AWS).
> > >
> > > Since you group clusters by business unit, I have a question about how you handle your network security. Is it the case that:
> > >
> > > (1) All applications deployed by a business unit can share a common set of network egress rules for traffic going outside the cluster, and all pods deployed by a business unit are allowed network access to each other.
> > >
> > > (2) Applications deployed by a business unit have varying security requirements, and not all containers deployed by a business unit can have network access to all the others.
> > >
> > > If (1) is true, then I would assume that each business unit deploys what I would consider an "application" in our world, and you're more or less doing what we're hoping to set up.
> > >
> > > If (2) is true, then I'm curious how you're handling the security part. Are you using a cluster network solution, like Calico, for ingress/egress rules? Are you using a traditional network solution, then ensuring certain pods are only scheduled on nodes that will be subject to those rules? Are you doing something else entirely?
> > >
> > > Thanks in advance!
> > > Terence
> > >
> > > On Friday, August 11, 2017 at 8:12:56 PM UTC-7, EJ Campbell wrote:
> > > > Curious if you are running your setup within a physical data center or in one of the public clouds?
> > > >
> > > > Where I am, we group clusters by large-scale business unit (many applications deployed on behalf of hundreds of engineers) and run three clusters per physical data center: canary, "corp" (special network access), and prod. RBAC is used for authorization among those hundreds of engineers, with cluster admin access for production engineers within that business unit.
> > > >
> > > > The separate clusters per data center, and the canary cluster within a data center, keep the "blast radius" small during deployments, and Chef keeps things manageable between clusters.
> > > >
> > > > EJ
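A rough illustration of the RBAC split EJ mentions, with hypothetical group and namespace names (the groups would come from whatever authenticator the cluster trusts; in their case that is Athenz): each team gets edit rights scoped to its own namespace, while the BU's production engineers get cluster-admin. On 2017-era clusters the apiVersion would be rbac.authorization.k8s.io/v1beta1 rather than v1.

# Team-scoped access: members of "search-team" may edit objects in the "search" namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: search-team-edit
  namespace: search
subjects:
- kind: Group
  name: search-team                 # hypothetical group name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                        # built-in ClusterRole
  apiGroup: rbac.authorization.k8s.io
---
# BU-wide operations: production engineers get cluster-admin across the cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bu-prod-engineers
subjects:
- kind: Group
  name: bu-prod-engineers           # hypothetical group name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io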
> > > > On Fri, Aug 11, 2017 at 6:42 PM <teren...@gmail.com> wrote:
> > > > > Tim,
> > > > >
> > > > > This is a hugely appreciated answer. Sorry the question was so generic!
> > > > >
> > > > > For now, we'll choose separate clusters - understanding the sub-optimal side effects of that decision. We'll revisit the single-or-multiple-cluster question in a year or so, once we've had a reasonable amount of experience running everything on it. I'll bet the work required to run our infrastructure in a single cluster won't seem so daunting then :-).
> > > > >
> > > > > FWIW, we're huge fans of k8s so far, and needing to run a few separate clusters is a pittance compared to the gains.
> > > > >
> > > > > Best,
> > > > > Terence
> > > > >
> > > > > On Friday, August 11, 2017 at 4:55:34 PM UTC-7, Tim Hockin wrote:
> > > > > > This is not an easy question to answer without opining. I'll try.
> > > > > >
> > > > > > Kubernetes was designed to model Borg. This assumes a smaller number of larger clusters, shared across applications and users and environments. This design decision is visible in a number of places in the system - Namespaces, ip-per-pod, Services, PersistentVolumes. We really emphasized the idea that sharing is important, and the consumption of global resources (such as ports on a node) is rare and carefully guarded. The benefits of this are myriad: amortization of overhead, good HA properties, efficient bin-packing, higher utilization, and centralization of cluster administration, just to name a few.
> > > > > >
> > > > > > What we see most often, though, is people using one cluster for one application (typically a number of Deployments, Services, etc., but one logical app). This means that any user of non-trivial size is going to have multiple clusters. This reduces the opportunities for efficiency and overhead amortization, and increases the administrative burden (and likely decreases the depth of understanding any one admin can reach).
> > > > > >
> > > > > > So, why?
> > > > > >
> > > > > > First, Kubernetes is, frankly, missing a few things that would make the Borg model truly viable. Most clouds do not have sub-VM billing abilities. Container security is not well trusted yet (though getting better). Linux's isolation primitives are not perfect (Google has hundreds or thousands of patches), but they are catching up. The story around identity, policy, authorization, and security is not where we want it to be.
> > > > > >
> > > > > > Second, it's still pretty early in the overall life of this system. Best practices are still being developed / discovered. Books are still being written.
> > > > > >
> > > > > > Third, the system is still evolving rapidly. People are not sure how things like multi-tenancy are going to look as they emerge. Siloing is a hedge against uncertainty.
> > > > > >
> > > > > > Fourth, upgrades-in-place are not as easy or robust as we want them to be. It's sometimes easier to just bring up a new cluster and flip the workload over. That is easier when the workload is more contained.
> > > > > >
> > > > > > All that said, I still believe the right eventual model is shared. I fully understand why people are not doing that yet, but I think that in a couple of years' time we will look back on this era as Kubernetes' awkward adolescence.
> > > > > >
> > > > > > Tim
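To make the shared-cluster model Tim describes concrete, here is a sketch of how one tenant might be carved out of a shared cluster with a Namespace plus a ResourceQuota; the names and numbers are hypothetical, not a recommendation.

apiVersion: v1
kind: Namespace
metadata:
  name: team-a-prod                 # hypothetical tenant namespace
---
# Cap what this tenant can consume so no single team can starve the shared cluster.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-prod-quota
  namespace: team-a-prod
spec:
  hard:
    requests.cpu: "40"
    requests.memory: 160Gi
    limits.cpu: "80"
    limits.memory: 320Gi
    pods: "200"

Combined with namespace-scoped RBAC bindings like the ones sketched earlier and with NetworkPolicy (see the sketch at the end of the thread), the namespace becomes the main unit of isolation inside a single shared cluster.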
> > > > > > On Fri, Aug 11, 2017 at 4:38 PM, <teren...@gmail.com> wrote:
> > > > > > > First, I'm sorry if this question has already been asked & answered. My search-foo may have failed me.
> > > > > > >
> > > > > > > We're in the process of moving to k8s, and I'm not confident about how many clusters I should set up. I know there are many possible options, but I'd really appreciate feedback from people running k8s throughout their company.
> > > > > > >
> > > > > > > Nearly everything we run is containerized, and that includes our company-wide internal services like FreeIPA, Gitlab, Jenkins, etc. We also have multiple, completely separate applications with varying security/auditing needs.
> > > > > > >
> > > > > > > Today, we schedule all of our containers via Salt, which only allows containers to be mapped to systems in a fixed way (not great). We have a group of systems for each application environment and one group for internal services. Each group of systems may be subject to different network restrictions, depending on what they're running.
> > > > > > >
> > > > > > > The seemingly obvious answer to replace our setup with k8s clusters is the following configuration:
> > > > > > >
> > > > > > > - Create one cluster for internal services
> > > > > > > - Create one cluster per application, with environments managed by namespaces whenever possible
> > > > > > >
> > > > > > > Great, that puts us with several clusters, but a smaller number of clusters than our previous "system groups". And our network rules will mostly remain as-is.
> > > > > > >
> > > > > > > However, there is another option. It seems that a mix of Calico ingress/egress rules, namespaces, RBAC, and carefully crafted pod resource definitions would allow us to have a single large cluster. Maybe it's just my inexperience, but that path seems daunting.
> > > > > > >
> > > > > > > So, all that background leads me to the simple question: in general, do you create one cluster per application? If not, do you have some other general rule that's not just "when latency or redundancy require it, make a new cluster"?
> > > > > > >
> > > > > > > Thanks in advance!
> > > > > > > Terence
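For the single-large-cluster option described in the original question above, the usual starting point is a per-namespace NetworkPolicy (enforced by a controller such as Calico) that blocks cross-namespace traffic by default. A minimal sketch, with a hypothetical namespace name:

# Allow ingress only from pods in the same namespace; all other ingress is dropped.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: app-one-prod           # hypothetical application namespace
spec:
  podSelector: {}                   # applies to every pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}               # any pod in this same namespace

Egress restrictions follow the same shape with "policyTypes: [Egress]" and "egress" rules; anything the core NetworkPolicy API cannot express (for example cluster-wide or host-level rules) is where Calico's own policy resources come in.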