EJ,

Awesome info, and I believe I follow the breakdown. For the record, we're 
nowhere near the scale you're working with, in either machines or employees :-).

Just one last clarifying question about what you mean by co-locating. I assume 
that means all the workloads (pods) within a BU's cluster are allowed 
network-level access to each other, and service access is restricted by other 
means (e.g., Athenz) where necessary. Is that right?
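
To make that concrete: if the answer is yes, I imagine we'd simply skip applying 
anything like the per-namespace default-deny policy below inside a BU's cluster, 
and lean on Athenz for per-service authorization instead. (This is just my own 
sketch to check my understanding; the namespace name is made up.)

    # Hypothetical default-deny ingress policy, applied per namespace, that a
    # "flat network inside the BU cluster" model would let us avoid.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-ingress
      namespace: team-a            # placeholder namespace
    spec:
      podSelector: {}              # selects every pod in the namespace
      policyTypes:
        - Ingress                  # no ingress rules listed => all ingress denied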

Best,
Terence

On Saturday, August 12, 2017 at 4:21:52 PM UTC-7, EJ Campbell wrote:
> We have an internal auth mechanism that fronts our services and has also been 
> integrated with RBAC into K8s, which protects almost all our services: 
> https://github.com/yahoo/athenz
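
[Aside from me, to check my reading: I take it the Athenz piece sits alongside 
plain Kubernetes RBAC bindings, something in the spirit of the sketch below. The 
names here are invented by me and this is not your actual setup.]

    # Generic RBAC RoleBinding sketch (illustrative only).
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: bu-prod-engineers-edit     # hypothetical binding name
      namespace: team-a                # hypothetical namespace
    subjects:
      - kind: Group
        name: bu-prod-engineers        # hypothetical group asserted by the authn layer
        apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: edit                       # Kubernetes' built-in "edit" ClusterRole
      apiGroup: rbac.authorization.k8s.io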
> 
> 
> Network/host-level ACLs (automated iptables) are used to prevent one 
> cluster from hitting another for our most sensitive systems (and for legacy 
> workloads).
> 
> 
> To your point below, we did have to set up coarser-grained ACLs (for the 
> minority of systems that don't support Athenz) when moving to Kubernetes 
> from static data center deployments where workloads don't move. Kubernetes 
> labeling is used to ensure workloads only run on nodes we know have the 
> proper network ACLs.
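
[Again just checking my understanding of the labeling part: I picture it roughly 
like this, with a label on the ACL'd nodes and a nodeSelector on the workload. 
The label key, values, and workload name are all made up by me.]

    # Nodes behind the stricter network ACLs would be labeled, e.g.:
    #   kubectl label node <node-name> network-acl-tier=sensitive
    # and the workload pinned to them with a nodeSelector:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payments-api                     # placeholder workload
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: payments-api
      template:
        metadata:
          labels:
            app: payments-api
        spec:
          nodeSelector:
            network-acl-tier: sensitive      # only schedule onto ACL'd nodes
          containers:
            - name: payments-api
              image: registry.example.com/payments-api:1.0   # placeholder image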
> 
> 
> Our business units are pretty broad. We have 100+ separate services from 
> about 30 teams deployed in one K8s cluster in one data center. The rough 
> rule has been: if we give the BU's central ops team sudo across workloads, 
> we are okay with those workloads co-locating.
> 
> 
> I do think the finer-grained isolation the public cloud can support 
> (especially with cluster auto-scaling) provides some advantages over the 
> model we settled on inside our data center. For us, it is nice to really let 
> K8s and our operators bin-pack our workloads optimally by having a larger 
> number of pods and nodes to work with.
> 
> 
> EJ
> 
> 
> On Sat, Aug 12, 2017 at 12:42 PM <teren...@gmail.com> wrote:
> EJ,
> 
> Thanks for the insight into how you're handling your clusters! That's very 
> helpful. To your question, we are running in one of the public clouds (AWS).
> 
> Since you group clusters by business unit, I have a question about how you 
> handle your network security. Is it the case that:
> 
> (1) All applications deployed by a business unit can share a common set of 
> network egress rules for traffic going outside the cluster, and all pods 
> deployed by a business unit are allowed network access to each other.
> 
> (2) Applications deployed by a business unit have varying security 
> requirements, and not all containers deployed by a business unit can have 
> network access to all the others.
> 
> If (1) is true, then I would assume that each business unit deploys what I 
> would consider an "application" in our world, and you're more or less doing 
> what we're hoping to set up.
> 
> If (2) is true, then I'm curious how you're handling the security part. Are 
> you using a cluster network solution, like Calico, for ingress/egress rules? 
> Are you using a traditional network solution, then ensuring certain pods are 
> only scheduled on nodes that will be subject to those rules? Are you doing 
> something else entirely?
> 
> Thanks in advance!
> Terence
> 
> On Friday, August 11, 2017 at 8:12:56 PM UTC-7, EJ Campbell wrote:
> 
> > Curious if you are running your setup within a physical data center or in 
> > one of the public clouds?
> >
> > Where I am, we group clusters by large-scale business unit (many 
> > applications deployed on behalf of hundreds of engineers) and run three 
> > clusters per physical data center: canary, "corp" (special network access), 
> > and prod. RBAC is used for authorization among those hundreds of engineers, 
> > with cluster admin access for production engineers within that business 
> > unit.
> >
> > The separate clusters per data center, and the canary cluster within each 
> > data center, keep the "blast radius" small during deployments, and Chef 
> > keeps things manageable between clusters.
> >
> > EJ
> >
> > On Fri, Aug 11, 2017 at 6:42 PM <teren...@gmail.com> wrote:
> >
> > Tim,
> >
> > This is a hugely appreciated answer. Sorry the question was so generic!
> >
> > For now, we'll choose separate clusters, understanding the sub-optimal side 
> > effects of that decision. We'll revisit the single-or-multiple cluster 
> > question in a year or so, once we've had a reasonable amount of experience 
> > running everything on it. I'll bet the work required to run our 
> > infrastructure in a single cluster won't seem so daunting then :-).
> >
> > FWIW, we're huge fans of k8s so far and needing to run a few separate 
> > clusters is a pittance compared to the gains.
> >
> > Best,
> > Terence
> >
> > On Friday, August 11, 2017 at 4:55:34 PM UTC-7, Tim Hockin wrote:
> >
> > > This is not an easy question to answer without opining. I'll try.
> > >
> > > Kubernetes was designed to model Borg. This assumes a smaller number
> > > of larger clusters, shared across applications and users and
> > > environments. This design decision is visible in a number of places
> > > in the system: Namespaces, ip-per-pod, Services, PersistentVolumes.
> > > We really emphasized the idea that sharing is important, and the
> > > consumption of global resources (such as ports on a node) is rare and
> > > carefully guarded. The benefits of this are myriad: amortization of
> > > overhead, good HA properties, efficient bin-packing, higher
> > > utilization, and centralization of cluster administration, just to
> > > name a few.
> > >
> > > What we see most often, though, is people using one cluster for one
> > > application (typically a number of Deployments, Services, etc., but one
> > > logical app). This means that any user of non-trivial size is going
> > > to have multiple clusters. This reduces the opportunities for
> > > efficiency and overhead amortization, and increases the administrative
> > > burden (and likely decreases the depth of understanding any one admin
> > > can reach).
> > >
> > > So, why?
> > >
> > > First, Kubernetes is, frankly, missing a few things that make the Borg
> > > model truly viable. Most clouds do not have sub-VM billing abilities.
> > > Container security is not well trusted yet (though getting better).
> > > Linux's isolation primitives are not perfect (Google has hundreds or
> > > thousands of patches), but they are catching up. The story around
> > > identity and policy and authorization and security is not where we
> > > want it to be.
> > >
> > > Second, it's still pretty early in the overall life of this system.
> > > Best practices are still being developed / discovered. Books are
> > > still being written.
> > >
> > > Third, the system is still evolving rapidly. People are not sure how
> > > things like multi-tenancy are going to look as they emerge. Siloing
> > > is a hedge against uncertainty.
> > >
> > > Fourth, upgrades-in-place are not as easy or robust as we want them to
> > > be. It's sometimes easier to just bring up a new cluster and flip the
> > > workload over. That is easier when the workload is more contained.
> > >
> > > All that said, I still believe the right eventual model is shared. I
> > > fully understand why people are not doing that yet, but I think that
> > > in a couple years' time we will look back on this era as Kubernetes'
> > > awkward adolescence.
> > >
> > > Tim
> > >
> > > On Fri, Aug 11, 2017 at 4:38 PM,   wrote:
> > >
> > > > First, I'm sorry if this question has already been asked & answered. My 
> > > > search-foo may have failed me.
> > > >
> > > > We're in the process of moving to k8s and I'm not confident about how 
> > > > many clusters I should set up. I know there are many possible options, 
> > > > but I'd really appreciate feedback from people running k8s throughout 
> > > > their company.
> > > >
> > > > Nearly everything we run is containerized, and that includes our 
> > > > company-wide internal services like FreeIPA, GitLab, Jenkins, etc. We 
> > > > also have multiple, completely separate applications with varying 
> > > > security/auditing needs.
> > > >
> > > > Today, we schedule all of our containers via Salt, which only allows 
> > > > containers to be mapped to systems in a fixed way (not great). We have 
> > > > a group of systems for each application environment and one group for 
> > > > internal services. Each group of systems may be subject to different 
> > > > network restrictions, depending on what they're running.
> > > >
> > > > The seemingly obvious answer to replace our setup with k8s clusters is 
> > > > the following configuration:
> > > >
> > > > - Create one cluster for internal services
> > > > - Create one cluster per application, with environments managed by 
> > > >   namespaces whenever possible
> > > >
> > > > Great, that puts us at several clusters, but fewer clusters than our 
> > > > previous "system groups". And our network rules will mostly remain 
> > > > as-is.
> > > >
> > > > However, there is another option. It seems that a mix of Calico 
> > > > ingress/egress rules, namespaces, RBAC, and carefully crafted pod 
> > > > resource definitions would allow us to have a single large cluster. 
> > > > Maybe it's just my inexperience, but that path seems daunting.
> > > >
> > > > So, all that background leads me to the simple question: in general, do 
> > > > you create one cluster per application? If not, do you have some other 
> > > > general rule that's not just "when latency or redundancy requires it, 
> > > > make a new cluster"?
> > > >
> > > > Thanks in advance!
> > > > Terence
> > > >

-- 
You received this message because you are subscribed to the Google Groups 
"Kubernetes user discussion and Q&A" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to kubernetes-users+unsubscr...@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.