On Tue, May 16, 2017 at 11:08 AM, Joe Auty <joea...@gmail.com> wrote:
> The only real choice there is "ClientIP", which makes sense in an L4
> context.
>
> But wouldn't the IP need to be forwarded as an HTTP header? How does it
> know what the IP is?
When you use an L7 frontend like GCLB and an L4 Service, it will affine to
the IP of the load balancer -- not what you want! If you were just using
Service load balancers, which are L4, you would have the client IP all the
way through.

> Thanks for these great posts, these concepts are really starting to click
> now!
>
> 'Tim Hockin' via Kubernetes user discussion and Q&A
> May 16, 2017 at 11:45 AM
>
> On Tue, May 16, 2017 at 7:02 AM, Joe Auty <joea...@gmail.com> wrote:
>
> This is very helpful, thanks, this makes sense!
>
> If services are layer 4 though, what does service.spec.sessionAffinity do?
>
> The only real choice there is "ClientIP", which makes sense in an L4
> context.
>
> If I'm understanding you, NGinx and HAProxy become useful things inside
> the cluster to provide layer 7 LB, whereas otherwise a more
> application/pod-specific perspective (including client-IP HTTP headers)
> will be lost with the layer 4 services provided by Kubernetes? If so, I
> guess putting HAProxy or NGinx outside of the cluster would be somewhat
> limited, since traffic would still need to pass through the layer 4
> services?
>
> There's a ton of flexibility in implementation once you go to something
> like nginx or haproxy. They can bypass Service IPs and go straight to
> endpoints, for example, avoiding the second level of LB at the cost of
> managing a Kubernetes API watch.
>
> Most cloud LBs can't go direct to pods, just VMs, which is why we need
> that second-level de-mux.
>
> Rodrigo Campos
> May 15, 2017 at 8:08 PM
>
> On Sunday, May 14, 2017, Joe Auty <joea...@gmail.com> wrote:
>
> Sorry for such a vague subject, but I think I need some help breaking
> things down here.
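For reference, the `service.spec.sessionAffinity` field Tim and Joe are discussing sits directly on the Service manifest. A minimal sketch (the name and selector here are placeholders):

```yaml
# Sketch: L4 client-IP session affinity on a Service.
apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder name
spec:
  selector:
    app: my-app           # placeholder selector
  ports:
  - port: 80
    targetPort: 8080
  sessionAffinity: ClientIP   # the only option besides the default "None"
```

As Tim notes above, behind an L7 frontend the "client IP" this keys on is the load balancer's address, so the affinity is not what you want in that setup.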
>
> I think I understand how the Google layer 7 LBs work (this diagram helped
> me:
> https://storage.googleapis.com/static.ianlewis.org/prod/img/750/gcp-lb-objects2.png),
> I understand NGinx and HAProxy LBs independently, and I believe I also
> understand the concepts of NodePort, Ingress controllers, services, etc.
>
> What I don't understand is why, when I research things like socket.io
> architectures in Kubernetes (for example), or features like IP
> whitelisting, session affinity, etc., I see people putting NGinx or
> HAProxy into their clusters. It is hard for me to keep straight all of
> the different levels of load balancing and their controls:
>
> Google backend services (i.e. the Google LB)
> Kubernetes service LB
> HAProxy/NGinx
>
> The rationale for HAProxy and NGinx seems to involve compensating for
> missing features and/or bugs (kube-proxy, etc.), and it is hard to keep
> straight what the reality is today and what the best path is.
>
> A Service is layer 4, so there is no session affinity with Services --
> except that in some cases annotations (just some keywords in the Service
> YAML) can be used. For example, rudimentary support for some L7 features
> was added via annotations on Service load balancers in AWS.
>
> But as Services are supposed to be L4, it's not really the right layer
> for layer 7 things that are specific to HTTP.
>
> So, then, Ingress comes in. Ingress is layer 7, so it knows about this
> stuff (or can know :)).
>
> That is the chronological evolution of things, which might shed some
> light on why things are the way they are now.
>
> Google's LBs support session affinity, and there are session affinity
> Kubernetes service settings, so for starters, when and why is NGinx or
> HAProxy necessary, and are there outstanding issues with tracking source
> IPs and setting/respecting proper headers?
>
> When you use a Service of type NodePort, it works like this: some port X
> is opened on all the hosts.
> When a packet arrives at that port, it is routed to the appropriate pods
> in a round-robin fashion.
>
> When you create a Service of type LoadBalancer, you want to balance the
> load between pods, not between instances, as there might be more than
> one pod on an instance, or more instances than pods. So, how do you do
> this?
>
> If the load balancer doesn't know about pods but only about instances,
> then you can't give it the task of load balancing between pods. And if
> you do want to give it that responsibility, then it should route to the
> subset of instances running the pods -- and that subset might change
> often.
>
> So it is handled like this: the Service types build one on top of the
> other, so a LoadBalancer is also a NodePort. The load balancer's
> backends are configured as ALL the Kubernetes nodes, using that
> nodePort. When a packet arrives at the LB, some instance is chosen for
> it, and then, on that node, kube-proxy (which runs on all nodes) does
> the trick of forwarding it to a pod in a balanced fashion, possibly on
> another node. As all Services have different nodePorts, it's easy for
> kube-proxy to know which pods a packet is for: it just looks at the
> port the packet arrived on.
>
> But that has a price: the L4 connection, as seen from the pod's point of
> view, might come from a node in the cluster (that is just kube-proxy
> doing the "redirection"), and the real source IP is lost.
>
> In some protocols, like HTTP, there is a header (X-Forwarded-For) that
> can be set, so the client IP the application sees is the real client IP.
> But for other protocols, this can be a problem.
>
> There is a proposal (and an implementation, I think) to keep the source
> IP with Services. I think it basically avoids having the packet arrive
> at a node that isn't running the pod in the first place, to avoid the
> redirection, but I'm not sure. And I think it is only for the Google
> load balancer for now.
>
> Does this clarify something?
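The NodePort/LoadBalancer layering Rodrigo describes shows up in a single manifest. A sketch (names are placeholders; the source-IP mechanism he mentions shipped around that time as a beta annotation and later as the `externalTrafficPolicy` field, so treat the exact spelling as an assumption to check against your cluster version):

```yaml
# Sketch: type: LoadBalancer builds on NodePort -- Kubernetes allocates a
# nodePort and points the cloud LB at every node; kube-proxy then forwards
# to a pod, possibly on another node, which loses the client source IP.
apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder
  # Circa 1.5/1.6, source-IP preservation was a beta annotation:
  # annotations:
  #   service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: LoadBalancer
  selector:
    app: my-app           # placeholder
  ports:
  - port: 80
    targetPort: 8080
  # Later releases expose it as a first-class field; "Local" skips the
  # kube-proxy hop to other nodes, preserving the client IP:
  # externalTrafficPolicy: Local
```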
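And the Ingress object Rodrigo brings up is where the L7 routing rules live. A minimal sketch against the `extensions/v1beta1` API that was current at the time (host and backend names are placeholders):

```yaml
# Sketch: an Ingress routes HTTP by host/path -- concerns an L4 Service
# can't express. An Ingress controller (nginx, GCLB, ...) implements it.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-app            # placeholder
spec:
  rules:
  - host: app.example.com # placeholder host
    http:
      paths:
      - path: /
        backend:
          serviceName: my-app
          servicePort: 80
```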
>
> Sorry if I'm not clear, I'm not good at expressing myself sometimes. So,
> please, let me know if I wasn't clear :)
>
> I'm happy to get into what sort of features I need if this will help
> steer the discussion, but at this point I'm thinking maybe it is best to
> start at a more basic level where you treat me like I'm 6 years old :)
>
> Quite advanced questions for a 6 year old :)
> --
> You received this message because you are subscribed to the Google Groups
> "Kubernetes user discussion and Q&A" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to kubernetes-users+unsubscr...@googlegroups.com.
> To post to this group, send email to kubernetes-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/kubernetes-users.
> For more options, visit https://groups.google.com/d/optout.