On Tue, May 16, 2017 at 11:08 AM, Joe Auty <joea...@gmail.com> wrote:
> The only real choice there is "ClientIP", which makes sense in an L4
> context.
>
> But wouldn't the IP need to be forwarded as an HTTP header? How does it
> know what the IP is?
When you use an L7 frontend like GCLB and an L4 Service, it will affine to
the IP of the load balancer -- not what you want! If you were just using
Service load balancers, which are L4, you would have the client IP all the
way through.

> Thanks for these great posts, these concepts are really starting to click
> now!
>
> 'Tim Hockin' via Kubernetes user discussion and Q&A
> May 16, 2017 at 11:45 AM
>
> On Tue, May 16, 2017 at 7:02 AM, Joe Auty <joea...@gmail.com> wrote:
>
> This is very helpful, thanks, this makes sense!
>
> If services are layer 4 though, what does service.spec.sessionAffinity do?
>
> The only real choice there is "ClientIP", which makes sense in an L4
> context.
>
> If I'm understanding you, NGinx and HAProxy become useful things inside
> the cluster to provide layer 7 LB, whereas otherwise a more
> application/pod-specific perspective (including client-IP HTTP headers)
> will be lost with the layer 4 services provided by Kubernetes? If so, I
> guess putting HAProxy or NGinx outside of the cluster would be somewhat
> limited, since traffic would still need to pass through the layer 4
> services?
>
> There's a ton of flexibility in implementation once you go to something
> like nginx or haproxy. They can bypass Service IPs and go straight to
> endpoints, for example, avoiding the second level of LB at the cost of
> managing a Kubernetes API watch.
>
> Most cloud LBs can't go direct to pods, just VMs, which is why we need
> that second-level de-mux.
>
> Rodrigo Campos
> May 15, 2017 at 8:08 PM
>
> On Sunday, May 14, 2017, Joe Auty <joea...@gmail.com> wrote:
>
> Sorry for such a vague subject, but I think I need some help breaking
> things down here.
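For reference, the `service.spec.sessionAffinity` field Tim and Joe are discussing sits directly on the Service manifest. A minimal sketch (the name and selector here are placeholders):

```yaml
# Sketch: L4 client-IP session affinity on a Service.
apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder name
spec:
  selector:
    app: my-app           # placeholder selector
  ports:
  - port: 80
    targetPort: 8080
  sessionAffinity: ClientIP   # the only option besides the default "None"
```

As Tim notes above, behind an L7 frontend the "client IP" this keys on is the load balancer's address, so the affinity is not what you want in that setup.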
>
> I think I understand how the Google layer 7 LBs work (this diagram helped
> me:
> https://storage.googleapis.com/static.ianlewis.org/prod/img/750/gcp-lb-objects2.png),
> I understand NGinx and HAProxy LBs independently, and I believe I also
> understand the concepts of NodePort, Ingress controllers, services, etc.
>
> What I don't understand is why, when I research things like socket.io
> architectures in Kubernetes (for example), or features like IP
> whitelisting, session affinity, etc., I see people putting NGinx or
> HAProxy into their clusters. It is hard for me to keep straight all of
> the different levels of load balancing and their controls:
>
> Google backend services (i.e. the Google LB)
> Kubernetes service LB
> HAProxy/NGinx
>
> The rationale for HAProxy and NGinx seems to involve compensating for
> missing features and/or bugs (kube-proxy, etc.), and it is hard to keep
> straight what the reality is today and what the best path is.
>
> A Service is layer 4, so there is no session affinity with Services --
> except that in some cases annotations (just some keywords in the Service
> YAML) can be used. For example, rudimentary support for some L7 features
> was added via annotations on Service load balancers in AWS.
>
> But as Services are supposed to be L4, it's not really the right layer
> for layer 7 things that are specific to HTTP.
>
> So, then, Ingress comes in. Ingress is layer 7, so it knows about this
> stuff (or can know :)).
>
> That is the chronological evolution of things, which might shed some
> light on why things are the way they are now.
>
> Google's LBs support session affinity, and there are session affinity
> Kubernetes service settings, so for starters, when and why is NGinx or
> HAProxy necessary, and are there outstanding issues with tracking source
> IPs and setting/respecting proper headers?
>
> When you use a Service of type NodePort, it works like this: some port X
> is opened on all the hosts.
> When a packet arrives at that port, it is routed to the appropriate pods
> in a round-robin fashion.
>
> When you create a Service of type LoadBalancer, you want to balance the
> load between pods, not between instances, as there might be more than
> one pod on an instance, or more instances than pods. So, how do you do
> this?
>
> If the load balancer doesn't know about pods but only about instances,
> then you can't give it the task of load balancing between pods. And if
> you do want to give it that responsibility, then it should route to the
> subset of instances running the pods -- and that subset might change
> often.
>
> So it is handled like this: the Service types build one on top of the
> other, so a LoadBalancer is also a NodePort. The load balancer's
> backends are configured as ALL the Kubernetes nodes, using that
> nodePort. When a packet arrives at the LB, some instance is chosen for
> it, and then, on that node, kube-proxy (which runs on all nodes) does
> the trick of forwarding it to a pod in a balanced fashion, possibly on
> another node. As all Services have different nodePorts, it's easy for
> kube-proxy to know which pods a packet is for: it just looks at the
> port the packet arrived on.
>
> But that has a price: the L4 connection, as seen from the pod's point of
> view, might come from a node in the cluster (that is just kube-proxy
> doing the "redirection"), and the real source IP is lost.
>
> In some protocols, like HTTP, there is a header (X-Forwarded-For) that
> can be set, so the client IP the application sees is the real client IP.
> But for other protocols, this can be a problem.
>
> There is a proposal (and an implementation, I think) to keep the source
> IP with Services. I think it basically avoids having the packet arrive
> at a node that isn't running the pod in the first place, to avoid the
> redirection, but I'm not sure. And I think it is only for the Google
> load balancer for now.
>
> Does this clarify something?
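The NodePort/LoadBalancer layering Rodrigo describes shows up in a single manifest. A sketch (names are placeholders; the source-IP mechanism he mentions shipped around that time as a beta annotation and later as the `externalTrafficPolicy` field, so treat the exact spelling as an assumption to check against your cluster version):

```yaml
# Sketch: type: LoadBalancer builds on NodePort -- Kubernetes allocates a
# nodePort and points the cloud LB at every node; kube-proxy then forwards
# to a pod, possibly on another node, which loses the client source IP.
apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder
  # Circa 1.5/1.6, source-IP preservation was a beta annotation:
  # annotations:
  #   service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: LoadBalancer
  selector:
    app: my-app           # placeholder
  ports:
  - port: 80
    targetPort: 8080
  # Later releases expose it as a first-class field; "Local" skips the
  # kube-proxy hop to other nodes, preserving the client IP:
  # externalTrafficPolicy: Local
```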
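And the Ingress object Rodrigo brings up is where the L7 routing rules live. A minimal sketch against the `extensions/v1beta1` API that was current at the time (host and backend names are placeholders):

```yaml
# Sketch: an Ingress routes HTTP by host/path -- concerns an L4 Service
# can't express. An Ingress controller (nginx, GCLB, ...) implements it.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-app            # placeholder
spec:
  rules:
  - host: app.example.com # placeholder host
    http:
      paths:
      - path: /
        backend:
          serviceName: my-app
          servicePort: 80
```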
>
> Sorry if I'm not clear, I'm not good at expressing myself sometimes. So,
> please, let me know if I wasn't clear :)
>
> I'm happy to get into what sort of features I need if this will help
> steer the discussion, but at this point I'm thinking maybe it is best to
> start at a more basic level where you treat me like I'm 6 years old :)
>
> Quite advanced questions for a 6 year old :)
> --
> You received this message because you are subscribed to the Google Groups
> "Kubernetes user discussion and Q&A" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to kubernetes-users+unsubscr...@googlegroups.com.
> To post to this group, send email to kubernetes-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/kubernetes-users.
> For more options, visit https://groups.google.com/d/optout.