sgtm. Thanks much for checking. Julien.
On Mon, Jan 30, 2017 at 1:16 PM, Mark D. Roth <r...@google.com> wrote:

> FYI, I chatted with Bill about this today, and he agreed that this
> approach should be fine w.r.t. properly balancing load across the grpclb
> balancers.
>
> The only downside we can think of is that when a client is talking to a
> balancer that goes down, it will take the client longer to connect to a
> different balancer than it probably would if the client were not going
> through a proxy, because the client does not already have other
> subchannels configured that it can try out. Instead, it will need to
> reconnect to the proxy and wait for the proxy to connect to a balancer
> task that is up, which will likely be slower (especially if multiple
> attempts are needed to find a balancer task that is up). However, this is
> probably acceptable.
>
> On Fri, Jan 27, 2017 at 3:53 PM, Mark D. Roth <r...@google.com> wrote:
>
>> (+l...@google.com)
>>
>> During the course of this discussion, it occurred to me that the proposed
>> solution for case 3 may affect how the clients are distributed across the
>> grpclb load balancers.
>>
>> Bill, in this scenario, the client will not know the IP addresses of the
>> balancers directly; instead, it will just know the proxy address and will
>> depend on the proxy to resolve the internal names of the balancers. This
>> means that even if there are many balancers, the client will not have
>> connections to all of them, but every time it connects to the proxy, it's
>> likely to get a different balancer. So if the individual balancer that it
>> happens to be talking to goes down, it will create a new connection to the
>> proxy, which will pick a new balancer task to talk to.
>>
>> I think this is probably fine w.r.t. our conversations about properly
>> balancing load across the balancers, but I want to make sure that there's
>> no problem here that I might be missing. Does this sound reasonable to you?
>>
>> Thanks!
>>
>> On Fri, Jan 27, 2017 at 3:41 PM, Mark D. Roth <r...@google.com> wrote:
>>
>>> On Fri, Jan 27, 2017 at 1:31 PM, Eric Anderson <ej...@google.com> wrote:
>>>
>>>> On Fri, Jan 27, 2017 at 12:16 PM, 'Mark D. Roth' via grpc.io <
>>>> grpc-io@googlegroups.com> wrote:
>>>>
>>>>>> Yes. And that seems to agree with how the different proxy choosing
>>>>>> logic will work; the first primarily consumes hostnames and returns
>>>>>> proxy hostnames (which is http_proxy in C) and the second one
>>>>>> primarily consumes IPs and returns proxy IPs.
>>>>>>
>>>>>
>>>>> I don't think that's actually entirely correct. The first case
>>>>> doesn't consume anything; it unconditionally sets the hostname to be
>>>>> resolved.
>>>>>
>>>>
>>>> The first case will consume a hostname in Java. Observing the hostname
>>>> is necessary to fix the mixed internal/external in an expanded view of
>>>> case 1. Since the Java APIs support that mixed case, Java ends up
>>>> needing to support them. And if C ever needed to support the mixed case
>>>> (which seems likely to me), then it would also need to use the hostname.
>>>>
>>>>
>>>>> And the second case can consume either the hostname or the IP.
>>>>>
>>>>
>>>> And I wouldn't be surprised if only IP were used. We're not aware of a
>>>> user of it.
>>>>
>>>>> This is more philosophical than practical,
>>>>>
>>>>
>>>> My further explanation there was meant to be more philosophical, as an
>>>> explanation that this "special case" is pretty normal and sort of agrees
>>>> with the rest of the design.
>>>>
>>>>>>> But that philosophical debate aside, I think that we should focus on
>>>>>>> case 3, because that's a concrete case that we do want to support. So
>>>>>>> far, at least, I have not heard a workable proposal that does not
>>>>>>> require the proxy mapper to control the CONNECT argument (although
>>>>>>> I'm certainly still open to new proposals).
>>>>>>>
>>>>>>
>>>>>> I've provided two proposals. Neither of which seem debunked as of
>>>>>> yet. I could totally agree they may be worse than what you are
>>>>>> proposing, but the discussion hasn't gotten to that point. The
>>>>>> mentioned security issue of the first proposal seemed to ignore the
>>>>>> fact that a reverse proxy could be used to "protect" the LB, in an
>>>>>> identical fashion to any forward proxy.
>>>>>>
>>>>>
>>>>> I don't quite understand the proposed reverse proxy approach. Can you
>>>>> explain how that would work in more detail?
>>>>>
>>>>
>>>> Case 3 as stated today (for contrasting)
>>>>
>>>>    1. client wants to connect to service.example.com
>>>>    2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you
>>>>       find it is a LB with name lb.example.com
>>>>    3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
>>>>    4. ask the proxy mapper about IP 1.2.3.4, it recognizes the IP as
>>>>       the proxy and says to use "CONNECT service.example.com" via proxy
>>>>       IP 1.2.3.4
>>>>    5. connect to proxy 1.2.3.4, it performs internal resolution of
>>>>       service.example.com and connects to one of the hosts
>>>>
>>>
>>> That's not actually an accurate representation of how case 3 is
>>> proposed to work in the current document. The document is actually
>>> proposing the following:
>>>
>>>    1. client wants to connect to service.example.com
>>>    2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you
>>>       find it is a LB with name lb.example.com
>>>    3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
>>>    4. ask the proxy mapper about IP 1.2.3.4; it recognizes the IP as
>>>       the proxy and says to use "CONNECT lb.example.com" via proxy IP
>>>       1.2.3.4
>>>    5. connect to proxy 1.2.3.4 with "CONNECT lb.example.com"; proxy
>>>       does internal name resolution and connects to one of the load
>>>       balancers
>>>    6. send grpclb request; get response indicating that the backend
>>>       server is 5.6.7.8
>>>    7. ask the proxy mapper about IP 5.6.7.8; it recognizes it as an
>>>       internal IP address and says to use "CONNECT 5.6.7.8" via proxy IP
>>>       1.2.3.4
>>>    8. connect to proxy 1.2.3.4 with "CONNECT 5.6.7.8"; proxy connects
>>>       to the specified backend server
>>>
>>> Remember that the goal of case 3 is to allow client-side per-call load
>>> balancing, despite not being able to resolve the internal names of the
>>> backend servers. Instead of getting those from DNS, we get them from the
>>> grpclb balancer.
>>>
>>>
>>>> Case 3 using reverse proxy for LB
>>>>
>>>>    1. client wants to connect to service.example.com
>>>>    2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you
>>>>       find it is a LB with name lb.example.com
>>>>    3. do a DNS resolution for lb.example.com, get IP 1.2.3.4
>>>>    4. (different starting here) connect to 1.2.3.4, which is a
>>>>       transparent reverse proxy
>>>>    5. Perform an RPC to 1.2.3.4. Host header is lb.example.com. The
>>>>       proxy performs internal mapping of lb.example.com to internal
>>>>       addresses and connects to one of the hosts, forwarding the RPC.
>>>>
>>>
>>> The reverse proxy approach is essentially what I originally suggested
>>> for case 3, but Julien argued that it would be a security problem.
>>>
>>> Keep in mind that in case 3, the grpclb load balancers and the server
>>> backends are in the same internal domain, with the same access
>>> restrictions. If we can't use a reverse proxy to access the server
>>> backends, I don't think we'll be able to do that for the grpclb balancers
>>> either.
>>>
>>> That having been said, as a security issue, Julien can address this
>>> directly.
>>>
>>>
>>>>> I agree that case 3 requires different parts of the system to be
>>>>> coordinated. For example, assuming that your proxy mapper
>>>>> implementation is getting the list of proxy addresses from a local
>>>>> file, you would need to first push an updated list that contains the
>>>>> new proxy address to all clients. Then, once all clients have been
>>>>> updated, you can add the new proxy to DNS.
>>>>>
>>>>
>>>> And the file needs to contain old proxy addresses that should be used
>>>> for detection but not be used.
>>>>
>>>> Okay. So we're on the same page there.
>>>>
>>>>> I agree that this is cumbersome, but I think it's an inherent problem
>>>>> with case 3, because you need *some* way to configure the clients.
>>>>>
>>>>
>>>> I agree you need to be able to configure the clients. I understand that
>>>> something needs to tell the client what to do. My concern was the pain of
>>>> updating the proxy mapping list in concert with name resolution. And
>>>> because of that I would recommend implementors to use the magic IP,
>>>> because it has less operational overhead and less likelihood of failing.
>>>>
>>>>>> If you assume "one proxy" which has "one static IP" and everything is
>>>>>> hard-coded, then the design is fine. But that seems unlikely to
>>>>>> describe a productionized system. And that's why I would feel forced
>>>>>> to use the "magic IP" that it seems you have previously rejected.
>>>>>>
>>>>>
>>>>> There are a couple of reasons that I don't like the "magic IP"
>>>>> approach. First, it requires writing a custom resolver in addition to a
>>>>> custom proxy mapper,
>>>>>
>>>>
>>>> No, I'd just have DNS return the trash IP.
>>>>
>>>
>>> This actually makes me even less happy with the sentinel-value approach,
>>> because now we wouldn't just be using the value internally in a particular
>>> piece of software; we'd actually be publishing it in a way that would be
>>> very confusing when people were trying to debug the system from an
>>> operational perspective. ("Wait, why is the client even attempting to
>>> connect to the proxy, since DNS points it at this bogus IP address?")
>>>
>>>
>>>>> I'm not a big fan of "sentinel" values, since it's often hard to find a
>>>>> value that will never be used in real life.
>>>>>
>>>>
>>>> I would gladly accept a magic value instead of needing to make sure two
>>>> systems stay in sync and rollouts happen properly. And I would quickly
>>>> recommend that to others. And if I started explaining the gotchas of the
>>>> alternative, I'd expect them to quickly be thankful for the
>>>> recommendation since it is less code to write and less operational
>>>> complexity.
>>>>
>>>
>>> I do see your point, but I think that the sentinel-value approach has
>>> operational downsides of its own. There are pros and cons here, so it
>>> basically boils down to a judgement call, and personally, I prefer the
>>> alternative that's currently outlined in the doc.
>>>
>>>
>>> Just thinking out loud here about whether there's another alternative --
>>> this is a purely brainstorming-level idea, so please feel free to shoot
>>> holes in it. What if we had another type of SRV record specifically for
>>> HTTP CONNECT proxy use? The presence of that record would tell the client
>>> to connect to that address and issue a CONNECT request using the
>>> originally looked up name. With that, case 3 would look something like
>>> this:
>>>
>>>    1. client wants to connect to service.example.com
>>>    2. do DNS SRV resolution for _grpclb._tcp.service.example.com; you
>>>       find it is a LB with name lb.example.com
>>>    3. do DNS SRV resolution for _grpc_proxy._tcp.lb.example.com; you
>>>       find it is a proxy with name proxy.example.com
>>>    4. do DNS lookup for proxy.example.com; get IP 1.2.3.4
>>>    5. connect to proxy 1.2.3.4 with "CONNECT lb.example.com"; proxy
>>>       does internal name resolution and connects to one of the load
>>>       balancers
>>>    6. send grpclb request; get response indicating that the backend
>>>       server is 5.6.7.8
>>>    7. ask the proxy mapper about IP 5.6.7.8; it recognizes it as an
>>>       internal IP address and says to use "CONNECT 5.6.7.8" via proxy IP
>>>       1.2.3.4
>>>    8. connect to proxy 1.2.3.4 with "CONNECT 5.6.7.8"; proxy connects
>>>       to the specified backend server
>>>
>>> In this case, there's no proxy mapper involved in the grpclb connection,
>>> only for the backend connections, so the proxy mapper doesn't need to sync
>>> up with the resolver result (which would seem to ameliorate your concern).
>>> The down-sides are that there are more DNS lookups involved, and that we
>>> might need to extend the resolver API so that it can pass down richer
>>> information (not sure about that -- would need to think about this more
>>> fully).
>>>
>>> I'm not sure that this approach is really worth the additional
>>> complexity, but I figured I'd shoot it out there and see what you think.
>>> Thoughts...?
>>>
>>> --
>>> Mark D. Roth <r...@google.com>
>>> Software Engineer
>>> Google, Inc.
>>>
>>
>> --
>> Mark D. Roth <r...@google.com>
>> Software Engineer
>> Google, Inc.
>>
>
> --
> Mark D. Roth <r...@google.com>
> Software Engineer
> Google, Inc.
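
For readers following the thread, the two proxy-mapper decisions in the case 3 flow quoted above (step 4 for the balancer connection, step 7 for the backend connection) can be sketched roughly as follows. This is only an illustration under stated assumptions, not gRPC's actual proxy mapper interface: the class name `Case3ProxyMapper`, the `ProxyDecision` tuple, the `map_address` method, and the internal range 5.6.7.0/24 are all hypothetical, and the static address list stands in for the "local file" mentioned in the discussion.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class ProxyDecision(NamedTuple):
    proxy_address: str   # IP of the HTTP CONNECT proxy to dial
    connect_target: str  # argument to send in the CONNECT request


class Case3ProxyMapper:
    """Hypothetical mapper for case 3; names and shape are assumptions."""

    def __init__(self, proxy_ips: List[str], internal_networks: List[str]):
        # Known proxy addresses (e.g. loaded from a local file).
        self._proxy_ips = [ipaddress.ip_address(ip) for ip in proxy_ips]
        # Address ranges that are only reachable through the proxy.
        self._internal = [ipaddress.ip_network(n) for n in internal_networks]

    def map_address(self, resolved_ip: str,
                    server_name: str) -> Optional[ProxyDecision]:
        ip = ipaddress.ip_address(resolved_ip)
        if ip in self._proxy_ips:
            # Step 4: DNS returned the proxy's own address, so CONNECT by
            # name and let the proxy resolve the internal balancer.
            return ProxyDecision(str(ip), server_name)
        if any(ip in net for net in self._internal):
            # Step 7: a backend IP returned by grpclb; CONNECT by IP via
            # the first configured proxy.
            return ProxyDecision(str(self._proxy_ips[0]), str(ip))
        # Anything else is reachable directly, without a proxy.
        return None


if __name__ == "__main__":
    mapper = Case3ProxyMapper(["1.2.3.4"], ["5.6.7.0/24"])
    print(mapper.map_address("1.2.3.4", "lb.example.com"))
    print(mapper.map_address("5.6.7.8", "lb.example.com"))
```

The sketch also shows the coupling discussed in the thread: the first branch only works if the mapper's proxy list stays in sync with what DNS returns for lb.example.com.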