On Thu, Jan 26, 2017 at 12:42 PM, Rudi Chiarito <[email protected]> wrote:

> On Thu, Jan 26, 2017 at 12:52 PM, 'Mark D. Roth' via grpc.io <
> [email protected]> wrote:
>
>> We might be able to ameliorate some of that by allowing some sort of
>> simple pattern-matching language, although I'd prefer to avoid taking an
>> external dependency on a regexp library, so it would probably need to be
>> something very simple -- like maybe simple wildcard matching triggered by a
>> '*' character.  So, for example, if your three different frontend services
>> run on hosts with the names as follows:
>>
>> Frontend service "Foo": foofrontend1, foofrontend2, foofrontend3, ...
>> Frontend service "Bar": barfrontend1, barfrontend2, barfrontend3, ...
>> Frontend service "Baz": bazfrontend1, bazfrontend2, bazfrontend3, ...
>>
>> Then you could select only the frontends from the first service by saying
>> something like "foofrontend*".  But we probably would *not* allow
>> something like "foofrontend[12]" to get just the first two frontends of
>> service "Foo"; instead, you would need to list them separately.  Would
>> something like that be useful in your use case?
>>
>
> In our use case, pods (think Borg tasks) typically have their own IP and
> hostname, like myservice-RANDOMFIVELETTERHASH. The only exception is
> StatefulSets, an abstraction that lets pods in a replica set have unique
> per-instance settings, including, in this case, pod names: myservice-0,
> myservice-1, etc.
>
> So the wildcard syntax would be of some use to us, while the [12] regex
> one wouldn't be yet, until the day we actually start using StatefulSets.
> I'm sure you'll find others that have more control over their IPs and would
> be interested. Even with StatefulSets, I don't think we'd use the feature,
> because they are typically used for services that have a small number of
> replicas.
>
> The subnet notation is interesting, too, but in our case, IP addresses
> are, for most purposes, random. Plus, most people don't want to think about
> addresses. :-)
>
> I would imagine that, in terms of scenarios covered, you'd probably see
> diminishing returns, from highest to lowest:
>
>  - raw list
>  - trailing wildcards
>  - free wildcards (anywhere in the string)
>  - regexes
>  - network masks
>
> I would implement them in that order, but only when actual user demand
> materialises.
>

Okay, it sounds like we should add the ability to select based on the
client hostname for now, and then wait to see if we need the other options
later.  I've added this to the doc.


>
> I agree with you that bloat is not just a hypothetical concern. That's why
> I suggested that pushes progress e.g. from one host, to two hosts, then
> three, then through percentages. One twist I forgot to mention is that once
> a client has been picked through explicit mention, the config should be
> sticky, i.e. it should keep the new one and shouldn't roll the dice when
> the config changes to a percentage. I guess you would have the same issue
> when you ramp from e.g. 1% to 10%: do you really want a client to
> potentially alternate between new and old config whenever the percentage
> changes? Unless you require people to always canary at only X%, then go
> straight to 100%.
>

It sounds like the doc needs to be more explicit about the semantics of the
selector fields, especially when they're used in combination.  Here's how I
am expecting that it will work.

In order for a config choice to be selected, all of the selectors must be
considered a match for the client.  If a selector field is unset (or is set
to an empty list), then it is considered a match for all clients.  If a
selector field is non-empty, then the client must match the value (or, in
the case of a list, one of the values) in order to be considered a match.

In other words, the code to determine which choice to use will look
something like this (pseudo-code):

for each choice {
  for each selector {
    if selector does not match, then skip to next choice
  }
  if we are still here (i.e., all selectors matched), then use this choice
}
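
For anyone who prefers something runnable, here is a minimal Python sketch of
the same loop. It is only an illustration, not the actual implementation: the
field names ("client_hostname", "percentage", "config") and the client's
"roll" value (a number in [0, 100) that the client picks for itself) are
assumptions made up for this example and are not taken from the doc.

```python
from typing import Any, Dict, List, Optional

Choice = Dict[str, Any]
Client = Dict[str, Any]


def hostname_matches(selector_value: Optional[List[str]], client: Client) -> bool:
    # An unset selector or an empty list matches every client.
    if not selector_value:
        return True
    # Otherwise the client must match one of the listed values.
    return client["hostname"] in selector_value


def percentage_matches(selector_value: Optional[int], client: Client) -> bool:
    # An unset selector matches every client.
    if selector_value is None:
        return True
    # Assumption: the client holds a "roll" in [0, 100) and matches
    # when that roll falls below the configured percentage.
    return client["roll"] < selector_value


# Hypothetical selector fields, each mapped to its matching rule.
SELECTORS = {
    "client_hostname": hostname_matches,
    "percentage": percentage_matches,
}


def select_choice(choices: List[Choice], client: Client) -> Optional[Choice]:
    for choice in choices:
        # Every selector must match; one mismatch skips to the next choice.
        if all(matches(choice.get(field), client)
               for field, matches in SELECTORS.items()):
            return choice
    # No choice matched this client.
    return None


choices = [
    {"client_hostname": ["foofrontend1"], "config": "new"},
    {"config": "old"},
]
client = {"hostname": "barfrontend1", "roll": 50}
print(select_choice(choices, client)["config"])  # prints "old"
```

The first matching choice wins, so the order of the choices in the list
matters.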

So, the net effect of this is that if you use both the client host selector
and the percentage selector in the same choice, then the choice will only
be used if both selectors match, which means that you won't be guaranteed
that the specified hosts will use the choice as the percentage changes.
However, if you do want to guarantee that the specified hosts will use the
same data unconditionally, then you can specify two choices: first a choice
that specifies the hosts, and then one that specifies the percentage.  This
requires you to duplicate the config data, but it provides a generic way
for the config author to control whether they get "AND" or "OR" semantics
between multiple selector fields, which I suspect will be important as new
selector fields are added in the future.
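
To make the difference concrete, here is a small Python sketch that compares
the two layouts. As before, this is a sketch under assumptions: the field
names, the host name "canary1", the config strings, and the client-chosen
"roll" in [0, 100) are all hypothetical and exist only for this example.

```python
from typing import Any, Dict, List, Optional

Choice = Dict[str, Any]


def matches(choice: Choice, hostname: str, roll: int) -> bool:
    hosts = choice.get("client_hostname") or []
    percentage = choice.get("percentage")
    # Unset or empty selectors match all clients.
    host_ok = not hosts or hostname in hosts
    pct_ok = percentage is None or roll < percentage
    # All selectors within one choice must match ("AND").
    return host_ok and pct_ok


def pick(choices: List[Choice], hostname: str, roll: int) -> Optional[str]:
    # The first matching choice wins.
    return next((c["config"] for c in choices if matches(c, hostname, roll)), None)


NEW, OLD = "new-config", "old-config"

# One choice with both selectors: host AND percentage must match.
and_choices = [
    {"client_hostname": ["canary1"], "percentage": 10, "config": NEW},
    {"config": OLD},
]

# Two choices carrying the same config: host OR percentage is enough.
or_choices = [
    {"client_hostname": ["canary1"], "config": NEW},
    {"percentage": 10, "config": NEW},
    {"config": OLD},
]

# canary1 with a roll outside the 10% bucket:
print(pick(and_choices, "canary1", 50))  # prints "old-config"
print(pick(or_choices, "canary1", 50))   # prints "new-config"
```

With the two-choice layout, canary1 keeps the new config no matter how the
percentage changes, at the cost of listing the config data twice.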

I've added some comments to the doc to describe the matching algorithm.


>
>
>> Do we actually want to select the client port in any of these cases?  I'm
>> not sure that's useful, since the client port would presumably be different
>> for each backend it's connected to, and it would change any time it
>> reconnected to a given backend.  Is there a use-case where selecting on the
>> client port is useful?
>>
>
> I guess this would be theoretically useful if you run e.g. four different
> client processes on the same host. But I was really thinking of server
> host:ports, which are the ones that get advertised and discovered. Client
> ports are not advertised and are usually random, as you point out, unless
> you explicitly bind to them. So ignore the port part of my comment. (At
> some point, though, someone else will come up with the scenario I just
> described.)
>

Okay, sounds like we don't need to worry about ports.

Just to be clear, the hostname selector we discussed above will be the
client hostname, not the server hostname.  I don't think it makes sense to
allow selecting on the server hostname, because the service config
parameters are not things that we would want to be different depending on
which backend the RPC happens to be sent to.  (Different RPCs can be sent
to different backends at the discretion of the load balancing policy, and
using different defaults for different backends would cause confusion for
policies like round_robin.)


>
>
>> In terms of how this is encoded in JSON, I would probably want it to be a
>> list of strings rather than a single string with a delimiter character.  In
>> other words, instead of 'hosts': 'host1,host2,...', it would be
>> something like 'hosts': ['host1','host2',...].
>>
>> What do you think?
>>
>
> That sounds like a good start. Perhaps we can reserve the right in the
> future to add ports if enough people show compelling uses for it, but for
> now we don't parse them or use them, only mention that in docs.
>

I think that I won't bother mentioning this at all for now.  We can add
this functionality later if and when it becomes necessary.


>
> Thanks!
>
> --
> Rudi Chiarito — Infrastructure — Clarifai, Inc.
> "Trust me, I know what I'm doing." (Sledge Hammer!)
>
> --
> You received this message because you are subscribed to the Google Groups "
> grpc.io" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/grpc-io.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/grpc-io/CAPzY349UjDMBJXY-H5Zzkt%2BQTHkFCgpHqFq-hTBxFQ9O9eMyxA%
> 40mail.gmail.com
> <https://groups.google.com/d/msgid/grpc-io/CAPzY349UjDMBJXY-H5Zzkt%2BQTHkFCgpHqFq-hTBxFQ9O9eMyxA%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Mark D. Roth <[email protected]>
Software Engineer
Google, Inc.
