From this proposal, I can understand the effect you want to achieve, but
I have some doubts; it may not be a good choice for the gateway.

1. Configuration grayscale will make the configuration of APISIX
stateful. It is stateful because the configuration takes effect on only
some nodes, which causes inconsistent behavior among APISIX nodes.
  a. An annotation alone cannot precisely control which APISIX node the
traffic is allocated to; the same request may land on a normal node one
time and on a gray node the next. This adds a usage burden and will also
cause trouble for the requester.
  b. If we add traffic control to the annotation, it will functionally
duplicate APISIX's existing grayscale capabilities. It is better to
verify the configuration through traffic control directly.

2. Any small-scale verification we do targets the granularity of
traffic. The granularity of the proposal, the number of APISIX nodes, is
too coarse. Even if the configuration affects only one node in the
production environment, a lot of traffic will be affected.

 If APISIX supports mesh in the future, the traffic granularity of the
sidecar is much smaller than that of the gateway, and then grayscale
configuration will be a natural fit.
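
To make points 1a and 2 concrete, here is a minimal sketch. It assumes a
load balancer in front of APISIX that spreads requests evenly in
round-robin order, which the proposal does not specify. The node names
"apisix-node2" and "apisix-node4" and both function names are
hypothetical.

```python
import itertools


def affected_share(total_nodes: int, gray_nodes: int) -> float:
    """Share of traffic served by gray nodes under an even load balancer (assumption)."""
    return gray_nodes / total_nodes


def simulate_client(requests: int, nodes: list, gray: set) -> list:
    """Round-robin one client's requests over the nodes.

    Each entry is True when a gray node served that request.
    """
    cycle = itertools.cycle(nodes)
    return [next(cycle) in gray for _ in range(requests)]


# Hypothetical four-node cluster with one gray node.
nodes = ["apisix-node1", "apisix-node2", "apisix-node3", "apisix-node4"]
gray = {"apisix-node1"}

print(affected_share(len(nodes), len(gray)))  # 0.25
print(simulate_client(6, nodes, gray))
```

With one gray node out of four, a quarter of all traffic sees the new
configuration, and a single client alternates between old and new
behavior. Neither number can be tuned below one node's share.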


On Sunday, October 25, 2020 at 12:33 PM, Zhang Chao <[email protected]> wrote:

> Hello, community!
>
> As we all know, configuration synchronization in APISIX relies on ETCD.
> Once an administrator creates/updates/deletes a config instance, it is
> detected by all APISIX instances immediately. That’s cool, but the scope
> is ALL INSTANCES, which also means all instances might break down if the
> config instance is malformed (perhaps for lack of validation). That’s not
> ops-friendly.
>
> We’re familiar with grayscale releases for server instances: use a small
> fraction of traffic to verify a new release and limit the impact of
> faults. So why not use the same approach to verify a newly issued config
> instance? I call this "configuration grayscale".
>
> The way to use "configuration grayscale" is simple. What we need is an
> indication that tells the current APISIX instance whether it should apply
> this config instance, so we can add a new item to each configuration
> (like route, upstream):
>
>
> {
>     "upstream": {
>         "nodes": {
>             "127.0.0.1:8080": 1
>         }
>     },
>
>     "annotations": {
>         "grayscale": {
>             "hostname": [
>                 "apisix-node1",
>                 "apisix-node3"
>             ]
>         }
>     }
> }
>
> Here we put "grayscale" into a more general field, "annotations", rather
> than flattening it, which is more flexible and clear. The above example
> tells the APISIX instance to check the grayscale first by comparing its
> hostname with the grayscale targets (whether it's in the hostname list).
> If the grayscale hits, the APISIX instance applies the config instance;
> otherwise, it ignores it just as if it had never received it.
> The hostname comparison is just a simple example, and it does not mean we
> can only use this type of grayscale condition. For instance, we may use
> the Nginx built-in variable system to support more flexible grayscale.
>
> {
>     "upstream": {
>         "nodes": {
>             "127.0.0.1:8080": 1
>         }
>     },
>
>     "annotations": {
>         "grayscale": {
>             "vars": [
>                 [ "$pid", "==", "12349" ]
>             ]
>         }
>     }
> }
>
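
The check described above can be sketched as follows. This is only an
illustration under assumptions, not how APISIX implements anything: the
function name `grayscale_hits`, the supported operators, and the rule
that "hostname" and "vars" must both match are hypothetical choices that
the proposal leaves open.

```python
import operator

# Operators accepted in "vars" triples (an assumption; the proposal shows only "==").
OPS = {"==": operator.eq, "!=": operator.ne}


def grayscale_hits(config: dict, hostname: str, variables: dict) -> bool:
    """Return True when this instance should apply the config instance."""
    gray = config.get("annotations", {}).get("grayscale")
    if not gray:
        return True  # no grayscale annotation: every instance applies it

    # Hostname condition: this instance must be in the target list.
    hostnames = gray.get("hostname")
    if hostnames is not None and hostname not in hostnames:
        return False

    # Variable conditions: every [name, op, expected] triple must hold.
    for name, op, expected in gray.get("vars", []):
        if name not in variables or not OPS[op](variables[name], expected):
            return False
    return True


route = {
    "upstream": {"nodes": {"127.0.0.1:8080": 1}},
    "annotations": {"grayscale": {"hostname": ["apisix-node1", "apisix-node3"]}},
}
print(grayscale_hits(route, "apisix-node1", {}))  # True
print(grayscale_hits(route, "apisix-node2", {}))  # False
```
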
> We need to discuss the most suitable grayscale approach for APISIX, one
> that covers almost all the demands of an APISIX administrator.
>
> The situation gets complicated if grayscale is present in a config
> dependency (e.g. a route depends on an upstream). To better describe this
> problem, let's say we have two kinds of config, A and B, and A depends on
> B. There are several situations we need to consider.
>
> 1) Both A and B have the grayscale conditions
>
> In such a case, the grayscale conditions must be the same, or some
> instances will be unable to apply both A and B, and requests on those
> instances cannot be handled properly.
>
> 2) A has grayscale conditions but B does not
>
> Since A depends on B and B can be applied unconditionally, there is no
> problem when A has grayscale conditions.
>
> 3) B has grayscale conditions but A does not
>
> This means that APISIX instances outside of B's apply scope cannot find
> B, and requests cannot be handled correctly.
>
> So based on these situations, we should add some limitations to avoid
> these complications. For example, don't gray release two config instances
> when they are related; test the "leaf" config instance first (B in the
> above example), make sure it's stable, and then try the next.
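
The three situations above could be checked when a config instance is
submitted. The sketch below is a hypothetical validator, assuming each
grayscale scope is a set of hostnames and `None` means no grayscale
annotation. The function name and the returned messages are made up for
illustration, and it follows the proposal's rule that in situation 1 the
conditions must be identical.

```python
from typing import Optional, Set


def check_dependency(a_scope: Optional[Set[str]], b_scope: Optional[Set[str]]) -> str:
    """Validate grayscale scopes when config A depends on config B.

    A scope is the set of hostnames allowed to apply the config,
    or None when the config has no grayscale annotation.
    """
    if b_scope is None:
        # B applies everywhere, so A can always find it (covers situation 2).
        return "ok"
    if a_scope is None:
        # Situation 3: instances outside B's scope apply A but cannot find B.
        return "rejected: B is gray but A is not"
    if a_scope == b_scope:
        # Situation 1 with identical conditions.
        return "ok"
    # Situation 1 with different conditions.
    return "rejected: A and B have different grayscale conditions"


print(check_dependency({"apisix-node1"}, None))
print(check_dependency(None, {"apisix-node1"}))
print(check_dependency({"apisix-node1"}, {"apisix-node3"}))
```
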
>
> Let's take a more concrete example. Alice needs to create a new route
> that proxies requests whose URI is prefixed by "/api/v1/trade" to the
> upstream "trade-system". First she adds the upstream; no other Route in
> APISIX uses this upstream. Then she tries to create the route that will
> use this upstream, but she isn't sure whether the upstream and the route
> are absolutely right. So when creating the Route on the APISIX dashboard,
> she marks it as grayscale so that only the node named "apigw-sh-1" can
> apply it. After creating it, she monitors the behavior on that node for a
> while. One day later, all related requests on "apigw-sh-1" meet
> expectations, so she cancels the grayscale, and now every APISIX instance
> applies the route.
>
> The support of configuration grayscale can be gradual. We may support
> core configurations like Route first, and let users try this feature to
> get more feedback.
>
>
> Chao Zhang
> [email protected]
>
>
>
>
