Overall looks good.

Could you please convert this into a PIP and open a PR
for it? Then I think we can move on to the implementation.

--
Yong

On Mon, 26 Jun 2023 at 15:23, Enrico Olivelli <eolive...@gmail.com> wrote:

> On Mon, 26 Jun 2023 at 09:21, Yubiao Feng
> <yubiao.f...@streamnative.io.invalid> wrote:
> >
> > Hi Yan, Asaf
> >
> > > I want to add only one step to your plan.
> > > If you introduce this flag in Y.X, then in Y.(X+1),
> > > let's remove this flag
> > > and keep the "true" value as the behavior.
> >
> > I agree with Asaf
> +1
>
> Enrico
>
> >
> > Thanks
> > Yubiao Feng
> >
> > On Mon, Jun 19, 2023 at 9:57 AM horizonzy <horizo...@apache.org> wrote:
> >
> > > Background
> > >
> > > Pulsar has two related features:
> > >
> > >    - The first feature allows users to set group and rack information for
> > >      bookies using pulsar-admin bookies set-bookie-rack.
> > >
> > > Here, users assign bookie1 to bookie5 to the default group and bookie6 to
> > > bookie10 to the share group. They don't care about rack information,
> > > only about which group each bookie belongs to.
> > >
> > > default={bookie1:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie2:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie3:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie4:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie5:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null)}
> > >
> > > _shared_={bookie6:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie7:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie8:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie9:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null), bookie10:3181=BookieInfoImpl(rack=default-rack,
> > > hostname=null)}
> > >
> > >
> > >    - The second feature allows users to set the priority of traffic for a
> > >      namespace, where traffic is directed to the primary group first and
> > >      then to the secondary group. Users can set this priority using
> > >      pulsar-admin ns-isolation-policy set --namespaces public/default
> > >      --primary "group" --secondary "group".
> > >
> > > Here, users set the primary group of the /public/default namespace to
> > > "share" using a command.
> > >
> > > {
> > >   "bookkeeperAffinityGroupPrimary" : "share"
> > > }
> > >
> > > After this work is completed, all traffic under the /public/default
> > > namespace will be directed to bookie6-10 in the "share" group.
> > >
> > > Drawbacks
> > >
> > > After a period of time, users added some new bookies [bk11, bk12, bk13,
> > > bk14, bk15] to the bookie cluster and found that some traffic under the
> > > /public/default namespace was directed to the newly added machines. After
> > > investigation, we found that this was a defect in the working
> > > mechanism of bookkeeperAffinityGroupPrimary.
> > >
> > > *bookkeeperAffinityGroupPrimary work mechanism*
> > >
> > > All bookies in the cluster: bk1-bk15.
> > >
> > > Here are the steps the broker follows to pick bookies.
> > >
> > >    1. Get the bookie rack info config: default: [bk1, bk2, bk3, bk4, bk5];
> > >       share: [bk6, bk7, bk8, bk9, bk10].
> > >    2. Exclude the bookies which are not in the
> > >       bookkeeperAffinityGroupPrimary (share).
> > >    3. Exclude the default group bookies [bk1, bk2, bk3, bk4, bk5].
> > >    4. Pick bookies from the remaining bookies [bk6, bk7, bk8, bk9, bk10,
> > >       bk11, bk12, bk13, bk14, bk15].
> > >
> > > Therefore, some traffic may go to bk11-bk15, which is not what the users
> > > expect. The reason is that the new bookies, bk11 to bk15, did not have rack
> > > information set and were not part of any group.
> > >
> > > We provided a workaround: users set the rack information for bk11 to
> > > bk15 in advance using the command pulsar-admin bookies set-bookie-rack
> > > before starting them. After users adopted this workaround, the traffic
> > > was routed as expected.
> > >
> > > For users, this is inconvenient because they need to set rack information
> > > in advance before bringing new bookies online. In scenarios with
> > > strict limitations on traffic, if the bookie operations and maintenance
> > > personnel overlook this step, traffic can reach bookies outside the
> > > intended group.
> > >
> > > Improvement
> > >
> > > I would like to introduce a new configuration, strict, for
> > > bookkeeperAffinityGroupPrimary. The default value for this configuration is
> > > false, which means that for existing users upgrading to the new version, the
> > > logic will remain the same and bookies without rack information will
> > > still be eligible for selection.
> > >
> > > If users manually set strict to true using the command pulsar-admin
> > > ns-isolation-policy set --namespaces public/default --primary "group"
> > > --secondary "group" --strict true, when the broker selects a bookie, it
> > > will only choose from the bookies in the primary group. If there are not
> > > enough bookies in the primary group, it will choose from the bookies in the
> > > secondary group. If there are not enough bookies in either group, an
> > > exception will be thrown. Bookies without rack information set using
> > > pulsar-admin bookies set-bookie-rack will not be selected.
> > >
> > > Compatibility
> > >
> > > When users upgrade from the old version to the new version, the working
> > > mechanism of bookkeeperAffinityGroupPrimary remains the same as before.
> > > When users upgrade to the new version and set strict to true using the
> > > command pulsar-admin ns-isolation-policy set --namespaces public/default
> > > --primary "group" --secondary "group" --strict true, and then roll back to
> > > the old version, the broker should be able to correctly parse the
> > > ns-isolation-policy configuration.
> > >
>
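
For reference while writing the PIP, the selection steps quoted above and the proposed strict flag can be sketched roughly as follows. This is only an illustration of the behavior described in the thread, not the broker's actual code: the function name candidate_bookies, its parameters, and the needed count are hypothetical, and the non-strict branch ignores the secondary group for simplicity.

```python
def candidate_bookies(all_bookies, groups, primary, secondary=None,
                      strict=False, needed=2):
    """Return the bookies a broker may pick from (illustrative sketch only)."""
    primary_set = set(groups.get(primary, ()))

    if not strict:
        # Current behavior as described in the thread: only bookies that
        # belong to some other group are excluded, so bookies with no rack
        # or group information remain eligible.
        in_other_groups = {
            bookie
            for name, members in groups.items() if name != primary
            for bookie in members
        }
        return [b for b in all_bookies if b not in in_other_groups]

    # Proposed strict behavior: primary group first, then secondary,
    # and never a bookie that has no group information.
    candidates = [b for b in all_bookies if b in primary_set]
    if len(candidates) < needed and secondary is not None:
        secondary_set = set(groups.get(secondary, ()))
        candidates += [b for b in all_bookies
                       if b in secondary_set and b not in primary_set]
    if len(candidates) < needed:
        raise RuntimeError("not enough bookies in primary/secondary groups")
    return candidates


# Cluster from the example: bk1-bk10 are grouped, bk11-bk15 are new.
all_bookies = [f"bk{i}" for i in range(1, 16)]
groups = {
    "default": [f"bk{i}" for i in range(1, 6)],
    "share": [f"bk{i}" for i in range(6, 11)],
}

legacy = candidate_bookies(all_bookies, groups, "share")
strict = candidate_bookies(all_bookies, groups, "share", strict=True)
print(legacy)
print(strict)
```

With strict left at false, the ungrouped bookies bk11-bk15 stay in the candidate list, which is the drawback reported above. With strict set to true, only bk6-bk10 remain.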
