Hi Jiwei Great advice. Thanks for your suggestions and additions.
Thanks, Xiangying On Fri, Dec 15, 2023 at 9:41 AM guo jiwei <techno...@apache.org> wrote: > Hi Xiangying, > I think we can rename this PIP to: *Introduce `allowed-clusters` and > `topic-policy-synchronized-clusters` to fully support replication on > message and topic level* > Currently, we can set replication clusters on the message and topic > level, but the replication clusters should be a subset of the namespace > replication clusters. which means : > If we set namespace replication clusters: cluster1, cluster2, cluster3, > at most, these three or two clusters can be set on message or topic. If the > user wanna set cluster4 or others, the replication > can't work as expected. > It's easy to reproduce by this test: > > > @Test > > public void testEnableReplicationInTopicLevel() throws Exception { > > // 1. Create namespace and topic > > String namespace = > > BrokerTestUtil.newUniqueName("pulsar/testEnableReplicationInTopicLevel"); > > String topic1 = "persistent://" + namespace + "/topic-1"; > > admin1.namespaces().createNamespace(namespace); > > admin1.topics().createNonPartitionedTopic(topic1); > > // 2. Configure replication clusters for the topic. > > admin1.topics().setReplicationClusters(topic1, List.of("r1", > "r2")); > > // 3. Check if the replicator connected successfully. > > Awaitility.await().atMost(5, TimeUnit.MINUTES).untilAsserted(() -> > { > > List<String> keys = pulsar1.getBrokerService() > > .getTopic(topic1, false).get().get() > > .getReplicators().keys(); > > assertEquals(keys.size(), 1); > > assertTrue(pulsar1.getBrokerService() > > .getTopic(topic1, false).get().get() > > .getReplicators().get(keys.get(0)).isConnected()); > > }); > > } > > > To fully support the replication, we find out an easy way to solve it. > Introduce `allowed-clusters` on namespace policies, which Xiangying > explains above. > How could this work and solve the issue? The same example : > If we set namespace replication clusters: cluster1, cluster2, cluster3, > and > set topic1 replication clusters: cluster2, cluster4. > set topic2 replication clusters: cluster1, cluster4. > We must set `allowed-clusters` with cluster1, cluster2, cluster3, and > cluster4. The broker side will validate the topic or message replication > clusters from the `allowed-cluster.` > In this way, we can simplify more codes and logic here. > For *`topic-policy-synchronized-clusters` *we also add examples in the > PIP. > > Hope the explanation could help @Rajan @Girish > > > > > Regards > Jiwei Guo (Tboy) > > > On Thu, Dec 7, 2023 at 10:29 PM Xiangying Meng <xiangy...@apache.org> > wrote: > > > Hi Girish, > > > > I'm very pleased that we have reached some consensus now. Pulsar already > > supports geo-replication at the topic level, but the existing > > implementation of this topic level replication does not match our > > expectations. > > > > At the moment, I can think of three directions to solve this problem: > > > > 1. Treat this issue as a bug and fix it so that Pulsar can truly support > > replication at the topic level. > > 2. Limit the replication topic policy, so that the replication clusters > at > > the topic level must be included in the replication clusters configured > at > > the namespace level. In this case, the topic level replication would > serve > > as a supplement to the namespace replication, rather than a true topic > > level policy. > > 3. Remove topic level replication. > > > > I lean towards the first option, as it would make Pulsar's replication > > configuration more flexible and would not break the previous behavior > > logic. > > > > >Yes, that's my viewpoint. In case that's not your view point, then in > your > > >use cases do you ever have more than one namespace inside a tenant? > > >With every property coming at topic level, it makes no sense for the > > >namespace hierarchy to exist anymore. > > > > I didn't propose this from the perspective of a user, but from the > > perspective of a Pulsar maintainer. The replication cluster at the topic > > level cannot function independently like other topic policies, and I > > attempted to fix it after finding the reason. > > > > From the user's perspective, I could modify my system to put topics with > > the same replication strategy under the same namespace. From the > > maintainer's perspective, if a feature can help users use Pulsar more > > flexibly and conveniently without introducing risks, then this feature > > should be implemented. Perhaps business systems do not want to maintain > too > > many namespaces, as they would need to configure multiple namespace > > policies or it might make their business logic complex. The other > > configurations for topics under this namespace might be consistent, with > > only a few topics needing to enable replication. In this case, topic > level > > replication becomes valuable. Therefore, I lean towards the first option, > > to solve this problem and make it a truly expected topic policy. > > > > On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <scrapmachi...@gmail.com> > > wrote: > > > > > Hello Xiangying, > > > > > > > > > On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xiangy...@apache.org> > > > wrote: > > > > > > > Hi Girish, > > > > > > > > What you are actually opposing is the implementation of true > > topic-level > > > > geo-replication. You believe that topics should be divided into > > different > > > > namespaces based on replication. Following this line of thought, what > > we > > > > should do is restrict the current topic-level replication settings, > not > > > > allowing the replication clusters set at the topic level to exceed > the > > > > range of replication clusters set in the namespace. > > > > > > > > > > Yes, that's my viewpoint. In case that's not your view point, then in > > your > > > use cases do you ever have more than one namespace inside a tenant? > > > With every property coming at topic level, it makes no sense for the > > > namespace hierarchy to exist anymore. > > > > > > > > > > > > > > One point that confuses me is that we provide a setting for > topic-level > > > > replication clusters, but it can only be used to amend the namespace > > > > settings and cannot work independently. Isn't this also a poor design > > for > > > > Pulsar? > > > > > > > > > > This feature was originally added in pulsar without a PIP. And the PR > [0] > > > also doesn't have much context around why it was needed and why it is > > being > > > added.. So I can't comment on why this was added.. > > > But my understanding is that even in a situation when the topics are > > > divided into proper namespaces based on use cases and suddenly there is > > an > > > exceptional need for one of the existing topics to have lesser > > replication, > > > then instead of following a long exercise of moving that topic to a new > > > namespace, you can use this feature. > > > > > > [0] - https://github.com/apache/pulsar/pull/12136 > > > > > > > > > > > > > > > > > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma < > scrapmachi...@gmail.com> > > > > wrote: > > > > > > > > > Hello, replies inline. > > > > > > > > > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng < > xiangy...@apache.org> > > > > > wrote: > > > > > > > > > > > Hi Girish, > > > > > > > > > > > > Thank you for your explanation. Because Joe's email referenced > the > > > > > current > > > > > > implementation of Pulsar, I misunderstood him to be saying that > > this > > > > > > current implementation is not good. > > > > > > > > > > > > A possible use case is where there is one or a small number of > > topics > > > > in > > > > > > the namespace that store important messages, which need to be > > > > replicated > > > > > to > > > > > > other clusters. Meanwhile, other topics only need to store data > in > > > the > > > > > > local cluster. > > > > > > > > > > > > > > > > Is it not possible to simply have the other topics in a namespace > > which > > > > > allows for that other cluster, and the local topics remain in the > > > > namespace > > > > > with local cluster needs. Seems to me like a proper use case of two > > > > > different namespaces as the use case is different in both cases. > > > > > > > > > > > > > > > > > > > > > > > > > > > For example, only topic1 needs replication, while topic2 to > > topic100 > > > do > > > > > > not. According to the current implementation, we need to set > > > > replication > > > > > > clusters at the namespace level (e.g. cluster1 and cluster2), and > > > then > > > > > set > > > > > > the topic-level replication clusters (cluster1) for topic2 to > > > topic100 > > > > to > > > > > > exclude them. It's hard to say that this is a good design. > > > > > > > > > > > > > > > > No, all you need is to put topic1 in namespace1 and topic2 to > > topic100 > > > in > > > > > namespace2 . This is exactly what me and Joe were saying is a bad > > > design > > > > > choice that you are clubbing all 100 topics in same namespace. > > > > > > > > > > > > > > > > > > > > > > > > > > > Best regards. > > > > > > > > > > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <joefranc...@gmail.com> > > wrote: > > > > > > > > > > > > > Girish, > > > > > > > > > > > > > > Thank you for making my point much better than I did .. > > > > > > > > > > > > > > -Joe > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma < > > > > scrapmachi...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hello Xiangying, > > > > > > > > > > > > > > > > I believe what Joe here is referring to as "application > design" > > > is > > > > > not > > > > > > > the > > > > > > > > design of pulsar or namespace level replication but the > design > > of > > > > > your > > > > > > > > application and the dependency that you've put on topic level > > > > > > > replication. > > > > > > > > > > > > > > > > In general, I am aligned with Joe from an application design > > > > > > standpoint. > > > > > > > A > > > > > > > > namespace is supposed to represent a single application use > > case, > > > > > topic > > > > > > > > level override of replication clusters helps in cases where > > there > > > > > are a > > > > > > > few > > > > > > > > exceptional topics which do not need replication in all of > the > > > > > > namespace > > > > > > > > clusters. This helps in saving network bandwidth, storage, > CPU, > > > RAM > > > > > etc > > > > > > > > > > > > > > > > But the reason why you've raised this PIP is to bring down > the > > > > actual > > > > > > > > replication semantics at a topic level. Yes, namespace level > > > still > > > > > > exists > > > > > > > > as per your PIP as well, but is basically left only to be a > > > > "default > > > > > in > > > > > > > > case topic level is missing". > > > > > > > > This brings me to a very basic question - What's the use case > > > that > > > > > you > > > > > > > are > > > > > > > > trying to solve that needs these changes? Because, then > what's > > > > > stopping > > > > > > > us > > > > > > > > from bringing every construct that's at a namespace level > > > > (bundling, > > > > > > > > hardware affinity, etc) down to a topic level? > > > > > > > > > > > > > > > > Regards > > > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng < > > > > xiangy...@apache.org> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi Joe, > > > > > > > > > > > > > > > > > > You're correct. The initial design of the replication > policy > > > > leaves > > > > > > > room > > > > > > > > > for improvement. To address this, we aim to refine the > > cluster > > > > > > settings > > > > > > > > at > > > > > > > > > the namespace level in a way that won't impact the existing > > > > system. > > > > > > The > > > > > > > > > replication clusters should solely be used to establish > full > > > mesh > > > > > > > > > replication for that specific namespace, without having any > > > other > > > > > > > > > definitions or functionalities. > > > > > > > > > > > > > > > > > > BR, > > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F < > joefranc...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > > > > > > > >if users want to change the replication policy for > > > > > > > > > > topic-n and do not change the replication policy of other > > > > topics, > > > > > > > they > > > > > > > > > need > > > > > > > > > > to change all the topic policy under this namespace. > > > > > > > > > > > > > > > > > > > > This PIP unfortunately flows from attempting to solve > bad > > > > > > > application > > > > > > > > > > design > > > > > > > > > > > > > > > > > > > > A namespace is supposed to represent an application, and > > the > > > > > > > namespace > > > > > > > > > > policy is an umbrella for a similar set of policies that > > > > applies > > > > > > to > > > > > > > > all > > > > > > > > > > topics. The exceptions would be if a topic had a need > for > > a > > > > > > deficit, > > > > > > > > The > > > > > > > > > > case of one topic in the namespace sticking out of the > > > > namespace > > > > > > > policy > > > > > > > > > > umbrella is bad application design in my opinion > > > > > > > > > > > > > > > > > > > > -Joe. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng < > > > > > > xiangy...@apache.org> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Rajan and Girish, > > > > > > > > > > > Thanks for your reply. About the question you > mentioned, > > > > there > > > > > is > > > > > > > > some > > > > > > > > > > > information I want to share with you. > > > > > > > > > > > >If anyone wants to setup different replication > clusters > > > then > > > > > > > either > > > > > > > > > > > >those topics can be created under different namespaces > > or > > > > > > defined > > > > > > > at > > > > > > > > > > topic > > > > > > > > > > > >level policy. > > > > > > > > > > > > > > > > > > > > > > >And users can anyway go and update the namespace's > > cluster > > > > > list > > > > > > to > > > > > > > > add > > > > > > > > > > the > > > > > > > > > > > >missing cluster. > > > > > > > > > > > Because the replication clusters also mean the clusters > > > where > > > > > the > > > > > > > > topic > > > > > > > > > > can > > > > > > > > > > > be created or loaded, the topic-level replication > > clusters > > > > can > > > > > > only > > > > > > > > be > > > > > > > > > > the > > > > > > > > > > > subset of namespace-level replication clusters. > > > > > > > > > > > Just as Girish mentioned, the users can update the > > > > namespace's > > > > > > > > cluster > > > > > > > > > > list > > > > > > > > > > > to add the missing cluster. > > > > > > > > > > > But there is a problem because the replication clusters > > as > > > > the > > > > > > > > > namespace > > > > > > > > > > > level will create a full mesh replication for that > > > namespace > > > > > > across > > > > > > > > the > > > > > > > > > > > clusters defined in > > > > > > > > > > > replication-clusters if users want to change the > > > replication > > > > > > policy > > > > > > > > for > > > > > > > > > > > topic-n and do not change the replication policy of > other > > > > > topics, > > > > > > > > they > > > > > > > > > > need > > > > > > > > > > > to change all the topic policy under this namespace. > > > > > > > > > > > > > > > > > > > > > > > Pulsar is being used by many legacy systems and > > changing > > > > its > > > > > > > > > > > >semantics for specific usecases without considering > > > > > consequences > > > > > > > are > > > > > > > > > > > >creating a lot of pain and incompatibility problems > for > > > > other > > > > > > > > existing > > > > > > > > > > > >systems and let's avoid doing it as we are struggling > > with > > > > > such > > > > > > > > > changes > > > > > > > > > > > and > > > > > > > > > > > >breaking compatibility or changing semantics are just > > not > > > > > > > > acceptable. > > > > > > > > > > > > > > > > > > > > > > This proposal will not introduce an incompatibility > > > problem, > > > > > > > because > > > > > > > > > the > > > > > > > > > > > default value of the namespace policy of > allowed-clusters > > > and > > > > > > > > > > > topic-policy-synchronized-clusters are the > > > > > replication-clusters. > > > > > > > > > > > > > > > > > > > > > > >Allowed clusters defined at tenant level > > > > > > > > > > > >will restrict tenants to create namespaces in > > > > regions/clusters > > > > > > > where > > > > > > > > > > they > > > > > > > > > > > >are not allowed. > > > > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a > > > > > different > > > > > > > > > > > meaning/purpose. > > > > > > > > > > > > > > > > > > > > > > Allowed clusters defined at the tenant level will > > restrict > > > > > > tenants > > > > > > > > from > > > > > > > > > > > creating namespaces in regions/clusters where they are > > not > > > > > > allowed. > > > > > > > > > > > Similarly, the allowed clusters defined at the > namespace > > > > level > > > > > > will > > > > > > > > > > > restrict the namespace from creating topics in > > > > regions/clusters > > > > > > > where > > > > > > > > > > they > > > > > > > > > > > are not allowed. > > > > > > > > > > > What's wrong with this? > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma < > > > > > > > > scrapmachi...@gmail.com> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi Xiangying, > > > > > > > > > > > > > > > > > > > > > > > > Shouldn't the solution to the issue mentioned in > #21564 > > > [0] > > > > > > > mostly > > > > > > > > be > > > > > > > > > > > > around validating that topic level replication > clusters > > > are > > > > > > > subset > > > > > > > > of > > > > > > > > > > > > namespace level replication clusters? > > > > > > > > > > > > It would be a completely compatible change as even > > today > > > > the > > > > > > case > > > > > > > > > > where a > > > > > > > > > > > > topic has a cluster not defined in namespaces's > > > > > > > > replication-clusters > > > > > > > > > > > > doesn't really work. > > > > > > > > > > > > And users can anyway go and update the namespace's > > > cluster > > > > > list > > > > > > > to > > > > > > > > > add > > > > > > > > > > > the > > > > > > > > > > > > missing cluster. > > > > > > > > > > > > > > > > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a > > > > > different > > > > > > > > > > > > meaning/purpose. > > > > > > > > > > > > Regards > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng < > > > > > > > > > xiangy...@apache.org> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Pulsar Community > > > > > > > > > > > > > > > > > > > > > > > > > > I drafted a proposal to make the configuration of > > > > clusters > > > > > at > > > > > > > the > > > > > > > > > > > > namespace > > > > > > > > > > > > > level clearer. This helps solve the problem of > > > > > > geo-replication > > > > > > > > not > > > > > > > > > > > > working > > > > > > > > > > > > > correctly at the topic level. > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/pull/21648 > > > > > > > > > > > > > > > > > > > > > > > > > > I'm looking forward to hearing from you. > > > > > > > > > > > > > > > > > > > > > > > > > > BR > > > > > > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Girish Sharma > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Girish Sharma > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Girish Sharma > > > > > > > > > > > > > > > > > > -- > > > Girish Sharma > > > > > >