👍 Best Regards.
Jia Zhai
Beijing, China
Mobile: +86 15810491983

On Fri, May 8, 2020 at 4:29 AM Sijie Guo <guosi...@gmail.com> wrote:

Dezhi, thank you for sharing the proposal!

It is great to see Tencent has started contributing this great feature back to Pulsar! This feature will unlock a lot of new capabilities of Pulsar.

I have moved the proposal to
https://github.com/apache/pulsar/wiki/PIP-63:-Readonly-Topic-Ownership-Support

- Sijie

On Thu, May 7, 2020 at 5:23 AM dezhi liu <liudezhi2...@gmail.com> wrote:

Hi all,

Here is a proposal (PIP): ReadOnly Topic Ownership Support

------------

# PIP-63: ReadOnly Topic Ownership Support

* Author: Penghui Li, Jia Zhai, Sijie Guo, Dezhi Liu

## Motivation

People usually use Pulsar as an event bus or event center to unify all their message data or event data. The same set of event data is usually shared across multiple applications. Problems occur when the number of subscriptions on the same topic increases:

- The bandwidth of a broker limits the number of subscriptions for a single topic.
- Subscriptions compete for network bandwidth on brokers, and different subscriptions might have different levels of severity.
- When synchronizing message reads across cities, cross-city access needs to be minimized.

This proposal adds readonly topic ownership support. With readonly ownership, users can set up one or a few separate broker clusters dedicated to reads, segregating consumption traffic by service severity. It would also allow Pulsar to support a large number of subscriptions.

## Changes

There are a few key changes for supporting readonly topic ownership:

- how the readonly topic owner reads data
- how the readonly topic owner keeps metadata in sync
- how the readonly topic owner handles acknowledgments

The first two problems have been well addressed in DistributedLog. We can add similar features to the managed ledger.

### How the readonly topic owner reads data

In order for a readonly topic owner to keep reading data in a streaming fashion, the managed ledger should be able to refresh its LAC (LastAddConfirmed). The easiest change is to call `readLastAddConfirmedAsync` when a cursor requests entries beyond the existing LAC. A more advanced approach is to switch regular read-entries requests to BookKeeper's long poll read requests. However, long poll read requests are not supported in the BookKeeper v2 protocol.

Required changes (a sketch of the LAC refresh follows this list):

- Refresh LastAddConfirmed when a managed cursor requests entries beyond the known LAC.
- Enable `explicitLac` on the managed ledger, so the writable topic owner periodically advances the LAC and the readonly owner is able to catch up with the latest data.
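To make the first change concrete, here is a minimal sketch of how a readonly reader could refresh the LAC before serving a read past the LAC it already knows. It uses BookKeeper's `ReadHandle.readLastAddConfirmedAsync()` and `readAsync()` from `org.apache.bookkeeper.client.api`; the surrounding class and method names (`ReadonlyLacRefresher`, `readBeyondLac`) are hypothetical and only illustrate the flow, not the actual managed ledger code.

```java
import java.util.concurrent.CompletableFuture;
import org.apache.bookkeeper.client.api.LedgerEntries;
import org.apache.bookkeeper.client.api.ReadHandle;

// Hypothetical helper on the readonly owner side: before reading entries past
// the LAC we currently know about, ask the bookies for a fresher LAC.
class ReadonlyLacRefresher {

    private final ReadHandle currentLedger; // handle of the last (still open) ledger
    private volatile long knownLac;         // last LAC observed by this reader

    ReadonlyLacRefresher(ReadHandle currentLedger, long initialLac) {
        this.currentLedger = currentLedger;
        this.knownLac = initialLac;
    }

    CompletableFuture<LedgerEntries> readBeyondLac(long firstEntry, long lastEntry) {
        if (lastEntry <= knownLac) {
            // Requested range is already covered by the known LAC: read directly.
            return currentLedger.readAsync(firstEntry, lastEntry);
        }
        // Otherwise refresh the LAC first. readLastAddConfirmedAsync() goes to the
        // bookies, so the value can move forward if the writable owner kept writing
        // (or periodically advanced an explicit LAC).
        return currentLedger.readLastAddConfirmedAsync().thenCompose(freshLac -> {
            knownLac = Math.max(knownLac, freshLac);
            if (firstEntry > knownLac) {
                // Still nothing new to read; the caller can retry later, or fall
                // back to long poll reads where the protocol supports them.
                return CompletableFuture.completedFuture(null);
            }
            return currentLedger.readAsync(firstEntry, Math.min(lastEntry, knownLac));
        });
    }
}
```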
### How the readonly topic owner keeps metadata in sync

Ledgers are rolled at a given interval, so the readonly topic owner needs a way to learn that ledgers have been rolled. There are a couple of options, falling into two approaches: notification vs. polling.

*Notification*

A) Use a ZooKeeper watcher. The readonly topic owner sets a watcher on the managed ledger's metadata, so it is notified when a ledger is rolled.
B) Similar to A), introduce a "notification" request between the readonly topic owner and the writable topic owner. The writable topic owner notifies the readonly topic owner of metadata changes.

*Polling*

C) The readonly broker polls ZooKeeper for new metadata, *only* when the LAC in the last ledger has not advanced for a given interval. The readonly broker then checks ZooKeeper to see whether a new ledger has been rolled.
D) The readonly broker polls for new metadata by reading events from a system topic on the write broker cluster; the write broker publishes ledger metadata change events to that system topic whenever the managed ledger metadata is updated.

Solution C) is the simplest solution to start with; a sketch of that polling loop follows.
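Here is a minimal sketch of option C), assuming the readonly broker can reach the metadata ZooKeeper directly. The znode path, the `ManagedLedgerMetadataPoller` class, and `reloadLedgerList` are hypothetical placeholders; a real implementation would go through the managed ledger's own `MetaStore` rather than a raw ZooKeeper client. The idea is only this: when the LAC has been idle past a threshold, compare the znode version with the last one seen and reload the ledger list if it changed.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Hypothetical poller for option C): check the managed ledger znode only when
// the LAC has not moved for a while, and reload metadata if its version changed.
class ManagedLedgerMetadataPoller {

    private final ZooKeeper zk;
    private final String ledgerPath;          // e.g. "/managed-ledgers/tenant/ns/persistent/topic"
    private final long lacIdleThresholdMs;

    private volatile long lastLacAdvanceTime = System.currentTimeMillis();
    private volatile int lastSeenZnodeVersion = -1;

    ManagedLedgerMetadataPoller(ZooKeeper zk, String ledgerPath, long lacIdleThresholdMs) {
        this.zk = zk;
        this.ledgerPath = ledgerPath;
        this.lacIdleThresholdMs = lacIdleThresholdMs;
    }

    void start(ScheduledExecutorService scheduler, long intervalMs) {
        scheduler.scheduleAtFixedRate(this::pollIfLacIdle, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    // Called by the read path whenever a refreshed LAC moved forward.
    void onLacAdvanced() {
        lastLacAdvanceTime = System.currentTimeMillis();
    }

    private void pollIfLacIdle() {
        if (System.currentTimeMillis() - lastLacAdvanceTime < lacIdleThresholdMs) {
            return; // LAC is still moving, no need to hit ZooKeeper.
        }
        try {
            Stat stat = new Stat();
            byte[] data = zk.getData(ledgerPath, false, stat);
            if (stat.getVersion() != lastSeenZnodeVersion) {
                lastSeenZnodeVersion = stat.getVersion();
                reloadLedgerList(data); // a ledger was rolled: refresh the local ledger list
            }
        } catch (Exception e) {
            // Transient ZooKeeper errors: simply try again on the next tick.
        }
    }

    private void reloadLedgerList(byte[] serializedManagedLedgerInfo) {
        // Placeholder: deserialize the ManagedLedgerInfo and open read handles
        // for any newly rolled ledger(s).
    }
}
```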
### How the readonly topic owner handles acknowledgments

Pulsar currently uses a centralized model for managing cursors, and cursors drive data retention. This PIP does not change that model. Instead, the readonly topic owner only maintains a cursor cache; all actual cursor updates are sent back to the writable topic owner.

This requires introducing a set of "cursor" related RPCs between the writable topic owner and the readonly topic owners:

- Read the `Cursor` of a subscription.

The readonly topic owner then handles the following requests using these new cursor RPCs (see the sketch after this list):

- Subscribe: forward the subscribe request to the writable topic owner. Upon a successful subscribe, the readonly topic owner caches the corresponding cursor.
- Unsubscribe: remove the cursor from the cursor cache and forward the unsubscribe request to the writable topic owner.
- Consume: when a consumer connects, `read` the cursor from the writable topic owner and cache it locally.
- Ack: forward the ack request to the writable topic owner, and update the cursor locally in the cache.
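To illustrate the intended division of responsibilities, here is a minimal sketch of the readonly owner's cursor handling under the model above. `CursorRpcClient`, `Position`, and `ReadonlyCursorCache` are hypothetical types standing in for the proposed cursor RPCs and the broker's own classes; only the forwarding-and-caching pattern reflects the proposal.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cursor position and cursor-RPC abstractions for the sketch.
record Position(long ledgerId, long entryId) {}

interface CursorRpcClient {
    CompletableFuture<Position> subscribe(String topic, String subscription);
    CompletableFuture<Void> unsubscribe(String topic, String subscription);
    CompletableFuture<Position> readCursor(String topic, String subscription);
    CompletableFuture<Void> acknowledge(String topic, String subscription, Position pos);
}

// Readonly owner: caches cursors locally, forwards every mutation to the writable owner.
class ReadonlyCursorCache {

    private final CursorRpcClient writableOwner;
    private final Map<String, Position> cursors = new ConcurrentHashMap<>();

    ReadonlyCursorCache(CursorRpcClient writableOwner) {
        this.writableOwner = writableOwner;
    }

    private static String key(String topic, String subscription) {
        return topic + "|" + subscription;
    }

    CompletableFuture<Position> handleSubscribe(String topic, String subscription) {
        // Forward to the writable owner, then cache the returned cursor position.
        return writableOwner.subscribe(topic, subscription)
                .thenApply(pos -> { cursors.put(key(topic, subscription), pos); return pos; });
    }

    CompletableFuture<Void> handleUnsubscribe(String topic, String subscription) {
        // Drop the local cache entry and forward the unsubscribe.
        cursors.remove(key(topic, subscription));
        return writableOwner.unsubscribe(topic, subscription);
    }

    CompletableFuture<Position> handleConsumerConnected(String topic, String subscription) {
        // Read the authoritative cursor from the writable owner and cache it locally.
        return writableOwner.readCursor(topic, subscription)
                .thenApply(pos -> { cursors.put(key(topic, subscription), pos); return pos; });
    }

    CompletableFuture<Void> handleAck(String topic, String subscription, Position pos) {
        // Forward the ack, then advance the locally cached cursor.
        return writableOwner.acknowledge(topic, subscription, pos)
                .thenRun(() -> cursors.put(key(topic, subscription), pos));
    }
}
```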
## Compatibility, Deprecation and Migration Plan

Most of the changes are internal to the managed ledger, and this is a new feature that does not change Pulsar's wire protocol or public API, so there is no backward compatibility issue.

It is a newly added feature, so there is nothing to deprecate or migrate.

## Test Plan

- Unit tests for each individual change
- Integration tests for the end-to-end pipeline
- Chaos testing to ensure correctness
- Load testing to ensure performance

## Rejected Alternatives

### Use geo-replication to replicate data between clusters

The simplest alternative would be to use Pulsar's built-in geo-replication mechanism to replicate data from one cluster to the other.

#### Two completely separated clusters

The idea is straightforward: you create two separate clusters, one for your online services (`Cluster-A`) and the other for your analytical workloads (`Cluster-B`). `Cluster-A` serves both write (produce) and read (consume) traffic, while `Cluster-B` serves readonly (consume) traffic. Both `Cluster-A` and `Cluster-B` have their own ZooKeeper cluster, BookKeeper cluster, and brokers. To make sure a topic's data can be replicated between `Cluster-A` and `Cluster-B`, the two clusters must share the same configuration storage. There are two approaches to do so:

a) A completely separated ZooKeeper cluster as configuration storage.

In this approach, everything is completely separated, so you can treat these two clusters as two different regions and follow the instructions in [Pulsar geo-replication · Apache Pulsar](http://pulsar.apache.org/docs/en/administration-geo/) to set up data replication between them.

b) `Cluster-B` and `Cluster-A` share the same configuration storage.

Approach a) requires setting up a separate ZooKeeper cluster as configuration storage. Since `Cluster-A` and `Cluster-B` already have their own ZooKeeper clusters, you may not want to set up yet another one. Instead, you can let both `Cluster-A` and `Cluster-B` use `Cluster-A`'s ZooKeeper cluster as the configuration store, using ZooKeeper's chroot mechanism to put the configuration data under a separate root in `Cluster-A`'s ZooKeeper cluster.

For example:

- Command to initialize `Cluster-A`'s metadata:

```
$ bin/pulsar initialize-cluster-metadata \
  --cluster ClusterA \
  --zookeeper zookeeper.cluster-a.example.com:2181 \
  --configuration-store zookeeper.cluster-a.example.com:2181/configuration-store \
  --web-service-url http://broker.cluster-a.example.com:8080/ \
  --broker-service-url pulsar://broker.cluster-a.example.com:6650/
```

- Command to initialize `Cluster-B`'s metadata:

```
$ bin/pulsar initialize-cluster-metadata \
  --cluster ClusterB \
  --zookeeper zookeeper.cluster-b.example.com:2181 \
  --configuration-store zookeeper.cluster-a.example.com:2181/configuration-store \
  --web-service-url http://broker.cluster-b.example.com:8080/ \
  --broker-service-url pulsar://broker.cluster-b.example.com:6650/
```

#### Shared BookKeeper and ZooKeeper cluster, but separated brokers

Sometimes it is unaffordable to run two completely separated clusters, and you might want to share the existing infrastructure, such as the data storage (BookKeeper) and metadata storage (ZooKeeper). Similar to solution b) above, you can use a ZooKeeper chroot to achieve this.

Assume there is only one ZooKeeper cluster and one BookKeeper cluster. The ZooKeeper cluster is `zookeeper.shared.example.com:2181`, and there are two clusters of brokers, `broker-a.example.com` and `broker-b.example.com`. When you create the clusters, use `zookeeper.shared.example.com:2181/configuration-store` as the shared configuration storage, `zookeeper.shared.example.com:2181/cluster-a` for `Cluster-A`'s local metadata storage, and `zookeeper.shared.example.com:2181/cluster-b` for `Cluster-B`'s local metadata storage.

This allows you to have two "broker-separated" clusters sharing the same storage cluster (both ZooKeeper and BookKeeper).

No matter how the physical clusters are set up, there is a downside to using geo-replication to isolate online workloads from analytics workloads: the data has to be replicated at least twice. If you have configured Pulsar topics to store data in 3 replicas, you end up with at least 6 copies of the data. So geo-replication might not be ideal for addressing this use case.

------------

Thanks,

Dezhi